Duplicated Tags - Bad Form?

I have a company generating a daily XML feed I’ll have to parse. One of the first things I’ve noticed is there are 3 <categoryName> tags.

Am I correct in assuming this is bad form?

I’d be parsing it with either PHP’s DomDocument or SimpleXML and I’m not sure if these duplicate tags would break my parsing code.

An interesting side note: if any instance of the <categoryName> tag has no data to display, it’s generated as an empty <categoryName/>. So perhaps they’re necessary for the software generating the feed?

I suppose I could target these elements based on where they’re placed in the XML tree.

Any other thoughts?

Yes, that is indeed pretty bad form, especially if the different nodes contain different values.

Indeed, you could check by position, and hope for the best. That’s what I would do as well.

Also, I’d store all the other values you get as well, even if you don’t do anything with them. Just so you have them and don’t have to come back later if you find you picked the wrong one. With the prices of storage nowadays that shouldn’t matter.

I’m guessing you are talking about something that looks like:

<categoryName>One</categoryName>
<categoryName>Two</categoryName>
<categoryName />

If so, that is often used to represent an array of values in XML. No real big issue with it IMHO. I could tell you how to handle it with C# and .NET XML Deserialization but that probably won’t help here.
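For what it's worth, here is a minimal sketch of reading repeated tags as an array. It uses Python's standard xml.etree.ElementTree rather than PHP, purely as an illustration. The <item> wrapper and the category_names helper are made up for the example; only the three <categoryName> elements come from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed fragment: the <item> parent is an assumption,
# only the <categoryName> elements are taken from the thread.
FEED = """<item>
  <categoryName>One</categoryName>
  <categoryName>Two</categoryName>
  <categoryName />
</item>"""


def category_names(xml_text):
    """Return the text of every <categoryName> child, in document order.

    An empty element such as <categoryName/> yields None.
    """
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall("categoryName")]


names = category_names(FEED)
print(names)     # ['One', 'Two', None]
print(names[1])  # position-based access: 'Two'
```

The repeated tags do not break anything: the parser simply returns them as a list, and the empty tag keeps its slot, so picking a value by position still works. PHP's DOM and SimpleXML extensions should let you loop over same-named children in a similar way.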

Thank you both for your input. wwb_99, yes, that’s how the tags are formatted. Interesting to hear there might be a legitimate reason for using duplicated tags. I still feel that someone at the other end of this XML file is taking the lazy approach; I’m sure it wouldn’t be much work to write each of the array values into uniquely named nodes. That said, these nodes are the only children of their own parent node, so targeting them specifically should be pretty easy. I’ll just have to figure out how the PHP XML parsing libs do it.

The correct way to create arrays in XML is IMO:


<categories>
   <name>Category 1</name>
   <name>Category 2</name>
   <name />
</categories>

Just makes the most sense, semantically.

Then again, the best XML is no XML at all :)

I’m a huge JSON fan. I find it makes life so much easier.
Of course, we don’t always get to pick, and we need to row with the paddles that are handed to us.

That definitely makes a little more sense, but they are really six of one, half a dozen of the other. Same with JSON: it is nice and a little more readable, but the flip side is that there are things one can do with XML that one can’t do with JSON.

In any case, remember that one should never parse XML or JSON by hand as if it were plain text. They aren’t text but rather serialized representations of objects, so deserialize them with a proper library and work with the resulting objects.
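To make the "deserialize, don't parse" idea concrete, here is a rough sketch in Python (standard library only, used just for illustration). The Item class, its category_names field, the <item> wrapper and the deserialize_item helper are all hypothetical names invented for this example. Dropping the empty <categoryName/> entries is one possible choice, not something the feed requires.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    # Hypothetical target object; the field name is an assumption.
    category_names: List[str] = field(default_factory=list)


def deserialize_item(xml_text: str) -> Item:
    """Build an Item from XML using a real XML parser, not string matching."""
    root = ET.fromstring(xml_text)
    # Keep only elements that carry text; <categoryName/> has text None.
    names = [n.text for n in root.findall("categoryName") if n.text]
    return Item(category_names=names)


item = deserialize_item(
    "<item>"
    "<categoryName>One</categoryName>"
    "<categoryName>Two</categoryName>"
    "<categoryName/>"
    "</item>"
)
print(item.category_names)  # ['One', 'Two']
```

The rest of the code then only ever sees the Item object, so it no longer matters whether the feed used repeated tags or a wrapper element.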