Ampersand Confusion: "cannot generate system identifier for general entity"

I realise that I am meant to specify the ampersand as &amp; and I have done exactly that, but it still comes up with the error.

I am specifying a list of links in XML and this is one of the links which the error report refers to:

<article name="Article 1" url=";p=4" color="green"/>

Anyway, as you can see, in the XML it is specified as &amp;, yet when I run the W3C validation check it still brings up the error.

When I look at the source code for the actual PHP page (which reads and displays the articles from the XML) it does not show &amp; - instead it shows the actual symbol (&)…

Something must be happening with the parsing of the XML…
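A likely explanation, though it is only an assumption since the parsing code is not shown: an XML parser returns attribute values with entities already decoded, so &amp; in the file reaches PHP as a bare &. If that value is echoed into the page without being escaped again, the validator sees an unescaped ampersand. A minimal sketch using SimpleXML, with a placeholder URL and a hypothetical <articles> wrapper element:

```php
<?php
// Hypothetical sample data: the wrapper element and the URL are placeholders.
$xml = '<articles>'
     . '<article name="Article 1" url="http://example.com/?a=1&amp;p=4" color="green"/>'
     . '</articles>';

$doc = simplexml_load_string($xml);

// The parser has already turned &amp; into a plain & at this point.
$raw = (string) $doc->article[0]['url'];

// Re-escape before writing the value into HTML output.
$escaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

echo $raw, "\n";     // http://example.com/?a=1&p=4
echo $escaped, "\n"; // http://example.com/?a=1&amp;p=4
```

The first line of output is what ends up in the page source when the value is printed as-is; the second is what the validator expects inside an href.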

Anyway - Does anyone know how to fix this issue so that it does not throw up an error???

Thank you :slight_smile:

You probably need to use a numerical entity reference here. I think it’s &#38; but I could be wrong.

“Article [space] 1” is not a valid value.

curses silently How did I miss that?

hangs head in shame

How do you know that? Have you seen the DTD or XML Schema for this document? If the name attribute is defined as NAME or ID there are restrictions, but if it’s defined as CDATA the attribute value can contain any valid character.

The &amp; character entity is predefined in both HTML and XML, so it should always be safe to use. Using an NCR (&#38;) as Dan suggests is also valid, but shouldn’t be necessary.
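To illustrate that the named entity and the NCR are interchangeable to an XML parser, here is a small sketch. It assumes SimpleXML is available, and the one-element documents and the "x?a=1" URL are made up for the example:

```php
<?php
// Hypothetical one-element documents; "x?a=1" is a placeholder URL.
$named   = simplexml_load_string('<article url="x?a=1&amp;p=4"/>');
$numeric = simplexml_load_string('<article url="x?a=1&#38;p=4"/>');

// Both references decode to the same single character.
$fromNamed   = (string) $named['url'];
$fromNumeric = (string) $numeric['url'];

var_dump($fromNamed === $fromNumeric); // bool(true)
echo $fromNamed, "\n";                 // x?a=1&p=4
```

Since both forms produce the same decoded value, switching from one to the other in the XML file would not be expected to change what the PHP page outputs.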

Tommy, vBulletin parses NCRs here, so you’ll have to type it out as &#38; as I did, but with the #38; part being duplicated.

I know, and I did. But then I edited the message and stupid vBulletin changed it back. :rolleyes:

Thanks for all your help…

Dan, I have tried that NCR - the problem persists though.

Also, the name attribute has been declared as CDATA - sorry that I did not mention that :slight_smile:

Anyway - I have solved the problem - although I have done it in a really messy ad hoc way.

I just replaced all instances of & with my own made-up character replacement - I used [AMP]

Then - in the PHP which parses the XML I used str_replace on the URL attribute to replace all instances of [AMP] with &amp;

$attribs['url'] = str_replace('[AMP]','&amp;',$attribs['url']); // <-- Replaces [AMP] with &amp;

It’s so odd that I have to do this for it to work!
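For comparison, a sketch of one possible alternative to the [AMP] placeholder: keep &amp; in the XML and escape the decoded value at output time with htmlspecialchars(). The $attribs array below is hypothetical, standing in for whatever the parser callback receives, and the URL is a placeholder:

```php
<?php
// Hypothetical attribute array, as a parser callback might receive it
// after the XML parser has already decoded &amp; to &.
$attribs = [
    'name'  => 'Article 1',
    'url'   => 'http://example.com/?a=1&p=4', // placeholder URL
    'color' => 'green',
];

// Escape at output time instead of swapping in a custom placeholder.
$href  = htmlspecialchars($attribs['url'], ENT_QUOTES, 'UTF-8');
$label = htmlspecialchars($attribs['name'], ENT_QUOTES, 'UTF-8');

$link = '<a href="' . $href . '">' . $label . '</a>';
echo $link, "\n"; // <a href="http://example.com/?a=1&amp;p=4">Article 1</a>
```

This has the same effect on the URL as the str_replace approach, but it also covers other special characters such as < and quotes, and the XML file stays standard.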

You have to type it like this: &#38;#38; if you don’t want it to get parsed.

I know. That’s what I did. But then I edited the post, which means vBulletin changes it to &#38; in the textarea and when I saved that was interpreted.

I simply forgot to redo the escape when I edited the post. If you look at the post now, you’ll see that it’s okay again (since I re-edited).

Well, at least everyone else who reads this thread will know what to do from now on! :irock:

I wonder if something like what vBulletin does during parsing is also happening with the XML parser (something like needing to escape twice for a double pass?). If so, using &#38;#38; instead of the [AMP] replacement might work.

No… (Unless you’re dealing with entity references.)