Decoded quotes in JavaScript?

Hi!

I’m using jQuery to send the HTML in the body tag of a page via Ajax to a MySQL database. The special characters are encoded in the HTML. Checking the content before sending shows me correctly encoded special chars.

But when I check $('body').html() just before passing it to the AJAX function, single and double quotes are decoded. So when the HTML is sent to the DB, it often gets cut off at a quote and not all of the code reaches the DB.

Is there something in JS that decodes single and double quotes automatically? Or maybe it’s jQuery? Thanks for any help!

Per the HTML spec, browsers automatically decode HTML entities in a URL when sending a request. If you’re using the GET method, then your data is part of the query string, hence part of the URL. Try base64 encoding the content before you send it. There are free JavaScript encoding libraries for this; just Google one.
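
A minimal sketch of the base64 idea, assuming a browser or runtime that provides btoa, atob, TextEncoder and TextDecoder. The names toBase64Utf8 and fromBase64Utf8 are hypothetical helpers, not part of any library, and the server side would need to decode the value before storing it:

```javascript
// Hypothetical helper: encode a string as UTF-8 bytes, then as base64.
// btoa() alone only accepts characters in the Latin-1 range, hence the byte step.
function toBase64Utf8(text) {
  var bytes = new TextEncoder().encode(text);
  var binary = '';
  for (var i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Hypothetical helper: reverse of toBase64Utf8.
function fromBase64Utf8(b64) {
  var binary = atob(b64);
  var bytes = new Uint8Array(binary.length);
  for (var i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new TextDecoder().decode(bytes);
}

var sample = '<p class="x">Lo"rem ip\'sum \u201Cquoted\u201D</p>';
var encoded = toBase64Utf8(sample); // contains only A-Z, a-z, 0-9, +, / and =
var decoded = fromBase64Utf8(encoded);
```

The encoded value contains no quotes at all, but +, / and = would still need URL-encoding if it ever travelled in a query string.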

Thanks for the info, but that’s not quite it.

You see, I haven’t even sent the data to the AJAX routine yet. That part is done via jQuery’s $.ajax method, sending the data as POST, and it works fine.

I’m just checking the encoding of the value right before sending, and all the special chars are encoded except the quotes. I’m not sure where the decoding is coming from.

The steps I use before sending the HTML code to the DB (via jQuery AJAX) are:

  • Use FCKeditor to edit a div’s content inline.
  • Check the saved content of the div that was edited: all is encoded perfectly, including single/double quotes.
  • Use the jQuery command $('body').html() to grab the whole HTML of the body of the page.
  • Check the $('body').html() result: all is encoded ->except<- quotes (single and double).

So could it be jQuery decoding just the quotes when using $('body').html() to get the HTML code? Thanks!

What those JavaScript rich text editors all have in common is, they’re based on browser editing features that seldom pass up an opportunity to modify your code to suit Mozilla’s or Microsoft’s coding preferences. I can almost swear without seeing it that that’s what is happening here.

I’ve removed FCKeditor from the equation just to be sure it didn’t interfere. I’ve hardcoded the text in a div like so:


<div id="test1">
<p>Lo&quot;rem ip'sum do&amp;lor sit a&lt;me&gt;t, consectetuer adipiscing elit. Curabitur et felis at dolor egestas vehicula. Nullam tortor tellus, rhoncus non, sodales at, venenatis sed, nunc.</p>
</div>

(Note that the single quote is really & #039; in the code; the board just can’t show it.) I then just use: alert($('#test1').html());

… and it shows me the following:


<div id="test1">
<p>Lo"rem ip'sum do&amp;lor sit a&lt;me&gt;t, consectetuer adipiscing elit. Curabitur et felis at dolor egestas vehicula. Nullam tortor tellus, rhoncus non, sodales at, venenatis sed, nunc.</p>
</div>

As you can see, the ampersand and the greater-than/less-than signs are kept encoded. But the single/double quotes have been decoded! Any ideas? Thanks again!
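
For reference, the output above is consistent with how browsers turn the DOM back into HTML. The parser decodes every entity when the page loads, and the serializer then re-escapes only the characters that would be ambiguous in text: &, < and >, plus non-breaking spaces. Quotes are harmless in text content, so they come back as literal characters. A minimal sketch of that text-escaping rule follows; serializeText is a hypothetical name, and the function is a simplified model, not actual browser code:

```javascript
// Simplified model of how a text node is escaped when innerHTML is read.
// serializeText is a hypothetical name, not a DOM API.
// Note that " and ' are deliberately left untouched.
function serializeText(text) {
  return text
    .replace(/&/g, '&amp;') // ampersand first, so later escapes are not double-encoded
    .replace(/\u00a0/g, '&nbsp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// The text as it exists in the DOM after parsing: every entity is already decoded.
var parsedText = 'Lo"rem ip\'sum do&lor sit a<me>t';

// What reading innerHTML would give back for that text under this model.
var serialized = serializeText(parsedText);
```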

Does the same thing happen if you do this? It will tell you whether jQuery is decoding them or not.


alert(document.getElementById('test1').innerHTML);

Great thinking! And, yep, it also decodes the quotes with this command. So, the next question is: why does JavaScript decode single/double quotes, and how can I bypass this?

I’m a bit puzzled, since I’m sending HTML code to the DB and can’t simply re-encode every single/double quote. Some of those quotes are needed (the ones in the markup itself), just not the ones in the actual text.

I could escape everything and unescape when it comes back from the DB but I’d rather have cleanly encoded HTML from the start. Thanks!
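
One possible way around this, sketched under some assumptions: since the quotes inside tags must stay literal while the ones in the text may be re-encoded, the HTML string can be split into tags and text, and only the text pieces re-encoded. encodeQuotesInText is a hypothetical helper. It assumes there is no > character inside attribute values, and no script or style elements or comments in the HTML being processed:

```javascript
// Hypothetical helper: re-encode quotes in text segments only, leaving tags untouched.
// Assumes no ">" inside attribute values and no <script>, <style> or comment content.
function encodeQuotesInText(html) {
  // Splitting on a capturing group keeps the tags in the result, at odd indexes.
  return html
    .split(/(<[^>]*>)/)
    .map(function (part, i) {
      if (i % 2 === 1) {
        return part; // a tag: keep its attribute quotes as they are
      }
      return part.replace(/"/g, '&quot;').replace(/'/g, '&#039;');
    })
    .join('');
}

var input = '<div id="test1"><p>Lo"rem ip\'sum do&amp;lor</p></div>';
var output = encodeQuotesInText(input);
```

In the page itself this could be applied to the string returned by $('body').html() just before it is handed to $.ajax.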

Why do you use entities for quotes in your HTML in the first place? It’s not necessary unless they’re inside an element attribute’s value.

Why use entities? Because entities are one of the correct ways to do it. It removes the risk of the quote inadvertently being used, and it allows you to use the right entities instead.

The ' character is often typed where a prime, &prime; (′), belongs to indicate feet, whereas " is typed where a double prime, &Prime; (″), belongs to indicate inches.

A double quote is &quot;, but really the proper quotes should be used instead: &ldquo; (“) for a left double quote and &rdquo; (”) for a right double quote.

When using an apostrophe, that should be &rsquo; (’), the right single quote.

It’s called markup, and is what the M in HTML is all about.
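
For reference, a small lookup of the characters discussed in this post and their named entities. The names TYPOGRAPHIC_ENTITIES and entityFor are hypothetical and only serve the illustration:

```javascript
// Characters discussed above, keyed by code point, with their named HTML entities.
var TYPOGRAPHIC_ENTITIES = {
  '\u0022': '&quot;',  // straight double quote
  '\u2019': '&rsquo;', // right single quote, also used as apostrophe
  '\u201C': '&ldquo;', // left double quote
  '\u201D': '&rdquo;', // right double quote
  '\u2032': '&prime;', // prime (feet)
  '\u2033': '&Prime;'  // double prime (inches)
};

// Hypothetical helper: return the named entity for a character, or the character itself.
function entityFor(ch) {
  return Object.prototype.hasOwnProperty.call(TYPOGRAPHIC_ENTITIES, ch)
    ? TYPOGRAPHIC_ENTITIES[ch]
    : ch;
}
```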