Thread: String encoding, Unicode, UTF
Jan 17, 2010, 13:53 #1
I then encode the string with this function
http ://phpjs .org/functions/utf8_encode:577
which says it encodes ISO-8859-1 strings to UTF-8, but I wonder whether my input strings really are ISO-8859-1 (I think not, because the function seems to work for any Unicode character).
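A rough sketch of what an encoder of this kind typically does may help here. This is not the phpjs source, and the name utf8EncodeSketch is made up for illustration. The assumption is that the function never looks at ISO-8859-1 at all: it walks the JavaScript string, which is already Unicode, and turns each code point into one to four UTF-8 byte values, each stored as a character in the range 0 to 255. That would explain why it seems to work for any Unicode character.

```javascript
// Hypothetical sketch, not the phpjs implementation.
// Turns a JavaScript string into a "byte string": one character per UTF-8 byte.
function utf8EncodeSketch(str) {
  let out = "";
  for (const ch of str) {                 // for...of iterates by code point
    const cp = ch.codePointAt(0);
    if (cp < 0x80) {                      // 1 byte: plain ASCII
      out += String.fromCharCode(cp);
    } else if (cp < 0x800) {              // 2 bytes
      out += String.fromCharCode(0xC0 | (cp >> 6), 0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {            // 3 bytes
      out += String.fromCharCode(
        0xE0 | (cp >> 12),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F)
      );
    } else {                              // 4 bytes: characters outside the BMP
      out += String.fromCharCode(
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F)
      );
    }
  }
  return out;
}

// "ñ" is U+00F1; its UTF-8 bytes are 0xC3 0xB1, which read as characters are "Ã±".
console.log(utf8EncodeSketch("ñ") === "\u00C3\u00B1"); // true
```

Note that the result is still an ordinary JavaScript string. It only makes sense to something that expects raw bytes; shown as text, it looks like the familiar garbled characters.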
Conclusion: Whenever I do an XMLHttpRequest, I run the response through this function before I display the data or use it to make other requests.
Now I want to extract some data from a document (e.g. document.getElementById("myid").innerHTML). The resulting string is already UTF-8 encoded, so re-encoding with my function would "destroy" the string.
Conclusion: Whenever I extract something from the document, I do not utf-8 encode the data.
Now I want to use JSON, and it is becoming a mess. In script 1, I load the raw JSON text, UTF-8-encode it (everything is still fine), stringify parts of it, and add the result to a script 2 that I dynamically embed into the page (by creating a new <script> tag).
Now I get problems. For example, characters like "ñ" are now "Ã±" in script 2, while being displayed correctly in script 1 (looks like a double encoding to me, but why? I guess they are also automatically utf-8 encoded during the <script>-tag generation. Is this correct?).
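The "Ã±" symptom can be reproduced in isolation, which helps to tell whether a string has been encoded once too often. The sketch below uses two hypothetical helpers, encodeOnce and decodeOnce, built on the standard (though deprecated) escape/unescape functions. It assumes nothing about how the <script> tag itself is processed; it only shows what a string looks like after one and after two encoding passes.

```javascript
// Hypothetical helpers for illustration; escape/unescape are deprecated
// but still available in browsers and Node.
function encodeOnce(str) {
  // Unicode string -> string with one character per UTF-8 byte
  return unescape(encodeURIComponent(str));
}

function decodeOnce(byteString) {
  // Inverse of encodeOnce; throws a URIError if the bytes are not valid UTF-8
  return decodeURIComponent(escape(byteString));
}

const original = "ñ";               // U+00F1
const once = encodeOnce(original);  // characters C3 B1
const twice = encodeOnce(once);     // characters C3 83 C2 B1

console.log(once);                  // Ã±
console.log(once.length);           // 2
console.log(twice.length);          // 4
console.log(decodeOnce(once) === original); // true
```

So a string that shows up as "Ã±" where "ñ" was expected has been through one encoding pass more than its consumer expects. That fits the double-encoding suspicion, although it does not say which step adds the extra pass.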
I can solve this problem if I leave out the UTF-8 encoding step in script 1. Then everything works in script 2, but the encoding would be wrong in script 1. I think I could also solve this by decoding the strings before adding them to script 2, but I haven't tried that yet.
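The untried idea of decoding before handing data to script 2 could look roughly like this. It is only a sketch under assumptions: utf8DecodeSketch and toAsciiJson are made-up names, and escaping every non-ASCII character as \uXXXX is an extra precaution, not something taken from the setup described above. The point of the escaping is that the generated script text is then pure ASCII, so the encoding of the page or of the <script> tag should not matter.

```javascript
// Hypothetical helper: undo one UTF-8 encoding pass (byte string -> Unicode string).
function utf8DecodeSketch(byteString) {
  return decodeURIComponent(escape(byteString));
}

// Hypothetical helper: JSON text in which every non-ASCII code unit
// is written as a \uXXXX escape.
function toAsciiJson(value) {
  return JSON.stringify(value).replace(/[\u007f-\uffff]/g, function (c) {
    return "\\u" + ("0000" + c.charCodeAt(0).toString(16)).slice(-4);
  });
}

// In script 1 the string has already been through the encoder: "ñ" became "Ã±".
const encodedName = "\u00C3\u00B1";

// Decode first, then build the source text for script 2.
const script2Source =
  "var payload = " + toAsciiJson({ name: utf8DecodeSketch(encodedName) }) + ";";

console.log(script2Source); // var payload = {"name":"\u00f1"};
```

In the page, script2Source would become the text of the new <script> element. When that script runs, payload.name is "ñ" again.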