Hi,
I have a question regarding strings and Unicode in JavaScript.

If I have a server-side script that produces UTF-8 encoded output (with proper headers, etc.) and I load it with XMLHttpRequest, I get a string that is not UTF-8 encoded. I think it is in something like JavaScript's internal Unicode encoding, because mystring.length returns the number of characters rather than the number of bytes (e.g. if the response contains the string "ł", the length is 1).
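
A minimal sketch of how the two counts differ: the length of the string versus the number of bytes in its UTF-8 encoding. It assumes TextEncoder is available (modern browsers, Node.js), which is not part of the setup described above.

```javascript
// Assumption: TextEncoder exists in the environment (modern browsers, Node.js).
const s = "\u0142"; // the single character "ł"

// String length is counted in UTF-16 code units, not in UTF-8 bytes.
const charCount = s.length;

// Encoding to UTF-8 yields a Uint8Array; its length is the byte count.
const utf8ByteCount = new TextEncoder().encode(s).length;

console.log(charCount, utf8ByteCount); // prints: 1 2
```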

I then encode the string with this function
http://phpjs.org/functions/utf8_encode:577
which says it encodes ISO-8859-1 strings to UTF-8, but I wonder whether my input strings really are ISO-8859-1 (I think not, because the function seems to work for any Unicode character).
Conclusion: Whenever I do an XMLHttpRequest I use this function to display the data or use the data to make other requests.
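
As far as I can tell, a function of this kind turns each character into its UTF-8 bytes and returns them as a string with one character per byte. The following is only a rough sketch of that behavior, not the phpjs code itself; toUtf8ByteString is a hypothetical name, and TextEncoder is assumed to be available.

```javascript
// Hypothetical stand-in for a utf8_encode-style function (NOT the phpjs source).
// Assumption: TextEncoder is available (modern browsers, Node.js).
function toUtf8ByteString(str) {
  const bytes = new TextEncoder().encode(str); // UTF-8 bytes as a Uint8Array
  let out = "";
  for (const b of bytes) {
    out += String.fromCharCode(b); // one string character per byte
  }
  return out;
}

const input = "\u00f1"; // "ñ", one character
const encoded = toUtf8ByteString(input);

// "ñ" is the two bytes 0xC3 0xB1 in UTF-8, so the result has two characters.
console.log(encoded.length); // prints: 2
console.log(encoded.charCodeAt(0).toString(16), encoded.charCodeAt(1).toString(16)); // prints: c3 b1
```

If such a two-character result is later treated as ordinary text, it would show up as "Ã±" instead of "ñ".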

Now I want to extract some data from a document (e.g. document.getElementById("myid").innerHTML). The resulting string is already UTF-8 encoded, so re-encoding with my function would "destroy" the string.
Conclusion: Whenever I extract something from the document, I do not UTF-8 encode the data.

Now I want to use JSON, and it is becoming a mess. In script 1, I load the raw JSON text, UTF-8 encode it (everything is still fine), stringify parts of it, and add the result to a script 2 that I dynamically embed into the page (by creating a new <script> tag).
Now I get problems. For example, characters like "ñ" are now "Ã±" in script 2, while being displayed correctly in script 1 (this looks like double encoding to me, but why? I guess they are also automatically UTF-8 encoded during the <script> tag generation. Is this correct?).
I can solve this problem if I leave out the UTF-8 encoding step in script 1. Then everything works in script 2, but the encoding would be wrong in script 1. I think I could also solve this by decoding the strings before adding them to script 2, but I haven't tried that yet.
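
A small sketch of the situation described above, under stated assumptions: encodeToByteString and decodeFromByteString are hypothetical helpers standing in for a utf8_encode/utf8_decode pair, TextEncoder and TextDecoder are assumed to be available, and the JSON payload is made up for illustration.

```javascript
// Hypothetical helpers; assumption: TextEncoder/TextDecoder are available.
function encodeToByteString(str) {
  // One string character per UTF-8 byte.
  return String.fromCharCode(...new TextEncoder().encode(str));
}

function decodeFromByteString(byteString) {
  // Turn each character back into a byte, then decode the bytes as UTF-8.
  const bytes = Uint8Array.from(byteString, (ch) => ch.charCodeAt(0));
  return new TextDecoder("utf-8").decode(bytes);
}

// Made-up JSON text; after parsing, the value is the 6-character string "mañana".
const original = JSON.parse('{"name":"ma\\u00f1ana"}').name;

// The extra encoding step turns "ñ" into the two characters "Ã±".
const encodedOnce = encodeToByteString(original);

// Stringifying and embedding the encoded value carries "Ã±" into the new script text.
const scriptSource = "var name = " + JSON.stringify(encodedOnce) + ";";

// Decoding before embedding restores the original string.
const repaired = decodeFromByteString(encodedOnce);

console.log(original.length, encodedOnce.length); // prints: 6 7
console.log(repaired === original); // prints: true
```

In this sketch the generated script text already contains "Ã±" before any <script> tag is created, which would suggest that the extra characters come from the explicit encoding step rather than from the tag generation.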

As you can see, I am a bit confused. I do not really know when a string is stored in JavaScript's internal Unicode representation and when it is stored as a UTF-8 encoded byte string. This always leads to a lot of trial-and-error coding when I develop a JavaScript app that handles Unicode strings and has to do server I/O.

Is there any general rule, or some documentation, that clearly says how JavaScript really handles strings?