  1. #1
    winterheat

    JavaScript to handle all Unicode in AJAX

    In my JavaScript, I am building the URL for an AJAX query like this:

    url = "http://www.website.com/foo.php?q=" + escape(document.getElementById("keyword").value)

    All goes well until I try Chinese or French characters.

    For example, if I enter 和記 into the search box, that part of the URL becomes foo.php?q=%u548C%u8A18.

    This %u548C format is rarely seen, as opposed to %E5%92%8C%E8%A8%98, which is the UTF-8 form that browsers usually generate. It is also different from &#21644;, which is the HTML numeric character reference.

    The question is: should encodeURI() or encodeURIComponent() be used instead of escape() in JavaScript?

    If I use one of them, I get UTF-8 and it works well in PHP. Otherwise, I need to convert %u548C back to UTF-8 before processing it.

    The same goes for French. If the word véritable is entered into the search box, escape() gives v%E9ritable, while encodeURI() or encodeURIComponent() gives v%C3%A9ritable, which is UTF-8. That works well because it is consistently UTF-8, so there is no need to handle %u548C and %E9 as two different cases. (The final task is to get info from YouTube using that search string, and it must work when the search string contains spaces, single quotes, double quotes, and international characters.)

    So is encodeURI() or encodeURIComponent() the one to use? In what case do we use escape(), then? There also doesn't seem to be a PHP unescape function that works well with JavaScript's escape().

    Thanks!

  2. #2
    Dan Grossman
    You've already discovered the solution, yes.

    escape() is only safe for characters in the ASCII range.

    encodeURI() and encodeURIComponent() are safe for the full Unicode range: they always produce percent-encoded UTF-8.
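
As a minimal sketch of the fix for the query in the first post: the helper name buildSearchUrl is hypothetical, and the endpoint is just the placeholder URL from the question. It uses encodeURIComponent() rather than encodeURI() because the value is a single query parameter, and encodeURI() leaves delimiters such as & and = unescaped.

```javascript
// Hypothetical helper; the endpoint is the placeholder URL from the question.
function buildSearchUrl(keyword) {
  // encodeURIComponent() percent-encodes the UTF-8 bytes of each character
  // and also escapes URL delimiters such as & = ? and #.
  return "http://www.website.com/foo.php?q=" + encodeURIComponent(keyword);
}

const chineseUrl = buildSearchUrl("和記");
const frenchUrl = buildSearchUrl("véritable");

// encodeURI() is intended for whole URLs, so it leaves delimiters untouched.
const wholeUrlStyle = encodeURI("a&b=c");
const componentStyle = encodeURIComponent("a&b=c");

console.log(chineseUrl);     // http://www.website.com/foo.php?q=%E5%92%8C%E8%A8%98
console.log(frenchUrl);      // http://www.website.com/foo.php?q=v%C3%A9ritable
console.log(wholeUrlStyle);  // a&b=c
console.log(componentStyle); // a%26b%3Dc
```

On the PHP side the value then arrives as consistent UTF-8, with no %uXXXX form to special-case.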

  3. #3
    winterheat
    I realize that when I type

    don't say goodbye

    into a search box (say, one that submits to the YouTube search), the URL will contain

    don%27t+say+goodbye

    but if I use encodeURI() or encodeURIComponent(), the return value is

    don't%20say%20goodbye

    So encodeURI() and encodeURIComponent() are not entirely the same as how the browser encodes a form field into a URL. Is there a function that is identical to it? In PHP, there are urlencode() and rawurlencode(): the first converts a space to "+", while the second converts it to "%20".

  4. #4
    felgall
    There is no built-in method for doing that; you would need to build your own. A one-line regular expression should be able to do it, provided that you can set up the map of from/to values.
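
A rough sketch of such a function, assuming the goal is the form-style encoding shown in the previous post (space as "+", apostrophe as %27). The name formEncode is hypothetical. It starts from encodeURIComponent() and then escapes the few characters that function leaves alone but form submission encodes. Current browsers and Node.js also provide URLSearchParams, which serializes in this form-style encoding, so the hand-rolled version matters mainly where that API is unavailable.

```javascript
// Hypothetical helper: form-style encoding built on encodeURIComponent().
function formEncode(value) {
  return encodeURIComponent(value)
    // encodeURIComponent() leaves ! ' ( ) ~ unescaped; encode them as %XX.
    .replace(/[!'()~]/g, function (c) {
      return "%" + c.charCodeAt(0).toString(16).toUpperCase();
    })
    // Form encoding writes a space as "+" instead of "%20".
    .replace(/%20/g, "+");
}

const handRolled = formEncode("don't say goodbye");

// URLSearchParams applies form-style encoding when converted to a string.
const viaSearchParams = new URLSearchParams({ q: "don't say goodbye" }).toString();

console.log(handRolled);      // don%27t+say+goodbye
console.log(viaSearchParams); // q=don%27t+say+goodbye
```

This is not guaranteed to match PHP's urlencode() byte for byte (for example, the two treat * differently), but PHP decodes either form of the query string to the same value.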
    Stephen J Chapman
