  1. #1
    SitePoint Enthusiast
    Join Date
    Jul 2006
    Posts
    97

    special characters (Greek) utf-8 and url

    Hello all,

    I've built a CMS and am having problems with Greek characters when displaying the URL.
    Example:
    http://localhost/portal/cms/articles...6%CE%AE&id=165

    while in English it works fine:
    http://localhost/portal/cms/articles...sources&id=168

    I'm using Smarty with a plugin that applies urlencode, which works fine.
    The problem is only with Greek characters: the URL shows %CE%A0 etc. instead of the actual Greek characters.

    Below is the link syntax in my .tpl file.
    <a href="articles/article.php?category_name={$category_name[article]|url_encode}&subcategory_name={$subname[article]|url_encode}&id={$article_id[article]|url_encode}">{t}Read more{/t}</a>

    Any ideas how I can bypass this problem?

    Many thanks

  2. #2
    SitePoint Zealot
    Join Date
    Jun 2007
    Posts
    150
    Yes, the URI specification (RFC 3986), which HTTP URLs follow, requires special characters to be percent-encoded.

  3. #3
    SitePoint Enthusiast
    Join Date
    Jul 2006
    Posts
    97
    Which means?
    Would you please elaborate a little more?

    Thanks

  4. #4
    SitePoint Addict
    Join Date
    Dec 2004
    Posts
    240
    The URL must show %CE%A0 etc. instead of the Greek characters. You have to urlencode() every value in the query string of your URL. I have not used Smarty, but I assume "|url_encode" in your code does exactly that: it URL-encodes the values.

  5. #5
    SitePoint Enthusiast
    Join Date
    Jul 2006
    Posts
    97
    Yes, |url_encode URL-encodes the values, but I still don't follow your answer. What do you mean? Do I need to keep the Greek characters as they are?
    Why is that?

  6. #6
    SitePoint Addict
    Join Date
    Dec 2004
    Posts
    240
    All values in your URL query string must be urlencoded. If they were not, you would have to encode them yourself. Since they are already urlencoded, this is the correct behavior.
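
    To illustrate why the encoded form is harmless, here is a minimal sketch in Python rather than PHP (Python's quote_plus behaves much like PHP's urlencode() for UTF-8 text). The category value "Νέα" is a made-up example, not taken from the CMS above. The point is that the percent-encoded value is decoded back to the original Greek text when the query string is parsed.

```python
from urllib.parse import quote_plus, parse_qs

# Hypothetical category name, used only for illustration.
category = "Νέα"

# Encode the value the way a URL-encoding filter would:
# each UTF-8 byte of a non-ASCII character becomes %XX.
encoded = quote_plus(category)
print(encoded)

# Build a query string like the one in the link and parse it back,
# as a server-side script would see it after decoding.
query = f"category_name={encoded}&id=165"
params = parse_qs(query)
print(params["category_name"][0])
```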

  7. #7
    Sesame Street
    Join Date
    Feb 2006
    Posts
    662
    Only a subset of ASCII characters is allowed in URLs. Non-ASCII characters must be escaped using percent-encoding, which is also called URL encoding in programming. Each byte of a non-ASCII character's encoded form (typically UTF-8) is written as its two-digit hexadecimal value, preceded by a percent (%) character. Check the Wikipedia article on Percent Encoding for a broader view of the topic.
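
    As a rough sketch of that byte-by-byte conversion, written in Python and assuming the text is UTF-8: the helper name percent_encode is hypothetical and, unlike a real encoder, it escapes every byte, including plain ASCII ones.

```python
from urllib.parse import quote


def percent_encode(text: str) -> str:
    """Write every UTF-8 byte of text as %XX (illustrative helper only)."""
    return "".join(f"%{byte:02X}" for byte in text.encode("utf-8"))


# Greek capital pi (U+03A0) is two bytes in UTF-8: 0xCE 0xA0.
pi_bytes = "Π".encode("utf-8")
print(pi_bytes.hex())

# The manual helper and the standard library agree for this character.
print(percent_encode("Π"))
print(quote("Π"))
```

    This is where the %CE%A0 seen in the original URL comes from: it is simply the two UTF-8 bytes of Π.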

    Note, however, that IRIs (Internationalized Resource Identifiers), which allow non-ASCII characters in URLs, are rapidly gaining support among browser vendors. Opera 9 and Firefox 3 support IRIs well, and IE7 claims to as well, but it never worked for me.
    Imagination is more important than knowledge. - Einstein

  8. #8
    SitePoint Enthusiast
    Join Date
    Jul 2006
    Posts
    97
    thanks a lot for the explanation!

  9. #9
    SitePoint Enthusiast
    Join Date
    Jul 2006
    Posts
    97
    I've used mod_rewrite to make SEO-friendly URLs, but I still have a problem, only with non-English characters.

    In my .htaccess file I have:
    PHP Code:
    RewriteEngine on
    RewriteRule ^/?cms/([a-zA-Z+\ _]+)/([a-zA-Z+\ _]+)/([a-zA-Z+\ _]+)$ /\~manos/portal/cms/articles/article.php?category_name=$1&subcategory_name=$2&title=$3 [L]
    which gives me a URL like:
    PHP Code:
    http://localhost/~manos/portal/cms/Company/Human+Resources/test+title 
    In my .tpl (Smarty template) file I wrote the link (read more...) like:
    PHP Code:
    <a href="{$category_name[article]|url_encode}/{$subname[article]|url_encode}/{$title[article]|url_encode}">{t}Read more{/t}</a>
    The code above works fine in the English section, but in non-English languages such as Greek I get a 404 error, even though the link is displayed correctly when I hover the cursor over it.

    Any ideas?

    Thanks a lot

  10. #10
    SitePoint Enthusiast
    Join Date
    Jul 2006
    Posts
    97
    In case it helps, these errors appear in the Apache error log file:

    File does not exist: /home/manos/public_html/portal/cms/\xce\xa0\xcf\x81\xce\xbf\xce\xb9\xcf\x8c\xce\xbd\xcf\x84\xce\xb1, referer: http://localhost/~manos/portal/cms/home.php
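
    The escaped bytes in that log line are the raw UTF-8 form of the Greek path segment, which suggests the rule is being tested against the decoded path rather than the %XX form (Apache documents that RewriteRule patterns are matched against the %-decoded URL-path). A small Python sketch of that reading, using Python's re module as a stand-in for Apache's regex engine, which is an assumption:

```python
import re

# Bytes copied from the error log entry above.
logged = b"\xce\xa0\xcf\x81\xce\xbf\xce\xb9\xcf\x8c\xce\xbd\xcf\x84\xce\xb1"

# Decode them as UTF-8 to recover the Greek path segment.
segment = logged.decode("utf-8")
print(segment)

# Character class from the first RewriteRule: letters, plus, space, underscore.
ascii_only = re.compile(r"[a-zA-Z+ _]+")

# An English segment matches; the Greek one does not, so the rule
# never fires and the request falls through to a 404.
print(bool(ascii_only.fullmatch("Human+Resources")))
print(bool(ascii_only.fullmatch(segment)))
```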

  11. #11
    SitePoint Enthusiast
    Join Date
    Jul 2006
    Posts
    97
    OK, I figured out the problem with the non-English characters. I added %\w to the character classes and the problem was solved, but now instead of the + symbol it shows %2B. Any ideas?

    Here is my .htaccess:
    PHP Code:
    RewriteRule ^/?cms/([a-zA-Z+\ _%\w]+)/([a-zA-Z+\ _%\w]+)/([a-zA-Z+\ _%\w]+)$ /\~manos/portal/cms/articles/article.php?category_name=$1&subcategory_name=$2&title=$3 [L]
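
    One common way a %2B can appear, shown here only as a possibility and not as a confirmed diagnosis of this setup, is that a value which already contains a literal + gets URL-encoded a second time. A short Python sketch with a hypothetical title:

```python
from urllib.parse import quote_plus, unquote_plus

# Hypothetical title, used only for illustration.
title = "test title"

# First pass: the space becomes "+".
once = quote_plus(title)
print(once)

# Second pass: the "+" is now a literal character and becomes "%2B".
twice = quote_plus(once)
print(twice)

# Decoding once more only undoes the outer layer.
print(unquote_plus(twice))
```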

