  1. #1
    SitePoint Member
    Join Date
    Jul 2013
    Posts
    12

    vBulletin character encoding problems

    Look at http://otakuhelpers.com/#news_2 under "entertainment news". I have a problem with certain characters and Japanese characters: entities such as "& #8220;" and "& #8217;" (without the white space) show up unconverted.

    vBulletin causes the page to be sent as ISO-8859-1 when the page is not cached, since that is when vBulletin is imported.

    How do I make vBulletin send pages as UTF-8?

  2. #2
    SpacePhoenix
    Join Date
    May 2007
    Location
    Poole, UK
    Posts
    5,068
    What charset/character encoding is vBulletin currently set to use when interacting with its database?

  3. #3
    SitePoint Member
    Join Date
    Jul 2013
    Posts
    12
    How do I check that? I think it's latin1_swedish_ci?

    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /> is what it sends to the browser.

  4. #4
    SitePoint Wizard
    Join Date
    Oct 2005
    Posts
    1,849
    I viewed your site using Firefox 24.

    The Firefox Web Developer Inspector shows your Content-Type header is screwed up. You are sending:

    Code:
    Content-Type: text/html; charset: utf-8;charset=utf-8
    Should be:

    Code:
    Content-Type: text/html; charset=utf-8
    http://www.w3.org/International/ques...access-charset
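    To make the difference between the two header lines concrete, here is a minimal sketch that pulls the charset out of a Content-Type line. The function name charsetFromContentType is hypothetical (it is not part of PHP or vBulletin), and the regular expression is a simplification that only looks for a "charset=" parameter.

```php
<?php
// Hypothetical helper: extract the charset from a Content-Type header line.
// Returns null when the line has no "charset=" parameter.
function charsetFromContentType(string $headerLine): ?string
{
    if (preg_match('/;\s*charset\s*=\s*"?([^";\s]+)"?/i', $headerLine, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// Well-formed line: the charset parameter is found.
var_dump(charsetFromContentType('Content-Type: text/html; charset=utf-8'));

// Colon instead of equals sign: no charset parameter is found.
var_dump(charsetFromContentType('content-type: text/html; charset: utf-8'));
```

    A line written with "charset: utf-8" carries no charset parameter that this check can find, which is why the equals sign matters.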

    Secondly, your HTML entities are screwed up. Your HTML source shows:

    Code:
    &amp;#8211;
    A numeric character reference should begin with a literal ampersand (the & character), not with the named entity &amp;.

    Are you running htmlspecialchars() or htmlentities() on the ampersand character when you should not be? Your PHP could be taking the & character that starts the numeric reference and converting it into the named entity &amp;.
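    As an illustration of how that double encoding can happen, here is a small sketch using htmlspecialchars(). The sample string is made up, and this is only one way the problem might arise; the script in question may be doing the conversion differently. The fourth argument (double_encode) tells PHP whether to re-encode entities that are already present.

```php
<?php
// Made-up sample text that already contains numeric character references.
$imported = 'He said &#8220;hi&#8221; & left';

// Default behaviour: every & is escaped, so existing references get mangled.
$doubleEncoded = htmlspecialchars($imported, ENT_QUOTES, 'UTF-8');

// With double_encode set to false, existing entities are left alone,
// while the bare & is still escaped.
$safe = htmlspecialchars($imported, ENT_QUOTES, 'UTF-8', false);

echo $doubleEncoded, "\n"; // He said &amp;#8220;hi&amp;#8221; &amp; left
echo $safe, "\n";          // He said &#8220;hi&#8221; &amp; left
```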

  5. #5
    SitePoint Member
    Join Date
    Jul 2013
    Posts
    12
    Thanks for pointing out the
    Code:
    &amp;
    I had a line in the core of my script that converted & to that to make the page validate, and I didn't notice it was doing this.

    One problem fixed!

    However, I don't know how to fix the character encoding issue. I think it's the way I include the vBulletin global.php file when the page is not cached, but I'm not sure.

    PHP Code:
    header('content-type: text/html; charset: utf-8');
    ini_set('default_charset', 'utf-8');
    is at the top of the global.php file of my script (included at the start of every page)

    Code:
    <FilesMatch "\.(htm|html|css|js|php)$">
       AddDefaultCharset UTF-8
       DefaultLanguage en-US
    </FilesMatch>
    is in my .htaccess

  6. #6
    SitePoint Wizard
    Join Date
    Oct 2005
    Posts
    1,849
    Quote Originally Posted by Nination View Post
    PHP Code:
    header('content-type: text/html; charset: utf-8');
    ini_set('default_charset', 'utf-8');
    is at the top of the global.php file of my script (included at the start of every page)

    Code:
    <FilesMatch "\.(htm|html|css|js|php)$">
       AddDefaultCharset UTF-8
       DefaultLanguage en-US
    </FilesMatch>
    is in my .htaccess
    First off, this line:

    Code:
    header('content-type: text/html; charset: utf-8');
    Appears to me to be wrong. An equal sign should be used, not a colon.

    Code:
    charset=utf-8
    Is that your code above? I don't see how vBulletin would get that wrong. For now, get rid of the UTF-8 line in your .htaccess file: comment it out by placing a # sign at the start of the line. Then debug from there. My guess is that vBulletin is sending a UTF-8 header, meaning you don't have to. Once you get this bug fixed, you can uncomment the line in .htaccess, see if everything still works, and keep using it if there is no bad effect. Right now you need to simplify to narrow down the problem.
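    For reference, a minimal sketch of what the corrected lines at the top of the script might look like. The helper name contentTypeHeader is hypothetical and only there to make the format explicit; the headers_sent() guard is an added precaution, not something from the original code.

```php
<?php
// Hypothetical helper: build a Content-Type line using "charset=" (equals sign).
function contentTypeHeader(string $mime, string $charset): string
{
    return sprintf('Content-Type: %s; charset=%s', $mime, $charset);
}

// header() only has an effect before any output has been sent.
if (!headers_sent()) {
    header(contentTypeHeader('text/html', 'utf-8'));
}

// Note the comma between the setting name and its value.
ini_set('default_charset', 'utf-8');
```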

  7. #7
    SitePoint Member
    Join Date
    Jul 2013
    Posts
    12
    Quote Originally Posted by cheesedude View Post
    First off, this line:

    Code:
    header('content-type: text/html; charset: utf-8');
    Appears to me to be wrong. An equal sign should be used, not a colon.

    Code:
    charset=utf-8
    Is that your code above? I don't see how vBulletin would get that wrong. For now, get rid of the UTF-8 line in your .htaccess file: comment it out by placing a # sign at the start of the line. Then debug from there. My guess is that vBulletin is sending a UTF-8 header, meaning you don't have to. Once you get this bug fixed, you can uncomment the line in .htaccess, see if everything still works, and keep using it if there is no bad effect. Right now you need to simplify to narrow down the problem.
    That is my code, not vBulletin's. I changed the colon to an equals sign and commented out the code in the .htaccess. I think it works now, but I'm not sure. I know it works sometimes when the page is cached, but it does not work correctly when vBulletin is loaded (every 30 seconds, depending on whether you are the first user or not).

    I have to modify something in vBulletin to make it stop doing that to the page when vBulletin is loaded.
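
    One possible way to narrow this down, sketched under assumptions: PHP's header() replaces an earlier header of the same name by default, so whichever Content-Type call runs last is the one sent. The helper queuedContentTypes and the include path below are hypothetical, and this only changes the HTTP header; it does not change a meta tag in the page or convert text that is stored in another encoding.

```php
<?php
// Hypothetical helper: keep only the Content-Type lines from a list of headers.
function queuedContentTypes(array $headers): array
{
    return array_values(array_filter(
        $headers,
        fn (string $h): bool => stripos($h, 'Content-Type:') === 0
    ));
}

// require_once 'forum/global.php'; // hypothetical path to the vBulletin include

// Re-send the header after the include, as long as no output has gone out yet.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Log what is queued so the cached and uncached cases can be compared.
error_log('Queued: ' . implode(' | ', queuedContentTypes(headers_list())));
```

    Comparing the logged value on a cached request with one where vBulletin is loaded should show whether the header is being changed after the script sets it.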

