I have set up a database using the collation utf8_bin, and all the fields are utf8_bin. When I enter Hebrew text into the MySQL database it appears fine in phpMyAdmin, but when I retrieve this text and echo it on my PHP site it comes out as a series of ???. What can I do about this? I searched for answers, and many people have asked the same question on different PHP forums, but I have yet to find a solution.
I think the problem is that all my entries are being stored as BLOBs instead of text, despite the fact that I have set the type to mediumtext.
I contacted the company that hosts my site and they have no idea why MySQL is behaving like this. They suggested that I try inserting the data into MySQL a different way, but I have already tried entering it both directly via phpMyAdmin and via a PHP page.
I managed to stop the mediumtext columns turning into BLOBs by changing the MySQL collation to utf8_unicode_ci and reimporting my tables (the collation was utf8_bin originally). But all the Hebrew text is STILL rendered as ??? when echoed onto my PHP page.
Mismatch between Public and System identifiers in the DOCTYPE declaration
This document uses an inconsistent DOCTYPE declaration. The Public Identifier -//W3C//DTD HTML 4.0 Transitional//EN declares the HTML 4.0 Transitional document type, but the associated System Identifier http://www.w3.org/TR/REC-html40/strict.dtd does not match this document type.
The recommended System Identifier for HTML 4.0 Transitional is http://www.w3.org/TR/1998/REC-html40-19980424/loose.dtd.
The safest way to use a correct DOCTYPE declaration is to copy and paste one from the recommended list and avoid editing that part of your markup by hand.
You need to correct the doctype. For HTML strict it’s
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
For HTML transitional it’s
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
The broken doctype might be confusing web browsers.
Hi SpacePhoenix, thank you for your advice. My site is now XHTML valid.
I managed to get PHP to display Hebrew retrieved from MySQL correctly. My web server is set to ‘character_set_server: latin’. I cannot change this as I am on shared hosting, so I used the PHP function ‘mysql_set_charset’ to set the character set of my site’s MySQL connection to utf8. I have put the following code on each page with Hebrew text, after the </html> tag:
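<?php
// Connect to MySQL, then set this connection's character set to utf8
// so text is sent and received as UTF-8 rather than the server default.
$link = mysql_connect('mysql2.steadfast.net', 'goatswi_leo', 'password') or die(mysql_error());
mysql_set_charset('utf8', $link);
?>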
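A note for anyone reading this on a current PHP version: the mysql_* functions used above were removed in PHP 7. Below is a minimal sketch of the same idea using mysqli. The helper name connect_utf8 is hypothetical, and the host, user, password and database name are placeholders you would supply yourself. The character set applies to the connection, so it has to be set before any query is run, not after the page has been output.

```php
<?php
// Hypothetical helper (not from the original thread): open a mysqli
// connection and switch it to utf8 before any query is sent.
function connect_utf8(string $host, string $user, string $password, string $database): mysqli
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $link = new mysqli($host, $user, $password, $database);

    // Same purpose as mysql_set_charset('utf8', $link) in the old API.
    $link->set_charset('utf8');

    return $link;
}

// Usage (placeholder values, call this at the top of the page):
// $link = connect_utf8('db.example.com', 'db_user', 'db_password', 'db_name');
```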
Hi Mr Space Phoenix, it works now, but my only concern is that I don’t really understand why the script works if I put it at the end of a page after the </html> tag, but if I put it anywhere else it just renders the Hebrew as gibberish. See this example where I put the script just before the </head> tag. Is there any logic to this? As it stands I can leave it after the </html> tag but I would prefer to understand how the script works a bit better.
I have now taken out all the mysql_set_charset functions from all the pages and the Hebrew still displays fine. I don’t get it: it only started working once I began using this script, and now that I’ve taken it out it still works??? However, when I try to add the Hebrew text directly to the database via phpMyAdmin, my site displays the text as gibberish. If I add it via a PHP CMS form I made, it works. But I was using this form from the start, so that doesn’t explain why the Hebrew suddenly displays now and not before. It’s great but a little concerning.
Possibly the hosting company might have tweaked something on the server. What CMS are you using?
Edit: I just tried copying and pasting a sample of the Hebrew into phpMyAdmin. It shows it as a load of ?, whilst in a test setup of phpBB it shows the Hebrew, so possibly phpMyAdmin is not set up to display and process Hebrew.
What I’ve noticed is that if I put the Hebrew text directly into the MySQL database via phpMyAdmin, the Hebrew appears fine in phpMyAdmin but as a lot of ??? on my website. Conversely, if I upload the text via my simple PHP CMS, the text appears as gibberish ◊ô◊¢◊ô◊®◊ in phpMyAdmin but as Hebrew on my website. Strange? Maybe my hosting company did change something, but it seems doubtful as I was in contact with them and they seem equally baffled.
Thanks again, Mr Phoenix,
Thank you for all the time you have put into this!
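One plausible reading of the two symptoms above (??? in one direction, gibberish in the other) is that the site's connection was talking latin1 while phpMyAdmin was talking UTF-8. Text saved through a latin1 connection is stored with every UTF-8 byte treated as a separate character, which reads back fine through the same latin1 connection but looks like gibberish anywhere else. Text saved correctly as UTF-8 cannot be expressed in latin1 at all, so a latin1 connection returns ? for each letter. The sketch below simulates both cases without a database. It assumes the mbstring extension is available, and it uses ISO-8859-1 as a stand-in for MySQL's latin1, which is not byte-for-byte identical.

```php
<?php
// Sketch only: simulates the two failure modes without a database.
// Requires the mbstring extension.

$hebrew = "שלום"; // four Hebrew letters, eight bytes in UTF-8

// Case 1: properly stored UTF-8 text is converted to latin1 on the way
// out. Hebrew has no latin1 equivalent, so each letter is replaced by
// the substitute character.
$asLatin1 = mb_convert_encoding($hebrew, 'ISO-8859-1', 'UTF-8');

// Case 2: the UTF-8 bytes are mistaken for latin1 text and converted to
// UTF-8 again ("double encoding"). Every original byte becomes its own
// character, which is the gibberish seen in a UTF-8 aware tool.
$doubleEncoded = mb_convert_encoding($hebrew, 'UTF-8', 'ISO-8859-1');

// Reading the double-encoded text back through the same wrong charset
// undoes the damage, which is why the site itself can look fine.
$roundTrip = mb_convert_encoding($doubleEncoded, 'ISO-8859-1', 'UTF-8');

echo $asLatin1, "\n";
echo mb_strlen($hebrew, 'UTF-8'), ' letters became ',
     mb_strlen($doubleEncoded, 'UTF-8'), " characters\n";
var_dump($roundTrip === $hebrew);
```

If this reading is right, the long-term fix is to use a UTF-8 connection everywhere and re-enter or convert the rows that were saved through the latin1 connection.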
After a little playing with the character encoding for the field in a test, I changed it to utf8_bin and it displays the Hebrew OK. I also added
mysql_set_charset('utf8',$link_id);
right after I connect to the db.
It now displays properly on a test page. It seems that unless you tell PHP explicitly to use UTF-8 for the connection, it falls back to whatever its default is. I read up on it a bit more and it’s possible to change PHP’s default encoding directive to UTF-8, but as you’re on shared hosting, chances are that you’ll have no control over what the directive is set to.
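For reference, the directive in question is most likely default_charset. A rough sketch of setting it per script is below, on the assumption that the host allows ini_set for it. Note that this only controls the charset PHP announces to the browser in the Content-Type header. It does not change the MySQL connection character set, so the set_charset call right after connecting is still needed.

```php
<?php
// Sketch: label this script's output as UTF-8 without touching php.ini.
ini_set('default_charset', 'UTF-8');

// Sending the header explicitly as well does no harm. It must happen
// before any output, hence the headers_sent() guard.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}
```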