PHP outputting wrong foreign characters

I have a page that outputs hotel names from the database. Some of them have an ó, others have an á, and there are a few more like that.

I tried to use str_replace, as you will see in the code below, but that didn’t work: it doesn’t recognise the character to replace, and I don’t know what to put there for it to match. I understand I can put UTF8 into this select statement, but I’m not sure how to work it in.

$in  = $_GET['txt'];
$sid = $_GET['sid'];

// Default to an empty string so $radioB is always defined below.
$radioB = isset($_GET['radio']) ? $_GET['radio'] : "";

$msg = "";
if (strlen($in) > 0 and strlen($in) < 20) {
    if ($radioB == "resort") {
        $sql = "select Nom_Hot, IdType_Hot, Id_Hot, IdRsrt_Hot, Dir_Hot, IdCat_Hot, Act_Hot from tbl_hotels LEFT JOIN tbl_resorts ON (tbl_resorts.Id_Rsrt=tbl_hotels.IdRsrt_Hot) where tbl_resorts.Nom_Rsrt like '%$in%' AND Act_Hot=1";
    } elseif ($radioB == "hotel") {
        $sql = "select Nom_Hot, IdType_Hot, Id_Hot, IdRsrt_Hot, Dir_Hot, IdCat_Hot, Act_Hot from tbl_hotels where Nom_Hot like '%$in%' AND Act_Hot=1";
    }
    foreach ($dbo->query($sql) as $nt) {

        // Attempted fix: swap the accented characters for HTML entities.
        $nt['Nom_Hot'] = str_replace("ó", "&oacute;", $nt['Nom_Hot']);
        $nt['Nom_Hot'] = str_replace("á", "&aacute;", $nt['Nom_Hot']);

        // One star image per category level, except for categories 6 and 7.
        $msg2 = "";
        $int  = $nt['IdCat_Hot'];
        if (!in_array($nt['IdCat_Hot'], array(6, 7))) {
            for ($k = 0; $k < $int; $k++) {
                $msg2 .= "<img src='site_images/orange_Star_Transparent.png' width='11' height='10' style='vertical-align:1px;' alt='gold star' />";
            }
        }

        $msg .= "<div style='position:relative; width:100%; height:auto; clear:both; margin-bottom:3px; margin-left:5px;'><a href='hotel.php?hotel_ID=$nt[Id_Hot]&amp;Type=$nt[IdType_Hot]&amp;Resort=$nt[IdRsrt_Hot]' title='$nt[Nom_Hot]' class='result_Link' style='width:100%; clear:both;'><span style='text-decoration:underline; color:#E17411'>$nt[Nom_Hot]</span> <img src='images/address_Icon_2.png' height='17px' style='vertical-align:-3px;' /> $nt[Dir_Hot] &nbsp;$msg2</a></div>";
    }
}
echo $msg;
?>

I take it your HTML is set to UTF-8. Just ensure that when you save your PHP scripts and HTML templates, the file is saved using UTF-8 encoding. Some editors default to another encoding when saving; it’s easily overlooked.

OK, I can certainly go back and have a look at that, and yes, the page itself is set to UTF8. The problem is that there are thousands of hotels in the database, and going through each one to fix the characters isn’t really feasible, so I was hoping I could integrate UTF8 into the select statement to cover the hotels already in the database.

If it is, then it should have no problem displaying the accented characters. That is unless the file is saved with another encoding, in which case the browser displays some symbol.
So it’s a case of re-saving files.


That’s notepad, which you are probably not using, but just as an example.

I just checked, and the encoding is set exactly as you’re showing it.

Other outputs on the page are displaying correctly; it’s just this particular output that, for some reason, isn’t doing what it should.

I had a similar problem. For some reason str_replace didn’t want to do it, and I ended up with this, which worked.

$row = preg_replace('/\p{Zs}/u', ' ', $row);
$row = str_ireplace(array('<p>','</p>','‘','&nbsp;','<br>','<br />','<Br>','<div>','</div>',"’",'é','£','&pound;','&rsquo;','&lsquo;'),array('','',"'",'','','','','','',"'",'&eacute;','GBP','GBP','',''),$row);
$row = preg_replace('/(\v|\s)+/', ' ', $row);

I had to add the first preg_replace to normalise whitespace, and it then worked. str_replace can accept arrays, which is what I’ve done above. You just need one array of what you are looking for and a second array of what you are changing it to.
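
To illustrate the array form of str_replace described above, here is a minimal sketch. The function name entityEncodeAccents and the sample hotel name are illustrative only. The byte escapes spell out ó and á in UTF-8, so the example does not depend on how the file is saved. Note that the same call finds nothing when the incoming string holds those letters as single Latin-1 bytes, which may be why the original str_replace had no effect.

```php
<?php
// Hypothetical helper: replace UTF-8 encoded ó and á with HTML entities.
function entityEncodeAccents(string $name): string
{
    $search  = array("\xC3\xB3", "\xC3\xA1"); // ó and á as UTF-8 byte sequences
    $replace = array('&oacute;', '&aacute;');
    return str_replace($search, $replace, $name);
}

$utf8Name   = "Barcel\xC3\xB3 B\xC3\xA1varo Beach"; // name encoded as UTF-8
$latin1Name = "Barcel\xF3 B\xE1varo Beach";         // same name as Latin-1 bytes

echo entityEncodeAccents($utf8Name), "\n";
// The Latin-1 version comes back unchanged because the bytes differ.
var_dump(entityEncodeAccents($latin1Name) === $latin1Name);
```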

Does that work for you?

Hi noppy,

I think I get you, so I tried this:

$nt['Nom_Hot'] = preg_replace('/\p{Zs}/u', ' ',$nt['Nom_Hot']);
$nt['Nom_Hot'] = str_ireplace(array('ó','á'),array('&oacute;','&aacute;'),$nt['Nom_Hot']);
$nt['Nom_Hot'] = preg_replace('/(\v|\s)+/', ' ', $nt['Nom_Hot']);

That resulted in the hotel name not showing at all. It’s mostly ó and á causing the problems; they show as a square instead. I can actually show you.

In the link below there is a yellow search box on the right. If you click the Hotel radio button and start typing Barcelo, you will see the hotels appear, but with squares instead of the correct characters.

And that’s with me reverting it back to my original code

The first hotel in that list should have the name - Barceló Bávaro Beach

Link

What happens if you use your single-line version mixed with my code?

$nt['Nom_Hot'] = preg_replace('/\p{Zs}/u', ' ',$nt['Nom_Hot']);
$nt['Nom_Hot'] = str_replace("á", "&aacute;", $nt['Nom_Hot']);
$nt['Nom_Hot'] = preg_replace('/(\v|\s)+/', ' ', $nt['Nom_Hot']);

Yes, it’s strange what happens when I use your code: as you start typing Barceló again, some of the hotel names appear, but the ones with these characters in them now don’t show up at all.
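
One possible explanation for the names vanishing, offered as a guess rather than a diagnosis: with the /u modifier, preg_replace returns NULL when the subject is not valid UTF-8, and assigning that NULL back wipes the name. The sketch below reproduces this with a hard-coded Latin-1 byte string standing in for the database value; the variable names are illustrative.

```php
<?php
$latin1Name = "Barcel\xF3 B\xE1varo Beach";         // ó and á as single Latin-1 bytes
$utf8Name   = "Barcel\xC3\xB3 B\xC3\xA1varo Beach"; // same name as UTF-8

// /u makes PCRE treat the subject as UTF-8; invalid input makes the call fail.
$fromLatin1 = preg_replace('/\p{Zs}/u', ' ', $latin1Name);
$latin1Err  = preg_last_error();

$fromUtf8 = preg_replace('/\p{Zs}/u', ' ', $utf8Name);

var_dump($fromLatin1);                        // NULL
var_dump($latin1Err === PREG_BAD_UTF8_ERROR); // bool(true)
var_dump($fromUtf8 === $utf8Name);            // bool(true)
```

If this is what is happening, it would suggest the bytes reaching PHP are not UTF-8, whatever the column collation says.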

Hmm, so I guess if someone is searching with the character we are forcing, whatever is in the database is not being matched, so nothing is returned.

I think there is a way of converting to a specific character set in the query, which might then allow it to match. Hopefully I am not barking up the wrong tree.

Are you sure you have the encoding right in the database?

Hi felgall,

This is the Nom_Hot field

Nom_Hot  varchar(200)  utf8_general_ci 

I guessed that was correct
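
The column collation only describes how the text is stored; the character set of the connection between PHP and MySQL is a separate setting, and a mismatch there is a common cause of this symptom. A minimal sketch, assuming $dbo is a PDO connection (the foreach over $dbo->query() suggests so, but the thread never says) and using placeholder host, database name, and credentials:

```php
<?php
// Build a MySQL DSN that asks for a given connection character set.
function buildDsn(string $host, string $dbName, string $charset = 'utf8'): string
{
    return "mysql:host={$host};dbname={$dbName};charset={$charset}";
}

// Hypothetical connection helper; host, database and credentials are placeholders.
function connectUtf8(string $user, string $password): PDO
{
    $dbo = new PDO(buildDsn('localhost', 'hotels_db'), $user, $password);
    $dbo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    return $dbo;
}

echo buildDsn('localhost', 'hotels_db'), "\n";
```

The charset part of the DSN is honoured from PHP 5.3.6; on older versions, running SET NAMES utf8 right after connecting serves the same purpose.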

I remember having trouble with some of our data with certain characters which were almost impossible to detect. It was a full stop but was the ‘wrong’ full stop. Took ages to identify and was pretty hard to do a find and replace.

We tested a few different settings and changed to utf8_bin and it seemed to work for half of our problems (we had problems with some other characters too).

It was a bit confusing as we’d set our pages to UTF8 and the database to utf8_general_ci but we were still getting the issues. Hence the above code to try and weed out the problems.

For us it was breaking the json_encode function, although it would output to the website without a problem.

It’s not ideal to have to fix it on the fly but for us it was the only way at the time. The above code seemed to do that job for us.
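
Fixing on the fly could also mean converting the bytes rather than replacing characters one by one. A rough sketch, assuming the mbstring extension is available and that any value that is not valid UTF-8 is ISO-8859-1 (Latin-1), which is a guess about the data; toUtf8 is a made-up name:

```php
<?php
// Hypothetical helper: return $value as UTF-8, assuming Latin-1 when it is not already.
function toUtf8(string $value): string
{
    // Already valid UTF-8: leave untouched so it is not double-encoded.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$name = toUtf8("Barcel\xF3 B\xE1varo Beach"); // Latin-1 input
echo bin2hex(substr($name, 6, 2)), "\n";      // c3b3, the UTF-8 bytes for ó
```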

It’s strange and very annoying, so I have reverted it to how I had it, as at least they can then see the name of the hotel, even though there’s a flipping square in the middle of some of them.

So for the time being it’s back to:

$nt['Nom_Hot'] = str_replace("ó", "&oacute;", $nt['Nom_Hot']);
$nt['Nom_Hot'] = str_replace("á", "&aacute;", $nt['Nom_Hot']);

How about trying http://dev.mysql.com/doc/refman/5.0/en/charset-convert.html

SELECT column1, CONVERT(column2 USING utf8)

Credit - http://stackoverflow.com/questions/16051369/convert-output-of-mysql-query-to-utf8
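
Applied to the hotel query from the first post, that suggestion might look like the sketch below. It assumes $dbo is a PDO connection and introduces a hypothetical helper, hotelSearchSql; the search term is bound as a parameter rather than interpolated into the string. Whether CONVERT helps depends on where the mismatch is, since MySQL still sends results in the connection's character set.

```php
<?php
// Hypothetical helper: the hotel search with Nom_Hot converted to utf8 in the select.
function hotelSearchSql(): string
{
    return "select CONVERT(Nom_Hot USING utf8) AS Nom_Hot, IdType_Hot, Id_Hot, "
         . "IdRsrt_Hot, Dir_Hot, IdCat_Hot, Act_Hot "
         . "from tbl_hotels where Nom_Hot like :term AND Act_Hot=1";
}

// Usage with an existing PDO connection in $dbo (not run here):
// $stmt = $dbo->prepare(hotelSearchSql());
// $stmt->execute(array(':term' => '%' . $in . '%'));
// foreach ($stmt as $nt) { echo $nt['Nom_Hot']; }

echo hotelSearchSql(), "\n";
```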

Hi Noppy,

OK, will give that a go. It’s mad that something like this isn’t getting fixed and seems strangely unique, when it looks like it should be a simple fix or a simple error somewhere.

This link may be relevant; I used PHP transliterate(…)

http://www.johns-jokes.com/downloads/sp-e/jb-ajax-search/cities/

I am using a tablet at the moment and unable to test.
