I am not sure if this is a database question, but certain characters, like apostrophes, are getting replaced with question marks. If it is a database question, what should I change in the database settings?
getting replaced?
exactly where are they getting replaced?
Looks as though there is a conflict between the source charset and the browser charset. Usually the source is a Word document and the browser is UTF-8.
Google for php utf-8 convert and let us know if you find a solution.
http://stackoverflow.com/questions/7979567/php-convert-any-string-to-utf-8-without-knowing-the-original-character-set-or
On a tablet at the moment…
correct
it’s forté, not fortay
perhaps you did that on purpose? like, the way i always write “vwalah”
This is a page where it happens!
It should write ingrediënten but gives me ingredi�nten
+1 for spotting my deliberate mistake
I was hoping a member would return with a solution because I have similar problems on a short story site. One story was written with "ISO-8859-1", another with "WINDOWS-1251". The following script is used and works well, but only for a single charset. I have searched in vain for a dynamic charset-detection solution:
$from = 'ISO-8859-15'; // charset the source text is stored in
$to = 'UTF-8'; // charset the page is served in
$page = 'chapter-088'; // the text to convert
$result = iconv($from, $to, $page);
echo nl2br($result);
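Reliable automatic detection between single-byte charsets such as ISO-8859-1 and WINDOWS-1251 is not really possible, because every byte sequence is valid in both. A minimal sketch of a workable compromise, assuming each story's charset can be recorded somewhere (the `to_utf8` helper and its `$fallbackCharset` parameter are hypothetical names, not from the script above): leave text alone if it is already valid UTF-8, and otherwise convert from the charset declared for that story.

```php
<?php
// Hypothetical helper: returns $text as UTF-8.
// Assumes $fallbackCharset is the charset the story was saved in.
function to_utf8(string $text, string $fallbackCharset): string
{
    // Already valid UTF-8 (this includes plain ASCII): nothing to do.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // Otherwise convert from the charset declared for this text.
    $converted = iconv($fallbackCharset, 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $fallbackCharset");
    }
    return $converted;
}

// "ingredi\xEBnten" is "ingrediënten" stored as ISO-8859-1 bytes.
echo nl2br(to_utf8("ingredi\xEBnten", 'ISO-8859-1'));
```

The UTF-8 check can be fooled by a legacy-encoded string that happens to be valid UTF-8, but that is rare for real text.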
@donbue, the site is static and the single occurrence would be far easier to hard-code rather than find a dynamic solution.
I assume the database table collation is set to utf8_general_ci to match the HTML <meta charset="utf-8">?
I also notice that the English version is without the ?.
Nice site and making me feel hungry
Question marks and “weird characters” are almost always a character encoding issue.
When the font used doesn’t have a glyph that corresponds to the character, the browser shows something else, e.g. a ?
If you look at a problem field’s value - from outside of a browser - what do you see?
Hi John. Yes, the database table collation is set to utf8_general_ci to match the HTML <meta charset="utf-8">! Indeed, the English version is not showing the ? because it has no e-apostrophe (é) or e-umlaut (ë). I tried to change the table collation to Latin1_german but without success.
Hi Mittineague. Not sure what you mean with this?
I mean instead of looking at the output of your code, use something like phpMyAdmin or MySQL Workbench to look directly at the field’s content before it gets passed through the code and rendered by the browser.
If that content looks fine, then the next place to look is at the code that uses it.
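Looking at the raw bytes can make this check less ambiguous, since a tool like phpMyAdmin also decodes the value before displaying it. A small sketch, assuming the field value has already been fetched into a PHP string (`hex_bytes` is a hypothetical helper name):

```php
<?php
// Hypothetical helper: shows a string as space-separated hex bytes.
function hex_bytes(string $value): string
{
    return implode(' ', str_split(bin2hex($value), 2));
}

// A correctly stored UTF-8 "ë" is the two bytes c3 ab.
echo hex_bytes("ingredi\u{00EB}nten"), "\n";
// prints "69 6e 67 72 65 64 69 c3 ab 6e 74 65 6e"
```

If the output shows a single `eb` where the ë should be, the string is in a Latin-1 style charset rather than UTF-8.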
In the database it looks okay, as it should be
And the character encoding is the same everywhere: your text editor, your web page, code and database?
Yes it is
OK, it must be something with your code.
I checked the page in IE, Opera, Chrome, Safari and Firefox and they all had the problem.
view source shows
<h1>Fan's Menukaart</h1>
<p>De menukaart van Fan's Kitchen is een onvergetelijke ervaring op zich.
Wij serveren een waaier aan exquise lekkernijen en heerlijke menu's waarbij
alleen de meest verse en beste kwaliteit ingredi�nten worden gebruikt welke
op vakkundig wijze worden bereid door chef Fan.
Er is ook een scala aan Chef's specialiteiten voor de meer avontuurlijke genieters.
</p>
But if I change the character encoding to ANSI I see
ingredi�nten
I’ll move this thread to PHP for now.
What character is that supposed to be?
Thank you for moving it to PHP.
It is an e-umlaut (ë), but the same happens to the e-apostrophe (é).
For some reason the e-umlaut which should be decimal 195 171 is decimal 194 191 and decimal 194 189
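The byte values can be reproduced in a few lines. The sketch below first confirms that ë is 195 171 in UTF-8. It then shows one possible explanation for the other values, which is an assumption and not something confirmed in this thread: if the Unicode replacement character (U+FFFD) is itself treated as Latin-1 and converted to UTF-8 again, the result contains exactly 194 191 and 194 189. `byte_values` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: list the decimal value of every byte in a string.
function byte_values(string $s): array
{
    return array_values(unpack('C*', $s));
}

// e-umlaut encoded as UTF-8.
print_r(byte_values("\u{00EB}"));   // 195, 171

// Assumption: the replacement character U+FFFD (bytes 239 191 189)
// was read as ISO-8859-1 and re-encoded as UTF-8 somewhere upstream.
$doubleEncoded = mb_convert_encoding("\u{FFFD}", 'UTF-8', 'ISO-8859-1');
print_r(byte_values($doubleEncoded)); // 195, 175, 194, 191, 194, 189
```

If that is what happened, the original ë was already lost before the page was generated, so the fix would be upstream of the browser.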
http://www.utf8-chartable.de/unicode-utf8-table.pl?utf8=dec
I’m surprised other characters in that range are showing OK
@ScallioXTX and @Stomme_poes, you must have worked with Dutch text before. Got any suggestions other than sticking to English?
From the browser end, everything’s good: meta tag and server headers sending out UTF-8.
This means the mismatch happened before any browser requests, and the browser is not a part of this equation.
It means either the database is messing it up or, more likely, the original source of the text (before it goes into the db) is messed up. For example, if donboe or his clients are copying from another source that’s not in UTF-8 (say, a Word document), then the only hope would be for the DB to
- recognise the charset of the input (if it’s a Windows program it very well might NOT be the expected Latin1/iso-8859, but some Windows-specific one like 1250)
- run a script to change it to UTF-8
Often these scripts miss stuff, but the characters involved here are pretty basic, and still within Latin-1, so I’d totally feel ok using a script for these characters.
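To illustrate why the source charset matters when running such a script, here is a small sketch. It assumes the pasted text came from a Windows program using Windows-1252 (an assumption for this example; the actual code page may differ). The byte 0x92 is a curly apostrophe in Windows-1252 but an invisible control character in ISO-8859-1, so guessing the wrong charset quietly produces the wrong character. The variable names are made up for illustration.

```php
<?php
// Byte 0x92 as it might arrive from text pasted out of a Windows program.
$pasted = "Fan\x92s";

// Treating the input as Windows-1252 gives a real curly apostrophe (U+2019).
$asCp1252 = iconv('WINDOWS-1252', 'UTF-8', $pasted);

// Treating the same input as ISO-8859-1 gives control character U+0092.
$asLatin1 = iconv('ISO-8859-1', 'UTF-8', $pasted);

echo bin2hex($asCp1252), "\n"; // 46616ee2809973
echo bin2hex($asLatin1), "\n"; // 46616ec29273
```

Plain accented letters such as ë and é sit at the same positions in both charsets, which is why they usually survive either guess.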
It would be better if the copied source is first converted to UTF-8 before pasting into the DB. Even e-mail programs can be suspect. Get that text in the right charset before sending to the db.
If the users are directly typing these characters into a db admin panel using the International Keyboard setting (for example, typing the lowercase e and then an apostrophe/quote, which it sounded like donboe might be doing?), then I’m not sure… I would think this would still be a setting in the db, not to convert stuff but to simply start out with the correct (utf-8) encoding for direct input. I don’t use International Keyboard because it makes writing code a huge pain in the butt (typing quotes twice to make them appear once in code… bleh) so mine is set to US keyboard (which, in the Netherlands, is pretty common… but in Belgium probably less common, and donboe may have Belgian clients?).
BAD SPELLERS OF THE WORLD, UNTIE!
edit:
This is a good point, but it isn’t the case this time: this site uses Verdana, and it and the other usual web fonts deal with the Latin-1 characters well. It is something for others to keep in mind if using an @font-face webfont, though (most of those also have full or near-full Latin-1, but some older ones may, for example, miss new currency characters and stuff).
Hi Stomme Poes. Thank you for the reply. I was indeed thinking that it must have been something with copy/paste. I always use Notepad when I need to copy and paste, but I know from experience that a lot of people use Word. To rule that out, I even typed the e-umlaut and e-apostrophe straight into phpMyAdmin, but without any change!
P.S. I have some Belgian clients, but this is for a Chinese restaurant in the Netherlands.
https://www.bluebox.net/insight/blog-article/getting-out-of-mysql-character-set-hell
looks like there can easily be problems within the mysql db… skip to “MySQL and Character Sets”
Oh-- and if this weren’t enough, MySQL can have different character encoding settings for the server, client, table, field, query results, replication, etc. What’s more, MySQL will transparently convert characters for you between character sets depending on how they’re used and accessed. (This can make troubleshooting these character set problems “interesting” to say the least.)
Man I’m glad we use postgres.
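Of the settings listed in that quote, the client/connection charset is one that is easy to overlook from PHP. Nothing in this thread confirms it is the cause here, so treat the following as a sketch under that assumption. It builds a PDO MySQL DSN that states the connection charset explicitly. The helper names, host and database name are hypothetical, and `utf8mb4` is an assumed choice (pass `utf8` to match a utf8_general_ci table exactly).

```php
<?php
// Hypothetical helper: build a PDO MySQL DSN with an explicit connection charset.
function utf8_dsn(string $host, string $dbName, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbName, $charset);
}

// Hypothetical helper: open the connection (needs a reachable MySQL server).
function connect_utf8(string $host, string $dbName, string $user, string $password): PDO
{
    return new PDO(utf8_dsn($host, $dbName), $user, $password, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

echo utf8_dsn('localhost', 'restaurant'), "\n";
// prints "mysql:host=localhost;dbname=restaurant;charset=utf8mb4"
```

With mysqli, the equivalent step would be calling `set_charset()` on the connection right after connecting.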