Thread: Help please. Darn foreign chars.
-
Dec 21, 2005, 18:09 #1
Hi,
I have a string with foreign chars in it. Looks like:
------
Ask many of his listeners what they think about his sermons and they’ll quickly respond with only words of acclamation. Follow that questions with a request for what the sermon was about and you’re met with only blank stares.
------
I have tried everything under the sun to remove the foreign chars, but no luck. The latest thing I tried was this function--and it does nothing.
PHP Code:
function unaccent($text) {
    static $search, $replace;
    if (!$search) {
        $search = $replace = array();
        // Get the HTML entities table into an array
        $trans = get_html_translation_table(HTML_ENTITIES);
        // Go through the entity mappings one-by-one
        foreach ($trans as $literal => $entity) {
            // Make sure we don't process any other characters such as fractions, quotes etc:
            if (ord($literal) >= 192) {
                // Get the accented form of the letter
                $search[] = $literal;
                // Get e.g. 'E' from the string '&Eacute;'
                $replace[] = "";//$entity[1];
            }
        }
    }
    return str_replace($search, $replace, $text);
}
Any help appreciated.
Thanks
-
Dec 21, 2005, 18:21 #2
Run echo ord("the character") on each character you want to replace to get its number.
Then, for each character in the string, check whether ord('the character') equals that number.
If it does, replace it with a predefined replacement.
It's a bit of a hack; I'm sure there's a better way.
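A minimal sketch of the ord() idea described above, not a definitive fix. It assumes the text is in a single-byte encoding such as Windows-1252, where the curly apostrophe is byte 146 (below the 192 cutoff used in unaccent() earlier in the thread). The helper name replace_by_ord() and the $map table are made up for illustration. If the text is UTF-8 instead, the same character is three bytes (226, 128, 153) and a byte-by-byte map like this one would not be enough.

```php
<?php
// Hypothetical helper: replace single bytes according to their ord() value.
// Assumes single-byte (Windows-1252) input; ord() only ever sees one byte.
function replace_by_ord($text, array $map) {
    $out = '';
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        $code = ord($text[$i]);
        // Use the predefined replacement if there is one, else keep the byte.
        $out .= isset($map[$code]) ? $map[$code] : $text[$i];
    }
    return $out;
}

// Byte values of Windows-1252 "smart" quotes (assumed encoding).
$map = array(
    145 => "'", // left single quote
    146 => "'", // right single quote / apostrophe
    147 => '"', // left double quote
    148 => '"', // right double quote
);

echo replace_by_ord("they\x92ll quickly respond", $map);
// they'll quickly respond
```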
-
Dec 21, 2005, 18:28 #3
What number are you referring to?
The string is about 4000 chars long. I tried looping over it character by character and keeping only a-z, 0-9, and a few punctuation marks, but that failed.
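For reference, the "number" is the byte value that ord() returns. A rough sketch of how one might list those values for every non-ASCII byte in a string follows. The helper name list_high_bytes() is made up, and the sample string assumes the curly apostrophe is stored as UTF-8.

```php
<?php
// Hypothetical helper: return offset => ord() value for every byte above 127.
function list_high_bytes($text) {
    $found = array();
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        $code = ord($text[$i]);
        if ($code > 127) {
            $found[$i] = $code;
        }
    }
    return $found;
}

// "\xE2\x80\x99" is the UTF-8 encoding of the curly apostrophe (assumed encoding).
print_r(list_high_bytes("you\xE2\x80\x99re"));
```

Three values for one visible character would suggest UTF-8; a single value such as 146 would suggest a single-byte encoding like Windows-1252.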
-
Dec 21, 2005, 19:03 #4
here's something to try:
PHP Code:
$str = '~`!@#$%^&*()_+}|{":\?><,./\';\][=-and they’ll quickly respond with only words of acclamation. Follow that questions with a request for what the sermon was about and you’re';
$allow_chars = '~`!@#$%^&*\(\)_\-\+=|\}\{\[\]"\'\:;\?\/><\,\.\\\\';
$new_str = preg_replace("/[^\w\s" . $allow_chars . "]+/", "", $str);
echo $new_str;
Of course, if you're not storing this text in a DB, then ignore the last paragraph
-
Dec 21, 2005, 19:15 #5
Yeah, I am storing it in MySQL.
So you are saying that I should run a check on my post form and convert first? I was not aware it mattered either way.
In your example, would
[=-and they blah blah
be the same as
[=-$string ?
-
Dec 21, 2005, 19:30 #6
Originally Posted by MarketJunction
I just put all the special characters in there to illustrate that they would be left alone while the non-ASCII characters would be deleted. Just add or remove characters in the $allow_chars variable depending on what you do or don't want to accept. Note that all Perl regular expression special characters are escaped with a backslash.
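As an alternative to escaping each character by hand, PHP's preg_quote() can add the backslashes. This is only a sketch of that variation, not the code from the post above; the variable names $allowed and $class and the sample string are made up, and the sample assumes UTF-8 input and the default "C" locale.

```php
<?php
// Punctuation to keep, written literally; preg_quote() adds the backslashes.
$allowed = '~`!@#$%^&*()_-+=|}{[]"\':;?/><,.\\';

// The second argument makes preg_quote() escape the "/" delimiter as well.
$class = preg_quote($allowed, '/');

// Sample text with a UTF-8 curly apostrophe (assumed encoding).
$str = "you\xE2\x80\x99re met with [blank] stares.";

// Delete every run of bytes that is not a word character, whitespace,
// or one of the allowed punctuation characters.
$clean = preg_replace('/[^\w\s' . $class . ']+/', '', $str);
echo $clean; // youre met with [blank] stares.
```

Note that the apostrophe is deleted rather than converted, so "you're" comes out as "youre".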
-
Dec 21, 2005, 19:40 #7
OK, thanks. I think I will try altering the text before it is stored and scrap everything else I have. Too many hours wasted.
I was able to strip out the Word stuff, but the foreign chars won't budge.
Oh well.
-
Dec 22, 2005, 06:10 #8
@seamonkey
Ditto - on the lost sanity and time. Came to the same conclusion as you.
There's a nice function, strictify(), in the user notes of the manual at www.php.net/chr
It deals with most of the bad Word chars like curly quotes, but there are lots more really...
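The strictify() function itself is not reproduced in this thread, so the following is not that function. It is a minimal sketch of the same general idea (mapping Word's typographic characters to plain ASCII), assuming the text is UTF-8. The helper name ascii_punctuation() and the choice of replacements are made up, and the map is far from complete.

```php
<?php
// Hypothetical helper; NOT the strictify() function from the manual notes.
// Assumes UTF-8 input; extend the map as needed.
function ascii_punctuation($text) {
    $map = array(
        "\xE2\x80\x98" => "'",   // U+2018 left single quote
        "\xE2\x80\x99" => "'",   // U+2019 right single quote
        "\xE2\x80\x9C" => '"',   // U+201C left double quote
        "\xE2\x80\x9D" => '"',   // U+201D right double quote
        "\xE2\x80\x93" => '-',   // U+2013 en dash
        "\xE2\x80\x94" => '--',  // U+2014 em dash
        "\xE2\x80\xA6" => '...', // U+2026 ellipsis
    );
    // strtr() with an array replaces each key with its value in one pass.
    return strtr($text, $map);
}

echo ascii_punctuation("they\xE2\x80\x99ll quickly respond");
// they'll quickly respond
```

Converting first and stripping afterwards keeps the apostrophes that a plain delete would lose.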