How to remove � from my strings?


A string I’m trying to import into my database is coming in with some �� characters attached to it, and I’m having trouble getting them removed.

The data being imported is a person’s name, and looks like this when echoed: ‘Smith, Bob John��’

I want the �� removed. Or, if it’s easier to go the other way and identify what I need to keep: I need all the alphanumeric characters, whitespace, and commas to be kept… everything else can be discarded.


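$subject = "a B5,%";
// Match the first run of letters, digits, commas and spaces (case-insensitive).
$pattern = "/[a-z0-9, ]+/i";
preg_match($pattern, $subject, $matches);
// $matches[0] now holds "a B5," (the % is left out)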


Thanks! Exactly what I needed. How would you modify that expression if you needed to include & (ampersands), - (dashes), . (periods), and / (forward slashes)?

I tried the following, but it doesn’t appear to have worked with the dashes:

$pattern = "/[a-z0-9, &.-\/]+/i";

Never mind… I just had to escape the dash and it worked. Thanks again!
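For anyone reading later, here is a rough sketch of what the escaped version might look like. In the failing pattern, `.-\/` is read as a range from `.` to `/`, so the dash itself is never matched. The sample `$subject` below is made up for illustration.

```php
<?php
// Hypothetical test string, not from the original data.
$subject = "Smith & Sons - A.B./C #%";

// The dash is escaped so it is a literal dash, not a range operator.
// Putting it last in the class, e.g. [a-z0-9, &.\/-], should work as well.
$pattern = '/[a-z0-9, &.\-\/]+/i';

preg_match($pattern, $subject, $matches);

// The match stops at the first character that is not in the class (#).
echo $matches[0];
```

Note that the matched text keeps the space just before the `#`, because spaces are in the allowed set.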

If you know what you want to allow in, it is better to strip out everything NOT matching that.

Example, with the {6,12} quantifier (which means min 6 chars, max 12 chars) and the trailing $ left out so that every unwanted character is removed:

// Remove everything except numbers, letters, dash and space.
$input = '0123 Big Street-South bc < < ?.,/#';
$output = preg_replace('#[^0-9a-z -]#i', '', $input);
echo $output . '<hr>';
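To show why replacing is usually safer than matching, here is a small sketch using a name like the one in the question. The stray bytes (`\xA0`) and their positions are an assumption made up for the example; the real data may contain something else.

```php
<?php
// Hypothetical input: one stray byte in the middle and two at the end.
$name = "Smith, Bob\xA0 John\xA0\xA0";

// preg_match only returns the first run of allowed characters,
// so everything after the first stray byte is lost.
preg_match('/[a-z0-9, ]+/i', $name, $matches);
$matched = $matches[0];

// preg_replace with a negated class removes every disallowed byte
// and keeps all the rest, wherever the stray bytes are.
$kept = preg_replace('/[^a-z0-9, ]/i', '', $name);

echo $matched, "\n"; // Smith, Bob
echo $kept, "\n";    // Smith, Bob John
```

If the unwanted characters only ever appear at the end, both approaches give the same result; the difference only shows up when they appear mid-string.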

Bear in mind that names from many cultures can contain characters other than a-z, e.g. O’Reilly.
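One possible way to allow for such names is a Unicode-aware whitelist. This is only a sketch: it assumes the input is valid UTF-8, and the sample name and the exact set of allowed punctuation are my own choices, not something from the original data.

```php
<?php
// Hypothetical input; assumed to be valid UTF-8.
$name = "O’Reilly, Seán Ó Briain#%";

// \p{L} matches any letter and \p{N} any number in any script.
// The u modifier makes PCRE treat the pattern and the subject as UTF-8.
// Also allowed: comma, space, straight and curly apostrophes, and dash.
$clean = preg_replace('/[^\p{L}\p{N}, \'’-]/u', '', $name);

echo $clean; // O’Reilly, Seán Ó Briain
```

With the u modifier, preg_replace returns null if the subject is not valid UTF-8, so it is worth checking the result before saving it to the database.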

Regarding your original request, that looks like a character encoding problem which might jump up and bite you elsewhere.

Where is the original data coming from, a text file by any chance?
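If it does turn out to be an encoding mismatch, something along these lines might help. It is a sketch only: `to_utf8` is a hypothetical helper name, it needs the mbstring extension, and it assumes that anything which is not valid UTF-8 is ISO-8859-1, which may not be true for your source.

```php
<?php
// Hypothetical helper: return $value as UTF-8.
// Assumption: input that is not valid UTF-8 is ISO-8859-1.
function to_utf8(string $value): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8, leave it alone
    }
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$raw = "Jos\xE9 Smith";          // "José" as a single ISO-8859-1 byte
$utf8 = to_utf8($raw);           // now "José" as UTF-8 (two bytes for é)
$same = to_utf8("Bob Smith");    // plain ASCII passes through unchanged

echo bin2hex($utf8), "\n";
```

Converting at import time means the database only ever sees one encoding, so the � should not come back when the data is displayed later.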

Values like that show up in the output when some extra space is left in at the time of coding, so you can look for the ASCII code of the space character and remove it when found.
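A quick way to check what the extra characters really are is to dump the raw bytes before removing anything. The sketch below assumes the trailing bytes are 0xA0 (a non-breaking space in ISO-8859-1); that is a guess for illustration, so look at the hex output for your own data first.

```php
<?php
// Hypothetical input: the name followed by two 0xA0 bytes.
$name = "Smith, Bob John\xA0\xA0";

// Show the last four bytes as hex to see what is actually there.
$tail = bin2hex(substr($name, -4));
echo $tail, "\n"; // 686ea0a0 -> "h", "n", then two 0xA0 bytes

// Strip trailing ASCII whitespace and 0xA0 bytes.
// rtrim() works byte by byte, so use this only on single-byte data;
// on UTF-8 strings it could cut a multi-byte character in half.
$clean = rtrim($name, " \t\r\n\xA0");
echo $clean, "\n"; // Smith, Bob John
```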