PHP Code:
// Remove everything except digits, letters, spaces and the characters . , /
$input = '0123 Big Street bc < < ?.,/#';
$output = preg_replace('#[^0-9a-z .,/]#i', '', $input);
echo($output);
Yeah, that's strange: I had to add the slashes in the regex to post it, but now it works?
Anyhow ... I posted that because it lets the developer strip out whatever is not allowed, rather than making a boolean allowed/disallowed decision and rejecting the whole input.
It all depends how kind you want to be to your users.
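To make the difference concrete, here is a rough sketch of both approaches side by side, using the same character class as the snippet above. The function names are made up for illustration, not from any library.

```php
<?php
// Hypothetical helper names; the character class is the one from the post above.

// Strict approach: a yes/no decision on the whole value.
function is_valid_address(string $input): bool
{
    // \z anchors at the very end of the string (unlike $, which also
    // matches just before a trailing newline).
    return preg_match('#^[0-9a-z .,/]*\z#i', $input) === 1;
}

// Lenient approach: drop every character that is not allowed.
function clean_address(string $input): string
{
    return preg_replace('#[^0-9a-z .,/]#i', '', $input) ?? '';
}

$input = '0123 Big Street bc < < ?.,/#';

var_dump(is_valid_address($input));   // bool(false): the user has to retype it
echo clean_address($input), "\n";     // the user's input is kept, minus < ? #
```

Note that the lenient version leaves the spaces that sat between the removed characters, so you may want to collapse runs of whitespace afterwards.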
I can also see value in this method, which I read about today in a blog post on making clean URL slugs:
"One can quickly see that a lot of characters will disappear from the URL. Imagine some accented characters (è, ë etc.) in the page title. These will be removed by the first line of the function since they are not exactly url-friendly. What we actually want is to convert these characters to their base-character, meaning that è would become a regular e etc."
PHP Code:
if (function_exists('iconv')) {
    $string = @iconv('UTF-8', 'ASCII//TRANSLIT', $string);
}
That'd be even kinder, y'know, if someone lives at "24 Rue de Cafè".
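Putting the two snippets together might look something like the sketch below. The function name is hypothetical. One caveat I'm fairly sure of: what //TRANSLIT actually produces depends on the iconv implementation and the current locale, so on some setups è may come out as "e" and on others as "?" or not at all. I have deliberately not written an expected output for that reason.

```php
<?php
// Hypothetical helper: transliterate first, then strip what is still not allowed.
function clean_address_translit(string $input): string
{
    if (function_exists('iconv')) {
        // Output of //TRANSLIT varies by platform and locale.
        // iconv() returns false on failure, so keep the original in that case.
        $converted = @iconv('UTF-8', 'ASCII//TRANSLIT', $input);
        if ($converted !== false) {
            $input = $converted;
        }
    }

    // Same whitelist as the first snippet in this thread; anything the
    // transliteration left behind (e.g. a "?") is removed here.
    return preg_replace('#[^0-9a-z .,/]#i', '', $input) ?? '';
}

echo clean_address_translit('24 Rue de Cafè'), "\n";
```

Doing the transliteration before the preg_replace matters: the other way round, the accented character is already gone by the time iconv sees the string.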
I must give it a go.