From there, my dirty word function would know how to handle “MUCK”.
However, if someone types this…
I really like lemon-lime soda with my hamburger
…then it collapses “lemon-lime” to “lemonlime”, which is not what I want!
I would like to tell PHP or Regex the following:
If you see a word with more than one hyphen in it, then do as I did above. But if you find just one hyphen, then assume the word is okay.
Of course, that wouldn’t help with “sweet-n-sour” or “Rock-N-Roll”, but one step at a time!!
The PHP function you’re looking for is preg_replace (http://www.php.net/manual/en/function.preg-replace.php). Set the pattern to ‘-([a-zA-Z]+)-’, which searches for a hyphen, followed by one or more letters (you could use \w+ rather than my [a-zA-Z]+), then another hyphen. Don’t forget to use $1 in the replacement for the atom (the letters in the middle) captured by this regex. Because M-U-C-K may be typed as M - U - C - K, you may want to allow optional space characters (\s?) around the hyphens.
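Here is a rough sketch of the "more than one hyphen" rule, in case it helps. It is not the exact pattern described above: because regex matches cannot overlap, a single preg_replace pass with a hyphen-letters-hyphen pattern would turn "M-U-C-K" into "MUC-K". This version instead matches the whole word with preg_replace_callback and strips the hyphens inside the callback. The function name collapse_hyphenated is made up for illustration.

```php
<?php
// Hypothetical helper (name invented for this example): collapse a word only
// when it contains two or more hyphens, e.g. "M-U-C-K" or "M - U - C - K".
function collapse_hyphenated(string $text): string
{
    // A word followed by at least two "-word" groups, with an optional
    // space on either side of each hyphen. One hyphen is not enough to match.
    $pattern = '/\b\w+(?:\s?-\s?\w+){2,}/';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // Remove every hyphen (and any space next to it) from the match.
            return preg_replace('/\s?-\s?/', '', $m[0]);
        },
        $text
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

echo collapse_hyphenated('What a load of M-U-C-K'), "\n";
echo collapse_hyphenated('I really like lemon-lime soda with my hamburger'), "\n";
```

The first line should come out with "MUCK" for the dirty word function to catch, and the lemon-lime sentence should pass through untouched. As noted in the question, "sweet-n-sour" would still be collapsed by this rule.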
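As for your sweet-n-sour, the pattern there is a single character in the middle but multiple characters on either side of the -n-, so you could use ‘(\w\w+)\s?-\s?(\w+)\s?-\s?(\w\w+)’ to cover this case, too. (The middle group as written, \w+, accepts one or more characters; use \w there if it must be exactly one.)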
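One way the sweet-n-sour pattern could be put to work, sketched under a few assumptions: find every word with two or more hyphens, leave it alone if it looks like a phrase (two or more characters on each outer side), and collapse it otherwise. The function name collapse_spelled_out and the ^ and $ anchors added to the pattern are my own additions for the example.

```php
<?php
// Hypothetical helper (name invented for this example): collapse multi-hyphen
// words such as "M-U-C-K", but skip ones shaped like "sweet-n-sour".
function collapse_spelled_out(string $text): string
{
    // Candidate: any word with two or more hyphens (optional spaces around them).
    $candidate = '/\b\w+(?:\s?-\s?\w+){2,}/';

    // The sweet-n-sour pattern, anchored with ^ and $ (an assumption) so that
    // it has to cover the whole candidate rather than part of it.
    $phrase = '/^(\w\w+)\s?-\s?(\w+)\s?-\s?(\w\w+)$/';

    $result = preg_replace_callback(
        $candidate,
        function (array $m) use ($phrase): string {
            if (preg_match($phrase, $m[0]) === 1) {
                return $m[0]; // looks like a real phrase, keep the hyphens
            }
            // Otherwise strip the hyphens and any spaces next to them.
            return preg_replace('/\s?-\s?/', '', $m[0]);
        },
        $text
    );

    // Fall back to the original text if the regex engine reports an error.
    return $result ?? $text;
}

echo collapse_spelled_out('M-U-C-K and Rock-N-Roll'), "\n";
```

This is only a heuristic: a dirty word split into chunks of two or more letters would slip through, so it may be worth testing against whatever evasions you actually see.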
Recommendation: Friedl’s Mastering Regular Expressions is the master treatise on regular expressions and well worth the cost (O’Reilly).