PHP code to replace bad words in a document with *** and maintain line breaks

Hi all,

What is the ideal PHP code for replacing banned words in a document with *** in place of the middle characters, while keeping every line break intact on display?

Keeping in mind that:
1- The input can have many lines with various line breaks and will come from a TextArea
2- The changes are to be made to the text before it is saved to a MySQL table
3- So that when the text is retrieved from MySQL it already has the changed words
4- Banned words are to change like: "S!!!y" to "S***y" and "F!!!P" to "F***P", so 3 stars replace the middle characters of a banned word while the 1st and last characters keep their case
5- Banned words are in an array

FYI, we have our own code for this, but it does not preserve line breaks correctly in every case, so I am looking for ideas from the PHP GODs here :slight_smile:

Thanks,

Post the code you have, someone will offer an idea of how it could be improved. Off the top of my head, I’d suggest stripos() to find the dodgy words and a quick function to replace the centres with three asterisks, but that’s obviously not working for you. I don’t envy whoever has to maintain the list of new “bad” words.
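
As a rough illustration of that suggestion, here is a minimal sketch of a stripos() check paired with a small masking function. The name mask_middle() and the placeholder word "darn" are made up for this example, and the helper assumes single-byte (ASCII) words.

```php
<?php
// Hypothetical helper: keep the first and last characters and
// replace everything in between with three asterisks.
function mask_middle(string $word): string
{
    $length = strlen($word);
    // Words of one or two characters have no middle to hide.
    if ($length < 3) {
        return $word;
    }
    return $word[0] . '***' . $word[$length - 1];
}

// stripos() is case-insensitive and returns false when nothing is found,
// so the result must be compared with !== false (position 0 is a valid hit).
$text = "Well DARN it";
if (stripos($text, 'darn') !== false) {
    echo mask_middle('DARN'), "\n"; // prints D***N
}
```

Note that stripos() only reports where a match starts; it does not say whether the match is a whole word.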

Droop,

Here is the code we are using now. Of course simplified a bit for pasting here in public:

if (isset($_POST['submit'])) {

$message = $_POST['comments'];

$msg_list = explode(' ', $message);

$bad_words = array('f!!k', 's!!t', 'p!!!y');

for ($i=0; $i < count($msg_list); ++$i) {

	$word = $msg_list[$i];
	$chk_word = strtolower($msg_list[$i]);


	for ($j=0; $j < count($bad_words); ++$j) {

		$bad_word = $bad_words[$j];

		if (strpos($chk_word, $bad_word) !== false) {

				$f_char = substr($word, 0, 1);

				$length = strlen($word);
				$index = $length - 1;

				$l_char =  substr($word, $index, 1);

				$msg_list[$i] = $f_char . '***' . $l_char;
		} 

	}//Closes Inner For Loop

}//Closes Outer For Loop

$result = implode(' ', $msg_list);

}

Oh and when the message is to be printed out, then this:

echo nl2br($result);

I would think using str_ireplace, or maybe preg_replace if you need something more complex, would be more efficient for this. Both take arrays directly as inputs, which saves you from all the explode, implode and loops you are using, and they do the replacement for you within a single function call.
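
One way this could look is sketched below with preg_replace_callback(), a relative of preg_replace that lets a function build each replacement, so the first and last characters can be kept exactly as typed. The function name censor() and the placeholder words are assumptions for the example, and the sketch assumes ASCII list entries of at least three characters.

```php
<?php
// Hypothetical word list standing in for the real one.
$bad_words = ['darn', 'heck'];

function censor(string $text, array $bad_words): string
{
    if ($bad_words === []) {
        return $text; // nothing to filter
    }
    // preg_quote() escapes any regex metacharacters inside a list entry.
    $alternatives = implode('|', array_map(function ($w) {
        return preg_quote($w, '/');
    }, $bad_words));

    // \b limits matches to whole words; the i flag ignores case.
    $pattern = '/\b(?:' . $alternatives . ')\b/i';

    $result = preg_replace_callback($pattern, function ($m) {
        // $m[0] is the word as it appears in the text, original case included.
        return substr($m[0], 0, 1) . '***' . substr($m[0], -1);
    }, $text);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

$input = "What the Heck\nis this DARN thing\r\n\r\ndarn";
echo censor($input, $bad_words);
```

Because the text is never split and rejoined, every "\n" and "\r\n" outside the matched words passes through untouched, so nl2br() on output should see the same line breaks the user typed.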

Though these things can be more complex than you may first think, when you start to consider false positives and the methods people may use to circumvent your efforts.

^ Indeed. Script can only do so much. If someone is determined to get around censoring it will require manual attention to keep things to a standard. Considering the very real possibility of both false positives and false negatives (eg. a word that might be bad in one context might be perfectly fine in a different context) I wouldn’t hope for perfection and would be prepared for manual intervention.
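
To make the two failure modes concrete, here is a small sketch comparing a substring search with a whole-word regex. The helper names and the placeholder word "heck" are invented for the example.

```php
<?php
// Hypothetical helpers contrasting two ways of looking for a banned word.
function contains_substring(string $text, string $word): bool
{
    return stripos($text, $word) !== false;
}

function contains_whole_word(string $text, string $word): bool
{
    return preg_match('/\b' . preg_quote($word, '/') . '\b/i', $text) === 1;
}

// False positive: the substring search flags the innocent word "checker".
var_dump(contains_substring('spell checker', 'heck'));  // bool(true)
var_dump(contains_whole_word('spell checker', 'heck')); // bool(false)

// The whole-word test still catches the word itself, whatever its case.
var_dump(contains_whole_word('Oh HECK!', 'heck'));      // bool(true)

// False negative: a lightly disguised spelling slips past both checks.
var_dump(contains_substring('oh h.e.c.k', 'heck'));     // bool(false)
var_dump(contains_whole_word('oh h.e.c.k', 'heck'));    // bool(false)
```

Word boundaries reduce false positives, but no pattern settles the context question, which is where the manual attention comes in.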

Sam,

Yes, of course you are right about "saving you from all the explode, implode and loops you are using "

But what is a working replacement solution?
I mean I have no problem with, and would welcome, suggested code to replace what we are using, as long as it gets the job done as described. Again, the requirements are:

1- Banned words are in an array, or can be a string of comma-separated words

2- The text to be filtered can have many lines and many line breaks, so it is not a single line but a whole document

3- Replace the middle characters of any word matching a banned word with ***, keeping the 1st and last characters in place

4- After the text has been run through the banned-word filter, it must keep exactly the same line break formatting

So what code do you suggest would get this Job done?

Thanks,
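
For requirement 1, a ban list that may arrive either as an array or as a comma-separated string can be normalized into one shape before filtering. The sketch below is only an illustration, and normalize_ban_list() is a made-up name.

```php
<?php
// Hypothetical helper: accept the ban list as an array or as a
// comma-separated string and always hand back a clean array.
function normalize_ban_list($list): array
{
    if (is_string($list)) {
        $list = explode(',', $list);
    }
    // Trim stray spaces and lowercase each entry.
    $list = array_map(function ($w) {
        return strtolower(trim($w));
    }, $list);
    // Drop empty entries (a trailing comma produces one) and reindex from 0.
    return array_values(array_filter($list, function ($w) {
        return $w !== '';
    }));
}

print_r(normalize_ban_list("Darn, heck ,"));
```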

Well, since no one has a better solution or better code than what I posted for this rather normal ask in PHP message processing, we went ahead with the code we had created. FYI, it works fine in 99.99% of cases. It only misfires in those rare cases where the banned word is the last word of a line, which happens so rarely that it is fine.

I guess this makes me one of the Php GODs :slight_smile:
Since no one had better code for this task.

Take a look at the array that explode() returns. I can’t recall ever having read anything about it, but it’s there to see with a simple var_dump in a test file.
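
A minimal test file along those lines might look like the sketch below; the sample message and the placeholder word "darn" are assumptions. It shows that splitting on a single space leaves a newline inside a token, and that preg_split() with PREG_SPLIT_DELIM_CAPTURE keeps the whitespace as separate tokens instead.

```php
<?php
$message = "first line ends with darn\nsecond line";

// Splitting on a single space: the newline is not a space, so the last
// word of line one and the first word of line two stay glued together.
$tokens = explode(' ', $message);
var_dump($tokens[4] === "darn\nsecond"); // bool(true)

// Splitting on runs of whitespace and capturing them keeps every
// space and line break as a token of its own.
$parts = preg_split('/(\s+)/', $message, -1, PREG_SPLIT_DELIM_CAPTURE);
var_dump($parts[8] === 'darn'); // bool(true)
var_dump($parts[9] === "\n");   // bool(true)

// Joining with an empty string reproduces the input byte for byte.
var_dump(implode('', $parts) === $message); // bool(true)
```

With the code posted earlier in the thread, the glued token "darn\nsecond" contains the banned word, so the whole token is replaced by "d***d" and both the line break and the following word are lost. That matches the misfire described for a banned word at the end of a line.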

What I would consider the best and most flexible way to do this would be to delegate it to a known vendor API. I did a quick search and came up with this result.

These are the types of strings you would want to search for to come up with better solutions than just a regex match.

“detecting profanity in string using an api”
“detecting profanity in string using php library”

Are you saying that we send our text messages to a resource outside of our company (server) for filtering banned words? If yes, that is definitely something we cannot and will not do, due to various security concerns. Also keep in mind that one of our requirements is:

Replace the middle characters of any word matching a banned word with ***, keeping the 1st and last characters in place

Which this code does not seem to do at all.

So it seems like my code remains the only choice! Hence I am for sure one of the PHP Gods now :slight_smile:

There were two examples of searches you could do yourself to find a pre-made solution:-

The first one involves APIs which you have dismissed.
The second involves libraries which may suit you better.
If Googling things for yourself is beyond the effort you are prepared to make, there were some library solutions linked in the topic I linked to in post #5 (which was a big part of the reason I linked to it).
