Catching Bad Words

What is the best way to catch people posting “bad” words?

My client’s website allows registered members to post comments on content, but they want to keep things “family-oriented”.

Should I write a script which looks for bad words and replaces them with *****, or are there other solutions out there?

I think this is the only solution.
You can get a list of “bad words” and check each comment against it before adding the comment to the database.

What server-side language is in use?

Gotta be careful with things like this. For example (as we recently found out, here), a spam filter that we employ had (past tense) “Cialis” as one of the trigger words to divert email to a spam folder.

It took us MONTHS to realise that any email containing the word “specialist” would trigger the spam filter.

I work at a military installation - “specialist” appears in many emails.

Now, I know that these aren’t “bad” words, but I just threw that out there as an example of good intentions gone awry. :smile:

V/r,

:slight_smile:


PHP

Which is why it would seem to me that writing my own “Bad Word” function would be a better approach.

There isn’t one plug-in on my client’s website and for good reason!

Yet at the same time, why reinvent the wheel?

Maybe this question would be better asked in the PHP forum?

If I wrote my own function, I am a little unclear on how you would loop through and check each word in a long post against a database. (This is more complicated than just using a Regular Expression on a single word.)
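For what it's worth, the per-word loop could look roughly like the sketch below. It assumes the word list has already been loaded from the database into a plain array; `findListedWords()` is a made-up name, not an existing function. The post is split into tokens and each token is looked up in a flipped copy of the list, so "specialist" is never compared against "cialis" as a substring.

```php
<?php
// Hypothetical helper: returns the listed words that occur in $post as whole words.
function findListedWords(string $post, array $badWords): array
{
    // Flip the list so each lookup is a hash check instead of a scan.
    $lookup = array_flip(array_map('strtolower', $badWords));

    // Split on anything that is not a letter, digit, or apostrophe.
    // preg_split() returns false on invalid UTF-8, hence the fallback.
    $tokens = preg_split('/[^\p{L}\p{N}\']+/u', strtolower($post), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $found = [];
    foreach ($tokens as $token) {
        if (isset($lookup[$token])) {
            $found[$token] = true; // keyed by word so duplicates collapse
        }
    }
    return array_keys($found);
}

// Assumed sample list; in practice this would come from the database.
$list = ['cialis', 'ass'];
print_r(findListedWords('Ask the specialist about Cialis.', $list));
```

With that sample input only "cialis" is reported. Hyphenated text such as "bad-ass" is split into "bad" and "ass", so the second half is still caught.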

Perhaps… and I will be honest, I don’t know how they normally work. But it seems to me that you could use a RegEx on the whole input (instead of just one word at a time) and make sure that the “bad” words are properly delimited by whitespace or punctuation (i.e., from my example, " cialis ", but matching the spaces as \s (whitespace) and/or [[:punct:]] (punctuation)), so that safe words that merely contain a potential bad word can pass muster… like the word “association”.
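A rough sketch of that whole-input idea in PHP, under a few assumptions: PHP 7.4 or later (for the arrow function), a word list held in an array, and a made-up function name `matchListedWords()`. Instead of matching the delimiters themselves with \s or [[:punct:]], it uses lookarounds that only require that no letter or digit touches the match, which also covers words at the very start or end of the post.

```php
<?php
// Hypothetical helper: builds one pattern from the list and scans the whole post once.
function matchListedWords(string $post, array $badWords): array
{
    if ($badWords === []) {
        return [];
    }

    // Escape each word so characters like "-" or "." are matched literally.
    $escaped = array_map(fn($w) => preg_quote($w, '/'), $badWords);

    // The lookarounds reject a match that has a letter or digit directly
    // before or after it, so "specialist" does not trigger "cialis".
    $pattern = '/(?<![\p{L}\p{N}])(?:' . implode('|', $escaped) . ')(?![\p{L}\p{N}])/iu';

    // preg_match_all() gives 0 for no match and false on error; treat both as "nothing found".
    if (!preg_match_all($pattern, $post, $matches)) {
        return [];
    }

    // Normalise case and drop duplicates.
    return array_values(array_unique(array_map('strtolower', $matches[0])));
}

// Assumed sample list and inputs.
$list = ['cialis', 'ass'];
var_export(matchListedWords('The specialist joined the association.', $list));
var_export(matchListedWords('Buy CIALIS now, bad-ass!', $list));
```

The first call finds nothing; the second reports "cialis" and "ass". Whether a hyphen should count as a delimiter (so that "bad-ass" is flagged) is a policy choice for the site, not something the pattern can decide.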

HTH,

:slight_smile:

Try this for starters:

<?php 
	$badWords = array 
	(
		" ass",
		"-ass",
		" cialis",
		" bad-001",
		" bad-002",
		" bad-003",
		" worse-001",
		" worse-002",
		" worse-003",
		" even-worserer-001",
		" even-worserer-002",
		" even-worserer-003",
	);

	$userContent_001 = 'safe words'; 
	$userContent_002 = 'this is a "bad-ass" test to see if "specialist" or "assist" works with bad-001 or worse-002 or even-worserer-003';

	$words = badWords($userContent_001, $badWords);
	$words = badWords($userContent_002, $badWords);

// Returns the entries of $badWords found in $userContent and, optionally, echoes a report.
// Note: entries with a leading space will not match a word at the very start of the content.
function badWords($userContent, $badWords, $showBadWords=true) 
{
	$result = array();

	foreach($badWords as $badWord) {
		// stripos() is case-insensitive, so "Cialis" and "CIALIS" are caught too
		if (stripos($userContent, $badWord) !== false) {
			$result[] = $badWord;
		}
	}

	if( $showBadWords ) {
		// Escape the user content so posted markup is displayed, not rendered
		$safeContent = htmlspecialchars($userContent);

		if( count($result) ) {
			$msg  = 'BAD words found in :<br /><b>' .$safeContent .'</b>';
			$msg2 = '<pre>Bad Words found: '
				.print_r( $result, true )
			.'</pre>';
			echo $msg .$msg2;
		}else{
			echo 'No BAD words found in :<br /><b>' .$safeContent .'</b><br />';
		}
		echo '<hr />';
	}

	return $result;
}
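If the goal is to replace the words with ***** (as in the original question) rather than just report them, something along these lines might do. It is only a sketch: `maskListedWords()` is a made-up name, PHP 7.4+ is assumed for the arrow functions, and the same "no letter or digit on either side" rule is assumed as the definition of a whole word.

```php
<?php
// Hypothetical helper: replaces each listed word in $post with a fixed mask.
function maskListedWords(string $post, array $badWords, string $mask = '*****'): string
{
    if ($badWords === []) {
        return $post;
    }

    // Escape each word so it is matched literally inside the pattern.
    $escaped = array_map(fn($w) => preg_quote($w, '/'), $badWords);

    // Match a listed word only when no letter or digit touches it on either side.
    $pattern = '/(?<![\p{L}\p{N}])(?:' . implode('|', $escaped) . ')(?![\p{L}\p{N}])/iu';

    // A callback is used so the mask is inserted literally, whatever it contains.
    // preg_replace_callback() returns null on error (e.g. invalid UTF-8);
    // in that case the post comes back unchanged, so a real site may prefer to reject it.
    return preg_replace_callback($pattern, fn($m) => $mask, $post) ?? $post;
}

// Assumed sample list and input.
echo maskListedWords('A bad-ass specialist sells Cialis.', ['cialis', 'ass']);
```

For that input the result is "A bad-***** specialist sells *****." and "specialist" is left alone.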

@John_Betong,

Thanks for the code tip.

This is a crazy week for me - am trying to land a new contract.

I will try out your code as soon as I can, though!
