Getting a list of what was replaced

I’m sure the answer is “No”, but is there some way when using preg_replace to get some sort of list of what was replaced?

I have created a “bad word” checker, and was hoping to get a tally of any bad words found.

For example, it would be useful to know that in a user’s comments they said “Damn” once, “Crap” twice, and “F***” four times!!

Looking in the Manual, it appears the function will return a total count, but what would be much more useful is to get an array containing all offending words so I can then do analysis on the replacement!

Is there any reasonable way I could accomplish this?

According to the online PHP Manual:

mixed preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit = -1 [, int &$count ]] )

Return Values

preg_replace() returns an array if the subject parameter is an array, or a string otherwise.

If matches are found, the new subject will be returned, otherwise subject will be returned unchanged or NULL if an error occurred.
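For reference, here is a minimal sketch of the optional fifth argument from that signature. The patterns and sample text are made up for illustration. It only reports how many replacements were made in total, not which words were hit:

```php
<?php
// Hypothetical patterns and text, for illustration only.
$patterns = ['/\bdamn\b/i', '/\bcrap\b/i'];
$text = 'Damn, this crap is crap.';

// The fifth argument is filled by reference with the total number of
// replacements made across all patterns.
$clean = preg_replace($patterns, '****', $text, -1, $count);

echo $clean, "\n"; // ****, this **** is ****.
echo $count, "\n"; // 3
```

So the count is there, but it is a single total rather than a per-word breakdown.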

@John_Betong,

I am feeding preg_replace a string, so I don’t see how what you posted would help.

Here is an example of what I have so far…

post-comment.php

    $text = "A bird just took a sh1t on my glasses.  Frick all birds!";
    
    replaceBadWords($dbc, $text);

functions.php (snippet below)

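        // $badWordsArray is assumed to hold the regex patterns, built earlier in the function.
        $replacement = '****';

        // Replace every match of every pattern with the mask.
        $new = preg_replace($badWordsArray, $replacement, $text);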

Use the appropriate tool? Run it through preg_match_all first? That provides an array of matches.

I’m not following you or John…

I repeat… ^

Let me explain my desired end goals…

Originally I just wanted to feed my function some block of text (e.g. User Comment) and have it replace any bad words with ****.

So I used preg_replace.

That makes sense, right?

Well, after I got that working last night, I decided it might be useful to do some analysis on the violator and keep some sort of tally on their cussing ways!

So as a new requirement, I wanted to get a list of all bad words from the submitted text. (However, the process above remains unchanged.)

I apologize if what you are suggesting is obvious to you, but I don’t see how using preg_match_all will help my main goal of replacing bad words in a block of text. (To me it sounds like it would just create an array that would be disconnected from the text.)

You have to do both: preg_match_all first to get the list of bad words, then preg_replace to replace them with ****.

So I can keep my current code which has the $patterns as an array, the $replacement as a string, and the $source as a string, and then use preg_match_all as a completely separate process?

Yes, run it before you do the replacement though, as it would be sort of silly to run it on the paragraph that already has the words replaced :wink:
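A rough sketch of that two-step idea, assuming for simplicity that the bad words fit into one pattern. The word list and text here are hypothetical:

```php
<?php
// Hypothetical single pattern covering two words; a real list would be longer.
$pattern = '/\b(?:damn|crap)\b/i';
$text = 'Damn! That crap movie was crap.';

// Step 1: collect every match from the ORIGINAL text.
preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches; lowercase them so "Damn" and "damn"
// are counted together, then tally.
$tally = array_count_values(array_map('strtolower', $matches[0]));

// Step 2: do the replacement exactly as before.
$clean = preg_replace($pattern, '****', $text);

print_r($tally);   // damn => 1, crap => 2
echo $clean, "\n"; // ****! That **** movie was ****.
```

The two calls are independent, so the existing replacement code does not need to change.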

Try using JavaScript’s “match” functions

Thanks, but I don’t know JavaScript.

@cpradio,

I don’t think your suggestion will work, because the Manual says…

I need “pattern” to be an array…

Use a loop then…

Huh? It’s about time you at least learned the basics, or you’ll be left in the dust.
After HTML and CSS it’s a must.

Also… http://stackoverflow.com/questions/683702/how-do-you-perform-a-preg-match-where-the-pattern-is-an-array-in-php
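A minimal sketch of the loop approach when the patterns are in an array. The array contents and text are placeholders, not the real list:

```php
<?php
// Hypothetical array of patterns, one per bad word.
$badWordsArray = ['/\bdamn\b/i', '/\bcrap\b/i'];
$text = 'Damn, this crap is crap.';

// One shared tally that every pattern adds to.
$tally = [];
foreach ($badWordsArray as $pattern) {
    // preg_match_all returns the number of matches (0 if none, false on error).
    if (preg_match_all($pattern, $text, $matches)) {
        foreach ($matches[0] as $word) {
            $key = strtolower($word);
            $tally[$key] = ($tally[$key] ?? 0) + 1;
        }
    }
}

print_r($tally); // damn => 1, crap => 2
```

Each pass of the loop overwrites $matches, and everything ends up in the single $tally array.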

Sounds like the wrong tool to me…

preg_match_all is intended to have a generic pattern and find all occurrences of said pattern and place them in an array.

By contrast, I have a pre-defined list of bad words, and I need to find all occurrences of them in a body of text.

If I used a loop, then that would create a zillion arrays with one-off matches…

There has to be an easier way to do what I want?
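One option not mentioned so far in the thread is preg_replace_callback, which accepts an array of patterns and calls a function for each match, so the replacing and the tallying can happen in the same pass. A sketch, with made-up patterns standing in for the real list:

```php
<?php
// Hypothetical patterns; the real $badWordsArray would come from your own list.
$badWordsArray = ['/\bsh[i1]t\b/i', '/\bfrick\b/i'];
$text = 'A bird just took a sh1t on my glasses.  Frick all birds!';

$tally = [];
$new = preg_replace_callback(
    $badWordsArray,
    function (array $m) use (&$tally) {
        // $m[0] is the text that matched; record it before masking it.
        $word = strtolower($m[0]);
        $tally[$word] = ($tally[$word] ?? 0) + 1;
        return '****';
    },
    $text
);

echo $new, "\n"; // A bird just took a **** on my glasses.  **** all birds!
print_r($tally); // sh1t => 1, frick => 1
```

This keeps the pattern array as it is and yields both the cleaned text and the list of offending words.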

I am only one man! I spend most of my time trying to figure out PHP and MySQL!

Says the person trying to get the matches from preg_replace…

Which preg_match_all will do one word at a time…

Not if you program it properly.

Just the same, you’d best invest some time getting familiar with it; it’s the future of the web.