Applying the '\\' escape character to unescaped single quotes (') with preg_replace()?

I’d like to escape single quotes in a string with the backslash (\) character, but only escaping the single quotes that aren’t already escaped.

To elaborate, the reason I’m doing this is so I can make the string safe for use with the eval() function, e.g. eval('$new = \''.$string.'\';');

At first I just did

$string = str_replace("'", "\\'", $string);

But I ran into problems with strings where a single quote was already escaped. So instead I’m doing

$string = preg_replace("#[^\\\]'#", "\\'", $string);

But I know this is not the correct way to do it, because I can run into strings where the backslash preceding the single quote actually applies to something else rather than the single quote, for instance \\\' or some other sequence of multiple backslashes preceding the single quote.

The regex also doesn’t do what I’d expect. I expected the proper pattern to be #[^\\]'#, but that gives me a “Compilation failed: missing terminating ] for character class” warning, so I used the ^\\\ pattern instead; I don’t understand why ^\\ failed. And the ^\\\ pattern doesn’t match what I’d expect: for instance, \'' turns into \\'. I don’t understand how the regex works there.

What is the proper approach to take here, and what is the reason the ^\\ pattern fails?

First I want to point out that eval is evil!

That being said:

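\\ fails because PHP collapses it to a single \ before the regex engine sees it, and the regex engine then expects a character after that \ to complete the escape (like “n” for \n). In your pattern that character is the ], so the character class is never closed.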

So, you need to use \\\\

Four backslashes; the first escapes the second, the third escapes the fourth, so now there are two, of which the first escapes the second and the result is \, one backslash.

Both the PHP parser and the regex parser use \ as an escape character. First, you need to deal with the PHP parser.


$str = "#[^\\\]'#";
echo $str;

Then you can see what the regex parser gets.
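To make the two layers concrete, here is a small sketch that compares the two-, three- and four-backslash versions of the character class after PHP has processed the string literals. The variable names are only for illustration.

```php
<?php
// Each literal below is what you type in the source file; the comment shows
// what is left once PHP has handled its own escapes, i.e. what the regex
// engine receives.
$two   = "#[^\\]'#";    // becomes #[^\]'#   the "\]" escapes the bracket, so the class never closes
$three = "#[^\\\]'#";   // becomes #[^\\]'#  PHP keeps the unknown escape "\]" as it is
$four  = "#[^\\\\]'#";  // becomes #[^\\]'#  same pattern as $three

echo $two, "\n", $three, "\n", $four, "\n";

// The three- and four-backslash literals produce the identical pattern.
var_dump($three === $four); // bool(true)
```

That is why the three-backslash version happened to compile: PHP leaves an unrecognised escape such as \] untouched, so it ends up as the same pattern as the four-backslash version.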

But…stop and use var_export()
Make sure eval() is really a good choice for whatever it is you’re doing, it often isn’t a good path to continue on.
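A minimal sketch of the var_export() route, assuming the goal is simply to get the string through eval() unchanged. Note that this preserves the value exactly, which is a slightly different goal from leaving existing \' sequences alone; the sample string is made up for illustration.

```php
<?php
// Hypothetical input mixing plain quotes and backslash-quote sequences.
$string = "None ' One \\' Two \\\\' stop";

// With the second argument set to true, var_export() returns a PHP literal
// for the value, with quotes and backslashes escaped as needed.
$code = '$new = ' . var_export($string, true) . ';';

eval($code);

// The evaluated value is identical to the original string.
var_dump($new === $string); // bool(true)
```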

Thanks. However if I apply

$string = preg_replace("#[^\\\\]'#", "\\'", $string);

To the string

$string = "' \\'' '"; // Those are two single quotes, not a double quote

The result is

' \\'\'

Which is the same as with the ^\\\ three backslash pattern. Am I doing something wrong? The result I’d like is

\' \'\' '


$string = "' \\'' '";

$string = preg_replace("#[\\\\']#", "\\'", $string);

If you really want to do this with regular expressions, then use a negative lookbehind assertion to check that a single quote is not preceded by a backslash. E.g.


$subject = "' \\'' '";
$pattern = "/(?<!\\\\)'/";
$replace = "\\'";

echo preg_replace($pattern, $replace, $subject); // Outputs: \' \'\' \'

As raised before, why are you generating strings to be evaled anyway? The solution may be to avoid eval wherever possible.

For more info on regular expression (for example for negative lookbehind), see

Thank you, although this seems to introduce extra \’ instances, for instance it turns $string here into

\' \'\'\' \'

Instead of

\' \'\' '

Thank you, I was unaware of lookbehinds :slight_smile:

As raised before, why are you generating strings to be evaled anyway? The solution may be to avoid eval wherever possible.

I’m sure that would be ideal as you say, but the codebase I’m working with uses eval and it’s probably not feasible for me to try and change this.

I forgot, but the issue of possibly encountering multiple backslashes still seems to remain. For instance:

$string = "None ' One \\' Two \\\\' Three \\\\\\' Four \\\\\\\\' Five \\\\\\\\\\' stop";
$string = preg_replace("#(?<!\\\\)'#", "\\'", $string);
eval('$new = \''.$string.'\';');
echo "$new";

I don’t know how to approach dealing with that. The above gives a parse error because of the backslash complexity: the regex can’t tell whether the backslash is escaping the single quote or something earlier.

And actually for future reference I would like to know how to avoid the use of eval, even if I can’t do that for my current problem, if anyone has a suggestion on how to do that :slight_smile: The more general case of what I’m trying to do is while parsing some XML, given an arbitrary $tagName and $tagValue,

$tagName = preg_replace("#(?<!\\\\)'#", "\\'", $tagName);
$tagValue = preg_replace("#(?<!\\\\)'#", "\\'", $tagValue);
$doString = '$array[\'' . $tagName . '\'] = \'' . $tagValue . '\';';
eval($doString);

I’m trying to assign $array['$tagName'] the value of $tagValue.


$array[$tagName] = $tagValue;

Cute, but sadly not applicable to the issue the codebase is addressing :slight_smile: I’d probably have to communicate the problem more fully, I suppose, but that wouldn’t be practical.

Still would like suggestions on the multiple backslash issue.

You aren’t being particularly forthcoming about why on earth all of this hacking together needs to be done; as the thread progresses, it looks like there would be a better, completely different solution if you weren’t insistent on doing it this way.

That said, you just need to change the regular expression to behave differently given the new set of scenarios that you want it to match and change. Try the following, which adds a backslash before a single quote only if it is preceded by an even number of backslashes (including zero), i.e. only if the quote is not already escaped.


$string = preg_replace("#(?<!\\\\)(?:\\\\{2})*\\K'#", "\\'", $string);
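As a rough check of how the pattern treats different runs of backslashes, here is a small sketch. The function name escape_unescaped_quotes() is hypothetical and only wraps the call; it assumes a PCRE build that supports \K.

```php
<?php
// Hypothetical wrapper around the pattern, for illustration only.
function escape_unescaped_quotes($s)
{
    // In the PHP literal, "\\\\" reaches the regex engine as \\ which
    // matches one literal backslash. (?:\\{2})* therefore consumes
    // backslashes in pairs, and \K drops them from the reported match so
    // that only the quote itself is replaced.
    return preg_replace("#(?<!\\\\)(?:\\\\{2})*\\K'#", "\\'", $s);
}

$bs = "\\"; // a single backslash

// Try a quote preceded by 0, 1, 2 and 3 backslashes.
foreach (array(0, 1, 2, 3) as $n) {
    $in = str_repeat($bs, $n) . "'";
    echo $n, ': ', $in, ' => ', escape_unescaped_quotes($in), "\n";
}
```

With zero or two backslashes the quote is unescaped, so one backslash is added; with one or three it is already escaped and the input comes back unchanged.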

Off Topic:

Wonders whether it would be nice to have a similar @ symbol (as per C#) in PHP to indicate a string that is not escaped
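On that off-topic point: assuming PHP 5.3 or later (newer than the versions discussed in this thread), nowdoc syntax comes fairly close to a verbatim string, since no escape processing happens inside it. A small sketch comparing it with the double-quoted form of the same pattern:

```php
<?php
// Nowdoc: the text between the markers is taken literally, so this is
// exactly what the regex engine receives.
$pattern = <<<'RE'
#(?<!\\)(?:\\{2})*\K'#
RE;

// The same pattern as a double-quoted literal needs each backslash doubled.
$quoted = "#(?<!\\\\)(?:\\\\{2})*\\K'#";

var_dump($pattern === $quoted); // bool(true)
```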

Yes you’re right, the fault in communication is mine - it’s just that I don’t know how to properly condense the problem, and posting the entire scope would probably be asking too much of anyone trying to help :slight_smile:


$string = preg_replace("#(?<!\\\\)(?:\\\\{2})*\\K'#", "\\'", $string);

Amazing! Brilliant work, many thanks :slight_smile:

Er, question - does the \K in the pattern

#(?<!\\\\)(?:\\\\{2})*\\K'#

Require PHP5? The regex originally worked for me, but that was on a server with PHP 5.2. Now I’m trying the same exact code on a different server, and it gives me a Parse Error. This server has PHP 4.4.7. Removing the “\K” gets rid of the parse error, though the result of the pattern is obviously a little different.

Also what is the name of the \K modifier after the non-capturing group? I couldn’t find the documentation for it.

Yes, it is only available as of PHP 5.2.4. There is (a little) documentation here: http://php.net/regexp.reference.backslash Since the current stable version is right up there at 5.2.10, I figured \K was safe enough to make use of… I guess I figured wrong. :blush:

A quick work-around for versions without \K available would be to capture what went before it (i.e. the slashes if any) and use that in the replacement. E.g.


$string = preg_replace("#(?<!\\\\)((?:\\\\{2})*)'#", "$1\\'", $string);
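To check that the capture-group version behaves the same as the \K version, a quick comparison along these lines may help. The sample input is made up, and the \K half of the sketch needs a PCRE build that supports it.

```php
<?php
$withK       = "#(?<!\\\\)(?:\\\\{2})*\\K'#";
$withCapture = "#(?<!\\\\)((?:\\\\{2})*)'#";

// Quotes preceded by 0, 1, 2 and 3 backslashes respectively.
$input = "None ' One \\' Two \\\\' Three \\\\\\' stop";

// \K version: only the quote is part of the match, so only it is replaced.
$a = preg_replace($withK, "\\'", $input);

// Capture version: the leading backslash pairs are matched too, so they
// are put back with $1 before the escaped quote.
$b = preg_replace($withCapture, "$1\\'", $input);

var_dump($a === $b); // bool(true)
echo $b, "\n";       // None \' One \' Two \\\' Three \\\' stop
```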

It’s for use on a web server. My main web server has PHP5, but almost no web hosts have 5; someone would have to have a dedicated machine and then ask their host to upgrade (almost no shared host would upgrade on request).

A quick work-around for versions without \K available would be to capture what went before it (i.e. the slashes if any) and use that in the replacement. E.g.


$string = preg_replace("#(?<!\\\\)((?:\\\\{2})*)'#", "$1\\'", $string);

Awesome :slight_smile: Thanks again.