Replace class in table using preg_replace

Hey Everyone,

I have the string below, and I want to replace the class attribute on all tables (only the classes).
What is the best way to do this using preg_replace?

<table border="0" cellpadding="0" cellspacing="0" id="sheet0" class="sheet0 gridlines tablepress tablepress--matrix">

… why?

Surely this is easier to do by editing the HTML, rather than trying to use a backend engine to replace it on every page load.

It comes from a parser, so it is not replaced on every page load.

Again, it would be easier to change the source than to do it programmatically.

Anyway. What have you tried so far? You’ve said you want to do it with preg_replace, so… what’s your current pattern?

I started with this <table class(.*?)>

Mkay. So let’s consider what you actually want.

  • You want table tags.
  • You only want to modify the class part of it.
  • Table tags can have many attributes.
  • The class attribute is not always in the same place.
  • Class attributes will have the form of:
    • the word class
    • optionally a space.
    • an equals sign
    • optionally another space.
    • a string, encapsulated either with a pair of single quotes, or a pair of double quotes. (If you want to specify that your code only finds double-quoted strings, so be it.)

So; give me a pattern that says the following.

Inside a table tag, find three subpatterns: Everything before the class attribute, the class attribute, and then everything after the class attribute.

<table.class="(.*?)".>

(You had it nearly right the first time. You need to capture all 3 subpatterns, and they need to capture as many characters as necessary, non-greedily.)

So; you’ve now got the building blocks to build your replacement text with.

Backreferences are the key here. Manual source, if needed.

Your replacement string is the string <table, the first subpattern, the string class=", your replacement class(es), the string ", the third subpattern, and then the string >
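To make that concrete, here is a minimal sketch of the three-subpattern idea in PHP. The sample tag is the one from the first post; the class name my-table is only a placeholder, and the pattern assumes double-quoted attributes with no spaces around the equals sign.

```php
<?php
// Sample input taken from the first post in this thread.
$html = '<table border="0" cellpadding="0" cellspacing="0" id="sheet0" class="sheet0 gridlines tablepress tablepress--matrix">';

// Group 1: everything between "<table" and class="
// Group 2: the current class list
// Group 3: everything after the closing quote, up to ">"
$pattern = '/<table(.*?)class="(.*?)"(.*?)>/';

// "my-table" is a hypothetical replacement class.
// $1 and $3 put the first and third subpatterns back unchanged.
$replacement = '<table$1class="my-table"$3>';

$result = preg_replace($pattern, $replacement, $html);
echo $result, PHP_EOL;
```

Group 2 is captured but never used in the replacement, which is what throws the old classes away.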

First of all, thank you for the reply.

When I do this <table.class="(.*?)".>
I only get the first line

https://regexr.com/4qin0

apply the g modifier.
(By default, PCRE will stop after the first match. You have to tell it to search globally.)

Okay, let's say I want to do this in PHP

And I have this replacement

$rawDataFromSheetData = preg_replace("/<table.class=\"(.*?)\".>/gmU", "babs", $rawDataFromSheetData);

I get this message Warning: preg_replace(): Unknown modifier 'g' in.

Besides that, will it replace the complete string (if it works)?

It needed to be
(<table.*class=")(.*?)(".*>) since it was not only one space

preg_replace automatically globalizes its search.

So you’ve got the blocks; your replacement string at that point would be "\1babs\3" (“the first subpattern, then your replacement, then the third subpattern”)
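A small sketch of that in PHP, assuming a made-up two-table input and lazy quantifiers so each match stays inside one tag. One thing to watch: in PHP source the replacement is safer in single quotes (or written as $1 and $3), because inside a double-quoted PHP string \1 is read as an octal character escape before preg_replace ever sees it.

```php
<?php
// Two tables in one string, to show that every match is replaced.
$html = '<table id="a" class="old one"><tr><td>x</td></tr></table>'
      . "\n"
      . '<table id="b" class="old two"><tr><td>y</td></tr></table>';

// No "g" modifier here: PHP rejects it with "Unknown modifier 'g'".
// The lazy quantifiers (.*?) keep each match inside a single tag.
$pattern = '/(<table.*?class=")(.*?)(".*?>)/';

// \1 and \3 refer to groups 1 and 3; "babs" is the placeholder class from this thread.
// -1 means "no limit"; $count receives the number of replacements made.
$result = preg_replace($pattern, '\1babs\3', $html, -1, $count);

echo $result, PHP_EOL;
echo $count, PHP_EOL;
```

Both tables end up with class="babs" without any global flag.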


That first .* should be lazy instead of greedy like it is now. Otherwise it might just match the class of a different tag that follows the table.
Also, the last construct can be a bit more efficient like this ([^"]*)" - which means, 0 or more times anything but a double quote, followed by a double quote.

So in total it would become (<table.*?class=")([^"]*)" and then the replacement would become \1babs".
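Here is a rough sketch of that two-group version in PHP, using a shortened sample tag. The second half is my own assumption rather than something from this thread: if a table has no class attribute at all, .*? can run past the end of the tag, and swapping it for [^>]*? is one possible way to prevent that.

```php
<?php
$html = '<table border="0" id="sheet0" class="sheet0 gridlines"><tr><td class="cell">x</td></tr></table>';

// Group 1: "<table" up to and including class=" (lazy). Group 2: the class list.
$result = preg_replace('/(<table.*?class=")([^"]*)"/', '\1babs"', $html);
echo $result, PHP_EOL;

// Assumed edge case: a table with no class attribute at all.
$noClass = '<table id="plain"><tr><td class="cell">x</td></tr></table>';

// .*? may run past the end of the table tag and hit the td's class instead.
$loose = preg_replace('/(<table.*?class=")([^"]*)"/', '\1babs"', $noClass);

// [^>]*? cannot cross a ">", so a classless table is left untouched.
$strict = preg_replace('/(<table[^>]*?class=")([^"]*)"/', '\1babs"', $noClass);

echo $loose, PHP_EOL;
echo $strict, PHP_EOL;
```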

Hey @rpkamp,

I am always really happy when you respond, since you give a lot of good advice (thank you for it).

What if you want to change only the classes in a PHP preg_replace (since that is what I wanted to do in the first place)? In (<table.*class=")(.*?)(".*>), I group the classes in the middle and replace them with another class. But (<table.*?class=")([^"]*) has only two groups (the first is everything from <table up to class="), and the second I do not really get.

It is a negated set, so it will match everything except the ", right? How do you get all the classes? Or will it keep going until it finds the "?

Because the quantifier on your character class is greedy (*), it will consume all characters until it sees a double quote, but it will not consume said double quote. That said, the way you have written it, [^"]*\w will not do what you expect, because you're saying "consume everything until you see a ", and then match a word character" (\w is a word character; whitespace would be \s). But there isn't a word character at that point - it will always be a double quote, so the engine has to backtrack and give back part of the class value.

Note that when Remon gave you the string, he said (<table.*?class=")([^"]*)" . That last " is not a typo.
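If it helps to see what each part grabs, here is a quick sketch with preg_match on a shortened sample tag (the tag is made up for illustration).

```php
<?php
$tag = '<table id="sheet0" class="sheet0 gridlines tablepress">';

preg_match('/(<table.*?class=")([^"]*)"/', $tag, $m);

// $m[0] is the whole match, including the closing double quote.
// $m[1] is everything from "<table" up to and including class="
// $m[2] is the class list: every character up to, but not including, the next "
print_r($m);
```

The closing quote is part of the match but not part of any group, which is why the replacement \1babs" has to put it back. The > after it is never matched, so it is left alone.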


^ what he said :grin:
