I have some code that replaces double line breaks with <p> and single with <br>. However, I need to have some exceptions such that I don’t add <p>...</p> around <h2>, <h3>, <h4>, <ul> and also inside <ul>...</ul>
By not putting header tags inside of paragraph tags. (In all seriousness, your code takes whatever text you give it and puts a paragraph tag around it regardless. In the case of headers, that's already redundant.)
Your code appears to be designed to wrap non-tagged text with paragraph tags. So the idea would be just to not pass tagged text to this function in the first place.
Well I have database content as a variable that already contains the HTML tags, so unfortunately I need to apply the paragraph processing with them there already.
So if the database content already has the HTML tags, it should already have the paragraph tags too.
It’s probably actually easier for you to edit your database entries to add the paragraph tags than it is to try and do all of this at runtime (repeatedly). But.
(time to spitball. Untested, and probably not the most elegant solution.)
preg_match_all the items you want to skip. Store them in an array.
preg_replace the items to be skipped with a tag, like “###TOREPLACE###”
paragraph wrap your text.
foreach entry in the array stored earlier,
preg_replace /(<p>)?###TOREPLACE###(<\/p>)?/ with the entry, putting a limit of 1 on the replacement. (Note the escaped slash in the closing tag, since / is also the pattern delimiter.)
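To make those steps a bit more concrete, here is a rough sketch of how they might fit together. The function name wrapParagraphsExcept is made up for illustration, and it assumes the blocks to skip are h2-h4 and non-nested ul elements, and that the literal text ###TOREPLACE### never appears in your content. Treat it as a starting point rather than a finished solution.

```php
<?php
// Hypothetical helper (not from the thread): protect block-level elements,
// paragraph-wrap the rest, then put the protected blocks back.
function wrapParagraphsExcept(string $html): string
{
    // Steps 1 and 2: collect h2-h4 and <ul> blocks and swap each for a placeholder.
    // Assumption: lists are not nested; the lazy match stops at the first </ul>.
    $protected = [];
    $text = preg_replace_callback(
        '~<(h[2-4]|ul)\b[^>]*>.*?</\1>~is',
        function (array $m) use (&$protected): string {
            $protected[] = $m[0];
            // Blank lines around the placeholder make it a paragraph of its own.
            return "\n\n###TOREPLACE###\n\n";
        },
        $html
    );

    // Step 3: blank lines separate paragraphs, single newlines become <br>.
    $out = '';
    foreach (preg_split('~\R{2,}~', trim($text), -1, PREG_SPLIT_NO_EMPTY) as $chunk) {
        $out .= '<p>' . preg_replace('~\R~', '<br>', trim($chunk)) . '</p>';
    }

    // Steps 4 and 5: restore the blocks in order, one placeholder at a time,
    // dropping the <p>...</p> that step 3 put around each placeholder.
    // A callback is used so "$" or "\" inside a block is not treated as a backreference.
    foreach ($protected as $block) {
        $out = preg_replace_callback(
            '~(<p>)?###TOREPLACE###(</p>)?~',
            fn (array $m): string => $block,
            $out,
            1
        );
    }

    return $out;
}
```

Calling wrapParagraphsExcept() on text that mixes plain lines with an h2 and a ul should leave the h2 and the ul untouched and wrap everything else.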
In cases like these you’re far better off parsing the string bit by bit, keeping state of what’s going on and acting on that, rather than trying to explode etc, which is almost always an approximation of what needs to happen.
For example, what would happen when a newline occurs within an H2? Stuff like that is really hard to fix when using a rigid explode way.
I would suggest using a package like this to tokenize the HTML, then walking the array and processing it into the structure you need. It would be a bit harder to do than simply explode etc, but the results will be much more robust and less sensitive to whimsical HTML (such as a newline within a header).
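As a rough illustration of the "walk the tokens and keep state" idea, here is a sketch that uses preg_split() as a stand-in for a real tokenizer package, so it will be less forgiving than a proper HTML parser. The names processWithState and wrapPlainText are made up for this example, and the list of protected tags (h2, h3, h4, ul) is taken from the question.

```php
<?php
// Hypothetical helper: wrap loose text in <p> and turn single newlines into <br>.
function wrapPlainText(string $text): string
{
    $out = '';
    foreach (preg_split('~\R{2,}~', trim($text), -1, PREG_SPLIT_NO_EMPTY) as $chunk) {
        $out .= '<p>' . preg_replace('~\R~', '<br>', trim($chunk)) . '</p>';
    }
    return $out;
}

// Hypothetical sketch: walk tag/text tokens and track whether we are inside
// a protected element. Assumption: the input tags are reasonably well formed.
function processWithState(string $html): string
{
    // Crude tokenizer: split on tags and keep the tags themselves as tokens.
    $tokens = preg_split('~(<[^>]+>)~', $html, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    $skipTags = ['h2', 'h3', 'h4', 'ul'];
    $depth = 0;    // number of protected elements currently open
    $buffer = '';  // text and inline tags seen outside protected elements
    $out = '';

    foreach ($tokens as $token) {
        $isSkipTag = preg_match('~^<(/?)([a-z][a-z0-9]*)\b~i', $token, $m)
            && in_array(strtolower($m[2]), $skipTags, true);

        if ($isSkipTag && $m[1] === '') {
            // Opening protected tag: flush any pending loose text first.
            if ($depth === 0) {
                $out .= wrapPlainText($buffer);
                $buffer = '';
            }
            $depth++;
            $out .= $token;
        } elseif ($isSkipTag) {
            // Closing protected tag.
            $out .= $token;
            $depth = max(0, $depth - 1);
        } elseif ($depth > 0) {
            // Inside a protected element: copy verbatim, newlines included.
            $out .= $token;
        } else {
            $buffer .= $token;
        }
    }

    return $out . wrapPlainText($buffer);
}
```

Because the state is a depth counter rather than a flag, a ul nested inside another ul is still copied through untouched, and a newline inside an h2 is left alone instead of being turned into a br.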
Also, processes like these are awesome to code using Test Driven Development because you have well defined inputs and expected outputs.
For example:
use PHPUnit\Framework\TestCase;

class HtmlProcessorTest extends TestCase
{
    // A blank line separates two paragraphs.
    public function testReplaceDoubleNewlineWithParagraph()
    {
        $processor = new HtmlProcessor();
        $this->assertEquals('<p>Hello</p><p>World</p>', $processor->process("Hello\n\nWorld"));
    }

    // A single newline becomes a line break.
    public function testReplaceNewlineWithBr()
    {
        $processor = new HtmlProcessor();
        $this->assertEquals('Hello<br />World', $processor->process("Hello\nWorld"));
    }

    // Headers pass through unchanged, with no <p> added around them.
    public function testDoNotWrapHeaders()
    {
        $processor = new HtmlProcessor();
        $this->assertEquals('<h1>Hello World</h1>', $processor->process('<h1>Hello World</h1>'));
    }
}
You can then execute these tests using PHPUnit.
This has the advantages that:
You can see that what you wrote actually works
You can see that any addition/modification to the code doesn’t break behaviour of previously written code
Your coworkers can read the tests and see what the code is supposed to do
It costs some time to write the tests, but in the end you’ll find it will be well worth it.