PHP function to turn double new lines into paragraph but with exceptions

skyline · February 15, 2019, 7:54am

I have some code that replaces double line breaks with <p> and single with <br>. However, I need to have some exceptions such that I don’t add <p>...</p> around <h2>, <h3>, <h4>, <ul> and also inside <ul>...</ul>

How can I achieve this?

$text = nl2br($text, false);
$text = '<p>' . preg_replace('#(<br>[\r\n\s]+){2}#', '</p><p>', $text) . '</p>';

m_hutley · February 15, 2019, 8:53am

By not putting header tags inside of paragraph tags. (In all seriousness, your code takes whatever text you give it and puts a paragraph tag around it regardless. In the case of headers, that's redundant.)

Your code appears to be designed to wrap non-tagged text with paragraph tags. So the idea would be just to not pass tagged text to this function in the first place.

skyline · February 15, 2019, 8:57am

Thanks for the reply!

Well I have database content as a variable that already contains the HTML tags, so unfortunately I need to apply the paragraph processing with them there already.

m_hutley · February 15, 2019, 9:29am

So if the database content already has the HTML tags, it should already have the paragraph tags too.

It’s probably actually easier for you to edit your database entries to add the paragraph tags than it is to try and do all of this at runtime (repeatedly). But.

(Time to spitball. Untested, and probably not the most elegant solution.)

preg_match_all the items you want to skip. Store them in an array.
preg_replace the items to be skipped with a placeholder, like “###TOREPLACE###”.
Paragraph-wrap your text.
foreach entry in the array stored earlier,
preg_replace /(<p>)?###TOREPLACE###(<\/p>)?/ with the entry, putting a limit of 1 on the replacement. (Note the escaped slash in <\/p>, since / is also the pattern delimiter.)
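
The steps above can be sketched roughly as follows. This is only one way to read them, not tested against real CMS content. The function name, the numbered placeholders and the tag list (h2 to h4 and ul) are assumptions made for illustration. Numbering the placeholders avoids the need for a replacement limit, and the wrapper is only stripped when a paragraph contains nothing but the placeholder.

```php
<?php
// Hypothetical helper: name, placeholder format and tag list are assumptions.
function wrapParagraphsWithPlaceholders(string $text): string
{
    // Normalise line endings so that "\n\n" reliably marks a paragraph break.
    $text = str_replace(["\r\n", "\r"], "\n", $text);

    // 1. Pull out the blocks that must stay untouched and remember them.
    $saved = [];
    $text = preg_replace_callback(
        '#<(h[2-4]|ul)\b[^>]*>.*?</\1>#is',
        function (array $m) use (&$saved): string {
            $saved[] = $m[0];
            return '###TOREPLACE' . (count($saved) - 1) . '###';
        },
        $text
    );

    // 2. Paragraph-wrap what is left; single newlines become <br>.
    $html = '';
    foreach (preg_split('/\n{2,}/', trim($text), -1, PREG_SPLIT_NO_EMPTY) as $block) {
        $html .= '<p>' . nl2br(trim($block), false) . '</p>';
    }

    // 3. Put the saved blocks back. The wrapped form is replaced first so that
    //    a paragraph holding only a placeholder loses its <p>...</p> pair.
    foreach ($saved as $i => $block) {
        $placeholder = '###TOREPLACE' . $i . '###';
        $html = str_replace(['<p>' . $placeholder . '</p>', $placeholder], $block, $html);
    }

    return $html;
}
```

Because the pattern uses a lazy match, a <ul> nested inside another <ul> would be cut at the first closing tag, so this sketch only suits flat lists.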

skyline · February 15, 2019, 10:45am

Well, it’s a custom CMS, so the content is saved with \n line breaks, unfortunately.

I was actually thinking it would be best to split on double line breaks (not single ones), then take each array item and check whether it contains certain HTML tags.

Then skip the ones that do, and add opening and closing p tags to the ones that don’t. Then recombine everything into one variable at the end.

Say I have an array with HTML tags to ignore. What’s the best code for this approach?
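
A minimal sketch of that split-and-check idea might look like the following. The function name, its signature and the default tag list are assumptions for illustration, and the check only looks at the tag a block starts with.

```php
<?php
// Hypothetical helper: name, signature and default tag list are assumptions.
function wrapBlocks(string $text, array $skipTags = ['h2', 'h3', 'h4', 'ul']): string
{
    // Normalise line endings, then split on runs of two or more newlines.
    $text = str_replace(["\r\n", "\r"], "\n", $text);
    $blocks = preg_split('/\n{2,}/', trim($text), -1, PREG_SPLIT_NO_EMPTY);

    // Matches a block that begins with one of the tags to skip,
    // e.g. <h2> or <ul class="x">.
    $skipPattern = '#^<(?:' . implode('|', array_map('preg_quote', $skipTags)) . ')\b#i';

    $out = [];
    foreach ($blocks as $block) {
        $block = trim($block);
        $out[] = preg_match($skipPattern, $block)
            ? $block                                  // leave tagged blocks alone
            : '<p>' . nl2br($block, false) . '</p>';  // wrap everything else
    }

    // Recombine into one string.
    return implode("\n", $out);
}
```

One known weakness: a <ul> that itself contains a blank line is split into several blocks, and the later pieces no longer start with a skipped tag, so they would be wrapped by mistake.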

rpkamp · February 17, 2019, 12:52pm

In cases like these you’re far better off parsing the string bit by bit, keeping track of the current state and acting on that, rather than trying to explode etc., which is almost always an approximation of what needs to happen.

For example, what would happen when a newline occurs within an H2? Stuff like that is really hard to fix when using a rigid explode approach.

I would suggest using a package like this to tokenize the HTML, then walking the array and processing it into the structure you need. It would be a bit harder to do than a simple explode, but the results will be much more robust and less sensitive to whimsical HTML (such as a newline within a header).
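
To illustrate the idea of walking tokens while keeping state, here is a rough sketch. It does not use any tokenizer package: preg_split with PREG_SPLIT_DELIM_CAPTURE stands in for one, which is an assumption on my part, as are the function name and the tag list. The only state kept is a depth counter for the protected elements.

```php
<?php
// Hypothetical function: preg_split stands in for a real HTML tokenizer.
function processWithState(string $text): string
{
    $text = str_replace(["\r\n", "\r"], "\n", $text);

    // Tokens are either protected opening/closing tags or the text between them.
    $tokens = preg_split(
        '#(</?(?:h[2-4]|ul)\b[^>]*>)#i',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $depth = 0;   // how many protected elements we are currently inside
    $html = '';
    foreach ($tokens as $token) {
        if (preg_match('#^<(/?)(?:h[2-4]|ul)\b#i', $token, $m)) {
            // Opening tag raises the depth, closing tag lowers it.
            $depth = $m[1] === '/' ? max(0, $depth - 1) : $depth + 1;
            $html .= $token;
        } elseif ($depth > 0) {
            // Inside a header or list: copy through untouched.
            $html .= $token;
        } else {
            // Plain text: double newlines make paragraphs, single ones <br>.
            foreach (preg_split('/\n{2,}/', trim($token), -1, PREG_SPLIT_NO_EMPTY) as $para) {
                $html .= '<p>' . nl2br(trim($para), false) . '</p>';
            }
        }
    }

    return $html;
}
```

With the depth counter, a newline inside an <h2> and a <ul> nested in another <ul> both pass through unchanged, which is where the explode-based approaches struggle. A real tokenizer would still handle malformed markup far better than this regex.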

rpkamp · February 17, 2019, 1:11pm

Also, processes like these are awesome to code using Test Driven Development because you have well defined inputs and expected outputs.

For example:

use PHPUnit\Framework\TestCase;

// HtmlProcessor is the class under test; it still has to be written.
class HtmlProcessorTest extends TestCase
{
    public function testReplaceDoubleNewlineWithParagraph()
    {
        $processor = new HtmlProcessor();
        $this->assertEquals('<p>Hello</p><p>World</p>', $processor->process("Hello\n\nWorld"));
    }

    public function testReplaceNewlineWithBr()
    {
        $processor = new HtmlProcessor();
        $this->assertEquals('Hello<br />World', $processor->process("Hello\nWorld"));
    }

    public function testDoNotWrapHeaders()
    {
        // A header passed in must come back unchanged, without <p> around it.
        $processor = new HtmlProcessor();
        $this->assertEquals('<h1>Hello World</h1>', $processor->process('<h1>Hello World</h1>'));
    }
}

You can then execute these tests using PHPUnit.

This has the advantages that:

You can see that what you wrote actually works
You can see that any addition/modification to the code doesn’t break behaviour of previously written code
Your coworkers can read the tests and see what the code is supposed to do

It costs some time to write the tests, but in the end you’ll find it will be well worth it.
