Regex to replace li tags with asterisk

Studio_Junkies · January 10, 2010, 7:53pm

I’m working with TinyMCE and want to let the editor toggle off HTML mode. What I’m struggling with is converting list items into asterisks:

<ul>
<li>Bullet 1</li>
<li>Bullet 2</li>
<li>Bullet 3</li>
</ul>

Should become

* Bullet 1
* Bullet 2
* Bullet 3

I’ve used a similar regex to convert paragraphs to "\n\n$1\n\n" and that is working, but I can’t seem to get the regex to work for list items. Here’s my code:


// replace p tags with line breaks
strippedValue = strippedValue.replace(/<p>([^<\/p>]*)<\/p>/ig, "\n\n$1\n\n");

alert(strippedValue);

// replace list items with asterisks
strippedValue = strippedValue.replace(/<li>([^<\/li>]*)<\/li>/ig, "* $1\n");

alert(strippedValue);

At both alerts, the content remains the same:

<ul><li>Bullet 1</li><li>Bullet 2</li><li>Bullet 3
</li></ul>

PhilipToop · January 10, 2010, 8:35pm

<li>([^<\/li>]*)<\/li>

You are looking for a string that begins with <li>, finishes with </li>, and has only characters other than <, /, l, i and > in between. Since the text "Bullet" contains the letter l, no match is made and no substitutions are done.

Try

<li>(.*?)<\/li>
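
To make the difference concrete, here is a small sketch run against a made-up sample string (the variable names are illustrative and not from the original code). It suggests that the negated character class trips over the "l" in "Bullet", while the lazy (.*?) version picks up each item separately.

```javascript
// Hypothetical sample input; all names here are for illustration only.
const sample = "<ul><li>Bullet 1</li><li>Bullet 2</li></ul>";

// [^<\/li>] is a character class: it excludes the single characters
// '<', '/', 'l', 'i' and '>', not the sequence "</li>".
const classPattern = /<li>([^<\/li>]*)<\/li>/ig;

// (.*?) is lazy: it stops at the first </li> it can reach.
const lazyPattern = /<li>(.*?)<\/li>/ig;

const classResult = sample.replace(classPattern, "* $1\n");
const lazyResult = sample.replace(lazyPattern, "* $1\n");

console.log(classResult === sample); // true: nothing was replaced
console.log(lazyResult);
```

Note that the surrounding <ul> tags are left in place by both patterns; they would need their own replacement.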

Studio_Junkies · January 10, 2010, 9:46pm

Ah, yes, I see the problem: square brackets match any single one of the characters within. That greedy .* was dumping all list items onto one line; I’ve got it working with this:


strippedValue = strippedValue.replace(/<li[^>]*>([^<]*)<\/li>/ig, "* $1\n");

But it’s asking for trouble when someone uses < within the list item. Is there a way to use regex to match what I originally wanted?

Assign to $1 all characters after <li> and before the next occurrence of </li>. I thought maybe

[^(?:<\/li>)]

would do it, or maybe

(^<\/li>)

but the ^ doesn’t appear to work within parentheses…
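
For what it's worth, one way to express "anything that is not the start of </li>" is a negative lookahead, (?!...), rather than ^ inside a group. The sketch below is only an illustration under that assumption; the pattern name and sample input are hypothetical.

```javascript
// Sketch only: consume one character at a time, as long as the current
// position is not the start of "</li>". [\s\S] also matches line breaks.
const itemPattern = /<li[^>]*>((?:(?!<\/li>)[\s\S])*)<\/li>/ig;

// Hypothetical input with a '<' and a line break inside list items.
const html = "<ul><li>a < b</li><li>Bullet 2\n</li></ul>";

const text = html.replace(itemPattern, "* $1\n");
console.log(text);
```

Here $1 still contains the line break that sat before the second closing tag, so that item ends with two line breaks.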

Studio_Junkies · January 11, 2010, 2:21am

Did a bit more reading and found that (.*?) is not greedy; the problem was caused by the markup having a newline character before the last closing </li> tag. The . operator doesn’t match line breaks, so I’ve switched to the common workaround, [\s\S], and it works. Here’s the final code:


strippedValue = strippedValue.replace(/<p[^>]*>([\s\S]*?)<\/p>/ig, "$1\n");
strippedValue = strippedValue.replace(/<li[^>]*>([\s\S]*?)<\/li>/ig, "* $1\n");
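
As a rough, self-contained way to try the two replacements above, they could be wrapped in a helper like the one below. The function name, the trim() call and the removal of the <ul>/<ol> wrappers are assumptions added for illustration; they are not part of the code above.

```javascript
// Hypothetical helper; the name, the trimming step and the wrapper
// removal are assumptions, not part of the original code.
function htmlToPlainText(html) {
  return html
    // paragraphs become their content followed by a line break
    .replace(/<p[^>]*>([\s\S]*?)<\/p>/ig, "$1\n")
    // list items become "* item"; trim() drops a stray line break
    .replace(/<li[^>]*>([\s\S]*?)<\/li>/ig, (match, item) => "* " + item.trim() + "\n")
    // remove the surrounding <ul>/<ol> tags
    .replace(/<\/?(?:ul|ol)[^>]*>/ig, "");
}

// Same markup as in the first post, including the line break before </li>.
const input = "<ul><li>Bullet 1</li><li>Bullet 2</li><li>Bullet 3\n</li></ul>";
const output = htmlToPlainText(input);
console.log(output);
```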

Thanks for helping, Philip!

PhilipToop · January 11, 2010, 9:21am

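In the same way that you put i (case-insensitive) and g (global) at the end, you can also put s (treat as a single line) so that . matches across lines. Note that JavaScript only supports the s flag in ES2018 and later engines; elsewhere, [\s\S] remains the workaround.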
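
A quick sketch of the s flag in JavaScript, assuming an engine that implements the ES2018 dotAll flag (older engines reject it as a syntax error). The variable names and sample item are made up for illustration.

```javascript
// Requires an engine with the ES2018 dotAll flag (s).
const withDotAll = /<li[^>]*>(.*?)<\/li>/gis;
const withoutDotAll = /<li[^>]*>(.*?)<\/li>/gi;

// Hypothetical item with a line break before the closing tag.
const item = "<li>Bullet 3\n</li>";

const plain = item.replace(withoutDotAll, "* $1");
const dotAll = item.replace(withDotAll, "* $1");

console.log(plain === item); // true: '.' stops at the line break, so no match
console.log(JSON.stringify(dotAll));
```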
