Parsing simple markdown markup

I’m creating a parser for text files that contain a very limited subset of markdown commands (strong, em, h2, h3 and h4). (It seems a bit highbrow to call it a parser!)

strong and em are simple enough but the headers are giving me a headache as the end hashes are optional.

My code for strong is

$para = preg_replace('#\*{2}(.*?)\*{2}#', '<strong>$1</strong>', $para);
$para = preg_replace('#\_{2}(.*?)\_{2}#', '<strong>$1</strong>', $para);

Can anyone suggest the regex for h2 which would work with

## Header 2

or

## Header 2 ##

And I believe even

## Header 2 #####

is valid!

Yes, change your second {2} to {2,} (which handles situations 2 and 3, the closing ## and the closing #####).

To handle a missing closing sequence, you can wrap it in ()? to make it optional.

Something like the following “should” work, in theory

'#\#{2}(.*?)(\#{2,})?#'

But in practice, I seem to be missing something…
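A likely reason, sketched below rather than stated as fact about anyone's setup: the lazy (.*?) is allowed to match nothing, and because the closing group is optional, the engine is satisfied as soon as it has consumed the opening ##. The sample string and the <h2> replacement are assumptions chosen for illustration.

```php
<?php
// Sample input (assumed for illustration).
$para = '## Header 2 ##';

// (.*?) matches as little as possible, and (\#{2,})? may match nothing,
// so each match covers only a bare "##" and $1 is empty.
$result = preg_replace('#\#{2}(.*?)(\#{2,})?#', '<h2>$1</h2>', $para);

echo $result, "\n";
// prints: <h2></h2> Header 2 <h2></h2>
```

So the header text is never captured; both runs of ## are simply replaced with empty tags.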

Got it!

$para = "## HEADER ##############";
$para = preg_replace('#\#{2}([^\#]+)\#*#', '<strong>$1</strong>', $para);
echo $para;

$para = "## HEADER ##";
$para = preg_replace('#\#{2}([^\#]+)\#*#', '<strong>$1</strong>', $para);
echo $para;

$para = "## HEADER";
$para = preg_replace('#\#{2}([^\#]+)\#*#', '<strong>$1</strong>', $para);
echo $para;

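The first two produce:

<strong> HEADER </strong>

The third produces <strong> HEADER</strong> (no trailing space inside the tag, because there is no space after HEADER to capture).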

Many thanks, cpradio. I’m still trying to understand the regex but the main thing is it works!

Thanks again G :slight_smile:

Ah, I’m more than willing to explain it :smile:

#\#{2}([^\#]+)\#*#

The first and last # are your delimiters, telling PHP that the expression sits between these two characters. That is also why every literal # inside the expression is escaped as \#.

The \#{2} matches exactly 2 # signs.

The ([^\#]+) is my hack to capture ALL characters that are not a # sign and to ensure at least 1 such character exists (that is the + quantifier).

The \#* tells it to find zero or more ending # signs (thus making it optional, as it permits 0 matches)
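The same breakdown can be written into the pattern itself. This is only a sketch: it swaps the delimiter to ~ and adds the x (extended) flag so the pattern can carry comments, and it uses <h2> as the replacement on the assumption that this is the tag you ultimately want. Neither change is required for the regex to work.

```php
<?php
// With the x flag, whitespace in the pattern is ignored and an unescaped #
// starts a comment, so the literal hashes outside the character class stay escaped.
$pattern = '~
    \#{2}      # exactly two opening hashes
    ([^#]+)    # capture one or more characters that are not a hash
    \#*        # zero or more closing hashes
~x';

$inputs = ['## HEADER ##############', '## HEADER ##', '## HEADER'];

foreach ($inputs as $input) {
    // The captured text keeps its surrounding spaces; trim() it if that matters.
    echo preg_replace($pattern, '<h2>$1</h2>', $input), "\n";
}
```

Note that the capture group stops at the first # it meets, so a header whose text contains a # would be cut short.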

Got it! Cheers G :slight_smile:

Should there be some kind of whitespace handling in there so that an unterminated header won’t get carried over to subsequent lines? As it stands, [^\#]+ also matches newlines.

Good thinking, Batman. I think what I have will do for my purposes though. Thanks, G :slight_smile:
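For anyone who does need the line handling, here is a minimal sketch of one way to do it. It assumes the text uses \n line endings and that a header must start at the beginning of a line. The function name convert_headers is hypothetical, and covering h2 to h4 in a single pass is my own extension of the h2 case discussed above.

```php
<?php
/**
 * Hypothetical helper: turns lines starting with 2 to 4 hashes into
 * <h2>..<h4>, with optional closing hashes. Other lines are left alone.
 */
function convert_headers(string $text): string
{
    // ^ and $ match at line boundaries because of the m flag, and . never
    // matches a newline, so a header cannot run on into the next line.
    $pattern = '~^(#{2,4})[ \t]+(.+?)[ \t]*#*[ \t]*$~m';

    return preg_replace_callback($pattern, function (array $m): string {
        $level = strlen($m[1]); // number of opening hashes gives the level
        return "<h{$level}>" . $m[2] . "</h{$level}>";
    }, $text);
}

echo convert_headers("## Header 2 #####\nBody text\n### Header 3"), "\n";
```

Because the header text is matched with a lazy (.+?) rather than [^\#]+, a # in the middle of the text survives, and the surrounding spaces are no longer captured.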