I’m creating a parser for text files that contain a very limited subset of markdown commands (strong, em, h2, h3 and h4). (It seems a bit highbrow to call it a parser!)
strong and em are simple enough but the headers are giving me a headache as the end hashes are optional.
My code for strong is
$para = preg_replace('#\*{2}(.*?)\*{2}#', '<strong>$1</strong>', $para);
$para = preg_replace('#\_{2}(.*?)\_{2}#', '<strong>$1</strong>', $para);
Can anyone suggest the regex for h2 which would work with
## Header 2
or
## Header 2 ##
And I believe even
## Header 2 #####
is valid!
Yes, change your second {2} to {2,}, which handles situations 2 and 3 (a closing ## or any longer run of closing hashes).
To handle a missing closing run, you can wrap it in ()? to make it optional.
Something like the following “should” work, in theory
'#\#{2}(.*?)(\#{2,})?#'
But in practice, I seem to be missing something…
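For anyone wondering what goes wrong, here is a small sketch that runs the pattern above as posted (only the `<h2>` replacement string is my assumption). The lazy `(.*?)` is allowed to match nothing, and the optional closing group is allowed to be skipped, so the shortest successful match is just the two opening hashes.

```php
<?php
// The pattern from the post above, unchanged.
$pattern = '#\#{2}(.*?)(\#{2,})?#';

// The lazy (.*?) captures an empty string and the optional group is skipped,
// so each run of "##" is replaced on its own and the title is left outside.
$result = preg_replace($pattern, '<h2>$1</h2>', '## Header 2 ##');
echo $result, "\n"; // <h2></h2> Header 2 <h2></h2>
```

Nothing after `(.*?)` forces it to consume the title, which is why the approach that follows switches to a greedy "anything that is not a hash" class.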
Got it!
$para = "## HEADER ##############";
$para = preg_replace('#\#{2}([^\#]+)\#*#', '<strong>$1</strong>', $para);
echo $para;
$para = "## HEADER ##";
$para = preg_replace('#\#{2}([^\#]+)\#*#', '<strong>$1</strong>', $para);
echo $para;
$para = "## HEADER";
$para = preg_replace('#\#{2}([^\#]+)\#*#', '<strong>$1</strong>', $para);
echo $para;
All of the above produce the following (the unterminated "## HEADER" version has no trailing space before the closing tag):
<strong> HEADER </strong>
Many thanks, cpradio. I’m still trying to understand the regex but the main thing is it works!
Thanks again G
Ah, I’m more than willing to explain it
#\#{2}([^\#]+)\#*#
Obviously the first and last # are your delimiters, telling you the expression resides in between these two characters.
The \#{2} matches exactly 2 # signs.
The ([^\#]+) is my hack to find ALL characters that are not a # sign and to ensure at least 1 character that is not a # sign exists (that is the + quantifier)
The \#* tells it to find zero or more ending # signs (thus making it optional, as it permits 0 matches)
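To see the pieces at work, a minimal sketch using `preg_match` with the same pattern prints the whole match next to the captured group for the three header styles (the `$lines` and `$captures` names are mine, just for illustration).

```php
<?php
$pattern = '#\#{2}([^\#]+)\#*#';
$lines = ['## HEADER', '## HEADER ##', '## HEADER ##############'];

$captures = [];
foreach ($lines as $line) {
    // $m[0] is the whole match, $m[1] is what ([^\#]+) captured.
    preg_match($pattern, $line, $m);
    $captures[] = $m[1];
    echo "whole match: [{$m[0]}]  captured: [{$m[1]}]\n";
}
```

In every case the whole match covers the entire line, closing hashes included, while the captured group holds only the title text plus its surrounding spaces.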
Should there be some kind of whitespace handling in there so that unterminated headers won’t get carried over to subsequent lines?
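The concern is real when a whole multi-line file is passed in at once, because [^\#]+ also matches newlines. Below is one possible sketch of a line-anchored variant, not the thread's solution: the `convertH2` name, the `~` delimiter and the `<h2>` output are my assumptions, and it assumes the text uses "\n" line endings.

```php
<?php
// With the pattern from the thread, an unterminated header swallows the next line.
$original = preg_replace('#\#{2}([^\#]+)\#*#', '<strong>$1</strong>', "## First\nplain text");
echo $original, "\n";

// Hypothetical helper (name is illustrative, not from the thread).
function convertH2(string $text): string
{
    // m flag: ^ and $ match at the start and end of each line.
    // [ \t]+ after ## means "### ..." lines are left alone for an h3 rule.
    // [^#\r\n]+? keeps the title on one line; the trailing parts trim spaces and closing hashes.
    return preg_replace(
        '~^##[ \t]+([^#\r\n]+?)[ \t]*#*[ \t]*$~m',
        '<h2>$1</h2>',
        $text
    );
}

$converted = convertH2("## First\nplain text\n## Second ##\n## Third #####");
echo $converted, "\n";
```

Like the original pattern, this sketch does not match titles that themselves contain a # character.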
Good thinking, Batman. I think what I have will do for my purposes though - thanks G