Extract a number from inside an HTML comment

What is the best way to extract 12345 when it resides in an HTML comment tag like this:

<!--12345-->

Thank you.

Doing some more searching online, I came across what I think will work: substr.

$string = "<!--12345--> a small html comment in my example";

$sID = substr($string, 4, 5);

echo $sID;

This returns 12345, but how can I accommodate larger or smaller numbers? If 67812545 is the number, the 4, 5 parameters in substr will no longer extract the full number.

Please advise.

Before you end up running around in circles, make sure the spec is right. For example, I notice in your second post you added the text " a small html comment in my example" after the <!--12345--> that you listed in the first post. Can you likewise have text in front:

here it is <!--12345-->

Do you allow spaces, as:

<!-- 12345 -->

What about fractional numbers?


There are probably other cases I didn’t think of. All of these can complicate the solution for what was originally a very simple extract.
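To make the effect of those spec questions concrete, here is a rough sketch of a regular expression that tolerates each of the variations listed above: text in front of the comment, optional spaces inside it, and an optional fractional part. The function name extractCommentNumber is made up for illustration, and whether all of these cases should be accepted is exactly the spec decision being discussed.

```php
<?php
// Hypothetical helper (not from the thread): returns the number found in the
// first comment-like span that holds nothing but a number, or null if none.
function extractCommentNumber(string $html): ?string
{
    // <!--          opening marker, may appear anywhere in the string
    // \s*           optional whitespace inside the comment
    // \d+(?:\.\d+)? digits with an optional fractional part
    // -->           closing marker
    if (preg_match('/<!--\s*(\d+(?:\.\d+)?)\s*-->/', $html, $m) === 1) {
        return $m[1];
    }
    return null;
}

echo extractCommentNumber('<!--12345--> a small html comment in my example'), "\n"; // 12345
echo extractCommentNumber('here it is <!-- 67812545 -->'), "\n";                    // 67812545
echo extractCommentNumber('<!--123.45-->'), "\n";                                    // 123.45
```

Each extra case the pattern accepts is one more thing to test, which is why it pays to settle the spec first.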

I guess it might be possible to use DOM functions to extract comment nodes, but I don’t know as I’ve never needed to try it that way.
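For anyone curious about the DOM route, a minimal sketch might look like the following. It assumes PHP's DOM extension is available and uses DOMDocument together with the XPath comment() node test. The helper name commentTexts is invented here, and the sample string is wrapped in a <p> element purely to give the parser ordinary markup to work with.

```php
<?php
// Hypothetical helper (not from the thread): collects the text of every
// comment node that the DOM extension finds in an HTML string.
function commentTexts(string $html): array
{
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    $xpath = new DOMXPath($doc);
    // comment() is the XPath node test that selects comment nodes
    foreach ($xpath->query('//comment()') as $comment) {
        $texts[] = trim($comment->nodeValue);
    }
    return $texts;
}

print_r(commentTexts('<p><!--12345--> a small html comment in my example</p>'));
```

This is heavier than a string function, so it probably only makes sense when the HTML is being parsed anyway.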

The more common way is to use a regular expression, for example with preg_match().

This requires accessing the HTML as a string, so depending on how big that might be there may be a performance hit, negligible or otherwise.

The string will ALWAYS begin with the following:

<!--12345-->

There will not be any spaces in the HTML comment. Does this help?

The string can sometimes include several paragraphs of text, equating to hundreds of words in length. I’m not sure how much text would need to be involved in this function to take a performance hit.
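With that tighter spec (the comment always comes first and has no spaces inside), one possible sketch is a regular expression anchored to the start of the string. Because of the ^ anchor the pattern is only attempted at the very beginning, so the paragraphs of text after the comment should not need to be scanned. The function name leadingCommentId is hypothetical.

```php
<?php
// Hypothetical helper: assumes the string starts with <!--digits--> and
// returns the digits as a string, or null when the string does not fit.
function leadingCommentId(string $s): ?string
{
    // ^ pins the match to the start of the string; \d+ accepts any number of digits
    return preg_match('/^<!--(\d+)-->/', $s, $m) === 1 ? $m[1] : null;
}

echo leadingCommentId('<!--12345--> a small html comment in my example'), "\n"; // 12345
echo leadingCommentId('<!--67812545--> several paragraphs of text'), "\n";      // 67812545
var_dump(leadingCommentId('no comment here'));                                  // NULL
```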

One option might be:

$n = substr($s,4)-strstr($s,"-->");

Given your last comment about having a lot of text after the closing comment, I’m wondering though whether there’s a better solution that doesn’t involve subtracting the string like that.
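One possible alternative that avoids the subtraction, assuming (as stated earlier in the thread) that the string always starts with <!-- followed directly by digits: strspn() can count how many digit characters follow the opening marker, and substr() can then take exactly that many. This is only a sketch, and the variable names are illustrative.

```php
<?php
$s = "<!--67812545--> a small html comment in my example";

// Count the run of digit characters starting at offset 4, just past "<!--"
$len = strspn($s, '0123456789', 4);

// Take exactly that many characters; the text after the digits is not needed
$n = substr($s, 4, $len);

echo $n; // 67812545
```

Since it never relies on string-to-number conversion, this version also sidesteps the type juggling that the subtraction depends on.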

May the Lord bless you tracknut for your kindness. What you have supplied works perfectly.

Thank you.

No problem. If you’re concerned about speed, I’m going to guess this:

$n = substr($s,4,strpos($s,"-->")-4);

would be faster. I haven’t timed it though.
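The strpos() version assumes the closing --> is always present. As a sketch only, the same idea can be wrapped in a function that also checks that assumption and confirms that what it extracted is all digits. The function name idFromComment is invented for this example.

```php
<?php
// Hypothetical wrapper around the substr()/strpos() approach.
function idFromComment(string $s): ?string
{
    if (strncmp($s, '<!--', 4) !== 0) {
        return null;                  // does not start with a comment
    }
    $end = strpos($s, '-->', 4);      // position of the closing marker
    if ($end === false) {
        return null;                  // comment is never closed
    }
    $n = substr($s, 4, $end - 4);     // text between the two markers
    return ctype_digit($n) ? $n : null;
}

echo idFromComment('<!--12345--> a small html comment in my example'), "\n"; // 12345
var_dump(idFromComment('<!--12345 with no closing marker'));                  // NULL
```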

What I’ve used before is a recursive directory search to get all the PHP files, then file_get_contents() to put the PHP code into a string so I could then use preg_match() on it.

No way I could have got this information without it.

A list of the 168 PHP Classes and 3,856 PHP Functions found in WordPress version 3.0

PHP and HTML files are both text files, so it works.

The trick is getting the regex right.
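The workflow described above could look roughly like the following sketch. It uses PHP's SPL directory iterators for the recursive search. The function name, the placeholder path, and the regex (which looks for function declarations) are assumptions made for illustration, not the code actually used to build the WordPress list.

```php
<?php
// Hypothetical sketch: collect names declared with "function name(" in every
// .php file under $dir.
function findFunctionNames(string $dir): array
{
    $names = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if (!$file->isFile() || strtolower($file->getExtension()) !== 'php') {
            continue; // only look at PHP files
        }
        $code = file_get_contents($file->getPathname()); // whole file as a string
        // preg_match_all() rather than preg_match(), since one file can declare many functions
        if (preg_match_all('/\bfunction\s+&?\s*(\w+)\s*\(/', $code, $m)) {
            $names = array_merge($names, $m[1]);
        }
    }
    sort($names);
    return $names;
}

// Usage (the path is a placeholder):
// print_r(findFunctionNames('/path/to/wordpress'));
```

A pattern this simple will also match the word "function" inside comments and strings, which is one reason getting the regex right takes some care.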

Thank you Mittineague.

I’ve always found strtr very helpful when trying to strip or replace parts of a string. You can pass it an array of replacements, so instead of writing a separate line for each thing you want to strip, it all fits in one line.

$string = "<!--12345--> a small html comment in my example";
$array = ['<!-- ' => '', '<!--' => '', ' -->' => '', '-->' => ''];
$string = strtr($string, $array);
echo $string; // Outputs "12345 a small html comment in my example": the <!-- and --> markers are stripped, the trailing text stays

It also doesn’t affect any of your numbers, as long as you don’t list them in the array.

Thank you spaceshiptrooper!
