Preg Match Issue - Pulling My Hair Out

I’ve got the following code that I’m trying to make work, and I’m not seeing the problem. Hoping a fresh set of eyes can point me in the right direction:


<?php
$handle = fopen("mls.txt", "rb");
$contents = '';
while (!feof($handle)) {    
    $contents .= fread($handle, 8192);
}
fclose($handle);
preg_match_all("/[ID-.](.*)[:]/", $content, $matches);

foreach ($matches[1] as $url) {
     echo $url . "<br />";
}
?>

Among a lot of other text, the text file contains lines that look like this:

ID-.29203029:

I’m trying to extract ONLY those numbers from the file and echo them out line by line. Please help!

The way I read
preg_match_all("/[ID-.](.*)[:]/", $content, $matches);
is:
Match a single character that is an I, a D, a minus sign, or a period
Followed by anything, nothing, or everything
Ending at the last colon character

That looks like it should find what you’re after.
But it also looks like it might find what you’re not after.

Which kind of problem are you having?
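
Two quick checks may narrow it down. First, the posted code reads the file into $contents but passes $content to preg_match_all(), so as written the search runs on an undefined variable. Second, it is worth confirming that the pattern compiles at all. A minimal sketch, assuming a made-up sample string in place of mls.txt:

```php
<?php
// Hypothetical sample standing in for the contents of mls.txt.
$contents = "some text ID-.29203029: more text\n";

// The pattern exactly as posted. '@' hides the warning so the
// return value can be inspected directly.
$result = @preg_match_all("/[ID-.](.*)[:]/", $contents, $matches);

if ($result === false) {
    // false means the regex itself was rejected, not that nothing matched.
    echo "Pattern did not compile\n";
} else {
    echo "Pattern matched $result time(s)\n";
}
```

If this prints "Pattern did not compile", the regex is the problem rather than the file reading.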

Try this, it will get you over the hump until you get the preg match to work:


<?php
// Read the file line by line so an ID is never split across two read chunks.
$handle = fopen("mls.txt", "rb");

while (($line = fgets($handle)) !== false) {
    // Split on any run of whitespace, not just single spaces.
    $words = preg_split('/\s+/', trim($line));

    foreach ($words as $word) {
        // $xx = 'ID-.29203029:';
        if ('ID-.' === substr($word, 0, 4) && strpos($word, ':') !== false) {
            echo $word . "<br />";
        }
    }
}
fclose($handle);

// debug: show the raw file for comparison
echo '<pre>';
highlight_file('mls.txt');
echo '</pre>';
?>

 

"/[ID-.](.*)[:]/"

Square brackets are character classes. One thing about them: a class stands for one character in the target text, so [ID-.] looks like it says "match an I or a D or a - or a . in one position in the target text" (although that’s not true because of the hyphen). A hyphen that is not at the start or end of the list makes a range, so [0-9] means any digit character. Here D-. would be a range from D to ., and since D comes after . in ASCII the range is out of order, so PCRE refuses to compile the pattern and preg_match_all() returns false with a warning. Anyway, you don’t want to use a character class there. You just want the characters literally, so drop the square brackets; the . will then need escaping:

"/ID-\.

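To make the two points above concrete (a character class stands for one character, and an unescaped dot matches anything), here is a small sketch. The test strings are made up for illustration and are not from the original file:

```php
<?php
// A character class matches exactly one character from its list.
preg_match_all('/[ID]/', 'ID-.', $class);      // "I" and "D" as two separate matches

// A hyphen placed last in the class is a literal hyphen, not a range.
preg_match_all('/[ID.-]/', 'ID-.', $literal);  // all four characters match

// An unescaped dot matches any character; an escaped dot matches only ".".
$loose  = preg_match('/ID-./', 'ID-x');   // matches, because . accepts "x"
$strict = preg_match('/ID-\./', 'ID-x');  // no match, "x" is not a literal dot

echo count($class[0]), " ", count($literal[0]), " ", $loose, " ", $strict, "\n";
```
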
If they’re only going to be digits after ID-. then you might as well make use of that:

"/ID-\.([0-9]*)

And although I don’t think the square brackets around a single character (the colon) stop it working, they’re not needed; a colon is literal as is. So give this a go and see if it works:

"/ID-\.([0-9]*):/"

(In fact I don’t think the colon is needed at all so you could probably remove that.)
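
For anyone unsure how the parentheses feed into $matches, here is a short sketch of that pattern in use. The sample text is invented for illustration; only the ID-.29203029: shape comes from the original post:

```php
<?php
// Hypothetical sample text standing in for mls.txt.
$text = "Listing A ID-.29203029: 3 bed\nListing B ID-.555: 2 bed\n";

preg_match_all('/ID-\.([0-9]*):/', $text, $matches);

// $matches[0] holds the full matched strings,
// $matches[1] holds what the first pair of parentheses captured.
foreach ($matches[1] as $id) {
    echo $id . "<br />";
}
```

Looping over $matches[1] rather than $matches[0] is what gives the digits alone, without the ID-. prefix or the colon.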

Thanks all. Between the three of you, I managed to get it to do what I needed! You all are incredible!

Is it possible to share your solution for the benefit of other posters who stumble on this thread?

Good. I just thought of one little change: change the * to a +, so:

"/ID-\.([0-9]+)/"

is what you want. Before, with the *, the digits were optional; now, with the +, they’re not (+ means “1 or more”, so one or more digits in this case).
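
Since the original poster's final code was not shared, here is one possible sketch of the whole thing, which also shows the difference between * and +. The helper name extractIds() and the sample text are assumptions for illustration, not taken from the thread:

```php
<?php
// extractIds() is a hypothetical helper name, not from the thread.
function extractIds(string $text): array
{
    // + requires at least one digit after "ID-."
    preg_match_all('/ID-\.([0-9]+)/', $text, $matches);
    return $matches[1];
}

// Hypothetical sample; the middle line has "ID-." with no digits.
$sample = "foo ID-.29203029: bar\nID-.: empty\nbaz ID-.42:\n";

// With *, the digit-less "ID-." still matches and yields an empty string.
preg_match_all('/ID-\.([0-9]*)/', $sample, $star);
var_dump($star[1]);

// With +, that entry is skipped.
var_dump(extractIds($sample));
```

To run it against the real file, something like extractIds(file_get_contents('mls.txt')) should work, assuming the file is small enough to read in one go.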