  1. #1
    SitePoint Addict
    Join Date
    Mar 2002
    Location
    Michigan
    Posts
    260

    Stripping Data - RegEX

    Alright, I know this will be a regex, but I've tried a bunch of stuff and can't get it to work. I'm using this piece of code to open an HTML file. Inside the HTML file I have content surrounded by tags, like <BEGIN_DATA>CONTENT</END_DATA>
    I want to put CONTENT into a variable.

    PHP Code:
    $handle = @fopen("files/$file.html", "r");
    if ($handle) {
        while (!feof($handle)) {
            $buffer = fgets($handle, 4096);

            $pattern = "<BEGIN_DATA>(.*?)<END_DATA>";

            preg_match($pattern, $buffer, $match);

            echo $match[0] . $match[1] . $match[2];

            // echo $buffer;
        }
        fclose($handle);
    }
    It won't echo anything out no matter what... I must be doing something wrong.

  2. #2
    SitePoint Enthusiast Gonik
    Join Date
    May 2005
    Location
    Thessaloniki, Greece
    Posts
    71
    PHP Code:
    $handle = @fopen("files/$file.html", "r");
    if ($handle) {
        while (!feof($handle)) {
            $buffer = fgets($handle, 4096);

            $pattern = "|<BEGIN_DATA>(.*)<END_DATA>|";

            preg_match($pattern, $buffer, $match);

            echo $match[0] . $match[1] . $match[2];

            // echo $buffer;
        }
        fclose($handle);
    }
    As you can see, I added the "|" character at the start and at the end of the pattern. This is called a "delimiter". I also removed the question mark after .* because you don't need it when each line holds only one <BEGIN_DATA> block (without it the match is greedy). Let me know if it works.
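
    To show why the delimiter matters, here is a minimal sketch that uses a hard-coded line instead of the file. The $line sample is made up for illustration, and it assumes the closing tag is written <END_DATA> as in the pattern. Without delimiters, preg_match() takes the leading "<" and the first ">" as the delimiters, cannot make sense of the rest, raises a warning and returns false, which would explain why nothing was echoed.

```php
<?php
// Hypothetical sample line; in the thread this would come from fgets().
$line = "<BEGIN_DATA>CONTENT<END_DATA>\n";

// No delimiters: PCRE reads "<" ... ">" as the delimiters and chokes on
// what follows, so preg_match() returns false (the @ hides the warning).
$bad = @preg_match("<BEGIN_DATA>(.*?)<END_DATA>", $line, $m1);

// With "|" as the delimiter the pattern compiles and the group is captured.
$good = preg_match("|<BEGIN_DATA>(.*?)<END_DATA>|", $line, $m2);

var_dump($bad);     // bool(false)
echo $m2[1], "\n";  // CONTENT
```

    Note that $match[2] in the code above will never be set, because the pattern has only one capturing group; $match[1] is the part you want.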
    Don't Drink & Surf The Net

  3. #3
    SitePoint Evangelist
    Join Date
    Jan 2005
    Posts
    502
    Try adding the s modifier to the end of the regexp. If the CONTENT breaks across lines in the file, the s modifier makes the dot (.) match the end-of-line character as well.
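
    A rough sketch of what the s modifier changes, assuming the whole file is in one string. The $html sample is invented here; file_get_contents() would give you the same kind of string from the real file. Keep in mind that fgets() hands over one line at a time, so a block that spans several lines can only match if you read the file in one go.

```php
<?php
// Hypothetical file contents; file_get_contents("files/$file.html") would
// return the whole file as one string like this.
$html = "<html>\n<BEGIN_DATA>line one\nline two<END_DATA>\n</html>";

// Without "s" the dot stops at newlines, so there is no match.
$without = preg_match("~<BEGIN_DATA>(.*?)<END_DATA>~", $html, $m1);

// With "s" the dot also matches "\n", so the block is captured.
$with = preg_match("~<BEGIN_DATA>(.*?)<END_DATA>~s", $html, $m2);

var_dump($without);  // int(0)
echo $m2[1], "\n";   // prints "line one" and "line two" on separate lines
```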

  4. #4
    SitePoint Addict
    Join Date
    Mar 2002
    Location
    Michigan
    Posts
    260
    Alright, I figured out part of the problem. There is some whitespace between the BEGIN_DATA and END_DATA sections. The delimiter worked; however, instead of stripping out the space by hand, I need a new regex that copes with it. Any ideas?
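
    One possible way to handle the whitespace, sketched under a couple of assumptions: the sample string is invented, the closing tag is taken to be <END_DATA> as in the patterns above, and the whole block is assumed to be in one string. Putting \s* on either side of the capturing group lets the regex absorb the spaces and newlines itself; calling trim() on $match[1] would be a simpler alternative.

```php
<?php
// Hypothetical sample with spaces and newlines inside the tags.
$html = "<BEGIN_DATA>\n   CONTENT here  \n<END_DATA>";

// \s* on both sides of the group swallows the surrounding whitespace;
// the lazy (.*?) keeps the capture as short as possible.
// Single quotes keep PHP from touching the backslashes.
$pattern = '~<BEGIN_DATA>\s*(.*?)\s*<END_DATA>~s';

$content = null;
if (preg_match($pattern, $html, $match)) {
    $content = $match[1];
}

var_dump($content);  // string(12) "CONTENT here"
```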

  5. #5
    Maniacally depressed robot poncho
    Join Date
    Dec 2004
    Location
    Belfast, N.Ireland
    Posts
    452
    On reading this thread, I thought you could maybe use str_replace with an array to strip out the start and end tags. I don't know if this will work or not, but it seems like a simple way around the problem:

    PHP Code:
    $handle = @fopen("files/$file.html", "r");
    if ($handle) {
        while (!feof($handle)) {
            $buffer = fgets($handle, 4096);

            $content = str_replace(array('<BEGIN_DATA>', '</END_DATA>'), array('', ''), $buffer);

            echo $content;
        }
        fclose($handle);
    }
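
    A stripped-down sketch of the same idea on a single hard-coded line. The $buffer value is made up, and the closing tag is assumed to be </END_DATA> as written in the first post. Be aware that str_replace() only deletes the tags: any other text on the line, and every line outside the tags, is kept, so this only isolates CONTENT if the file holds nothing else.

```php
<?php
// Hypothetical line; assumes the closing tag is </END_DATA>.
$buffer = "<BEGIN_DATA> CONTENT </END_DATA>\n";

// Remove both tags (one empty replacement serves every search string),
// then trim the leftover spaces and the trailing newline.
$content = trim(str_replace(array('<BEGIN_DATA>', '</END_DATA>'), '', $buffer));

var_dump($content);  // string(7) "CONTENT"
```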
    Cheers;
    Poncho
    Perfecting the art of breaking stuff.
    Check 'em: CakePHP | TextMate

  6. #6
    SitePoint Wizard stereofrog
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Quote Originally Posted by PlayOn
    Alright, I know this will be a regex, but I've tried a bunch of stuff and can't get it to work. I'm using this piece of code to open an HTML file. Inside the HTML file I have content surrounded by tags, like <BEGIN_DATA>CONTENT</END_DATA>
    I want to put CONTENT into a variable.

    PHP Code:
    $handle = @fopen("files/$file.html", "r");
    if ($handle) {
        while (!feof($handle)) {
            $buffer = fgets($handle, 4096);

            $pattern = "<BEGIN_DATA>(.*?)<END_DATA>";

            preg_match($pattern, $buffer, $match);

            echo $match[0] . $match[1] . $match[2];

            // echo $buffer;
        }
        fclose($handle);
    }
    It won't echo anything out no matter what... I must be doing something wrong.

    Add delimiters and the 's' switch to your pattern:

    $pattern = "~<BEGIN_DATA>(.*?)<END_DATA>~s";

