  1. #1
    SitePoint Wizard
    Join Date
    Jan 2005
    Location
    blahblahblah
    Posts
    1,447
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    $foo = explode("\n",$foo); => strange result

    Hi,

    I have an HTML file that I would like to split into an array, one element per line.

    I do a

    PHP Code:
    $foo = explode("\n", $file_content);
    I would expect the $foo array to have as many elements as there were lines in the html file... But no.

    I only get one element, containing the whole file, even though my html source definitely has a decent amount of lines.

    Where could the problem come from?

    Regards,

    -jj.

  2. #2
    SitePoint Member
    Join Date
    Jun 2007
    Location
    Gold Coast, Australia
    Posts
    11
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I am assuming you are reading the HTML file from the filesystem? In which case rather than trying to detect a newline character, which may vary depending on the host OS, would it not be easier to read the file line by line using fgets?

    Have a look at php.net/fgets
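
    A minimal sketch of the fgets approach, assuming the HTML really does live in a file. The helper name read_lines() and the temp file are made up for illustration. Note that fgets() splits on "\n", so a file that uses only "\r" as its line ending would most likely still come back as a single line.

```php
<?php
// Hypothetical helper: read a file line by line with fgets().
function read_lines(string $path): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $lines = [];
    while (($line = fgets($handle)) !== false) {
        // fgets() keeps the trailing line ending, so strip "\n" or "\r\n".
        $lines[] = rtrim($line, "\r\n");
    }
    fclose($handle);

    return $lines;
}

// Usage with a throwaway temp file that mixes "\n" and "\r\n".
$path = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($path, "first\nsecond\r\nthird");
print_r(read_lines($path));
unlink($path);
```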

  3. #3
    Twitter: @AnthonySterling silver trophy AnthonySterling's Avatar
    Join Date
    Apr 2008
    Location
    North-East, UK.
    Posts
    6,111
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)
    As Vinno pointed out, some operating systems use \r\n and others use just \n. You could either use file(), which should autodetect the line ending, or preg_split() with an appropriate pattern.

    PHP Code:
    <?php
    $foo = preg_split('~(\\r?\\n)~', $foo);
    ?>
    PHP Code:
    <?php
    $foo = file('path/to/file/');
    ?>
    @AnthonySterling: I'm a PHP developer, a consultant for oopnorth.com and the organiser of @phpne, a PHP User Group covering the North-East of England.

  4. #4
    SitePoint Wizard
    Join Date
    Jan 2005
    Location
    blahblahblah
    Posts
    1,447
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks for your replies

    I read the html from a variable. I would have used file(), which would have been a lot easier...

    @SBUK: the preg_split() produces the same result as the explode() approach. I also tried with explode("\r\n",$foo)

    Here is a typical sample of the HTML held in the variable:

    Code:
    <div id="container">
    
      <div id="header">
        
    <div id="main-title">
      <h1>
      Hello</h1>
    
    </div>
    
      </div>
    
        
        
      <div id="content">
        
    <div id="login">
    .....
    Regards,

    -jj.

  5. #5
    From Italy with love silver trophybronze trophy
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,510
    Mentioned
    163 Post(s)
    Tagged
    4 Thread(s)
    How do you know the content of the variable? Did you do an echo? Try looking at the html code in your browser. It seems like your html does not contain any eol character.

  6. #6
    SitePoint Wizard
    Join Date
    Jan 2005
    Location
    blahblahblah
    Posts
    1,447
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    What I posted above is the source displayed by firefox as the result of echo($html), which is why I get so puzzled...

  7. #7
    SitePoint Wizard
    Join Date
    Jan 2005
    Location
    blahblahblah
    Posts
    1,447
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I tried a little trick:
    PHP Code:
    $fp = fopen($location.'html.php', 'w');
    fwrite($fp, $html);
    fclose($fp);

    $file = file($location.'html.php');

    print_r($file);
    And... same result. Yet if I open "html.php", I see the HTML displayed as I posted it above. I also tried saving it as html.txt and opening that file: it shows the little squares that represent line breaks...

    I'm at a loss for an explanation...

  8. #8
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2008
    Posts
    5,757
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    echo base64_encode($html);

    post the result here.

  9. #9
    From Italy with love silver trophybronze trophy
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,510
    Mentioned
    163 Post(s)
    Tagged
    4 Thread(s)
    Quote Originally Posted by jjshell View Post
    What I posted above is the source displayed by firefox as the result of echo($html), which is why I get so puzzled...
    Did you right-click in the Firefox window and choose 'show page source' or something like that (I'm translating from the Italian version)?

  10. #10
    SitePoint Wizard
    Join Date
    Jan 2005
    Location
    blahblahblah
    Posts
    1,447
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi crmalibu

    Encoded result:

    DTxkaXYgaWQ9ImNvbnRhaW5lciI+DQ0gIDxkaXYgaWQ9ImhlYWRlciI+DSAgICANPGRpdiBpZD0ibWFpbi10aXRsZSI+DSAgPGgxPg0gIEhlbGxvPC9oMT4NDTwvZGl2Pg0NICA8L2Rpdj4NICAgIA08L2Rpdj4=
    Quote Originally Posted by guido2004 View Post
    Did you right-click in the Firefox window and choose 'show page source' or something like that (I'm translating from the Italian version)?

    Yep

  11. #11
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2008
    Posts
    5,757
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Your string uses a lone carriage return as the newline indicator, i.e. "\r".
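
    For anyone wanting to check this themselves, here is a small sketch that decodes the string from post #10 and counts each kind of line ending. The variable names are illustrative only; nothing here beyond the encoded string comes from the thread.

```php
<?php
// Base64 string copied verbatim from post #10.
$encoded = 'DTxkaXYgaWQ9ImNvbnRhaW5lciI+DQ0gIDxkaXYgaWQ9ImhlYWRlciI+DSAgICANPGRpdiBpZD0ibWFpbi10aXRsZSI+DSAgPGgxPg0gIEhlbGxvPC9oMT4NDTwvZGl2Pg0NICA8L2Rpdj4NICAgIA08L2Rpdj4=';
$html = base64_decode($encoded);

// Count "\r\n" pairs first, then subtract them so the lone "\r"
// and lone "\n" totals are not inflated by the pairs.
$crlf = substr_count($html, "\r\n");
$cr   = substr_count($html, "\r") - $crlf;
$lf   = substr_count($html, "\n") - $crlf;

printf("CRLF: %d, lone CR: %d, lone LF: %d\n", $crlf, $cr, $lf);

// Make the control characters visible as literal \r and \n.
echo str_replace(["\r", "\n"], ['\r', '\n'], $html), "\n";
```

    Encoding with base64 (or bin2hex) is useful here because it preserves the exact bytes, whereas a browser's source view renders "\r" as a line break and hides which character is really in use.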

  12. #12
    SitePoint Wizard
    Join Date
    Jan 2005
    Location
    blahblahblah
    Posts
    1,447
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    And you solved the problem, yet again

    rep++

    Now, could that be a problem portability-wise? Should it be \n, or just \r, or both?

    And... why won't the regexp above do the job?

    Regards,

  13. #13
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2008
    Posts
    5,757
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    That regex says:
    "split on either \r\n or just \n"
    It never allows a split on just \r

    This will split on any of the three:
    Code:
    $lines = preg_split('#\r\n|\n|\r#', $html);
    // or
    $lines = preg_split('#\r?\n|\r#', $html);
    Having just \r is pretty rare. I've never knowingly encountered it personally, but I think it was the norm on old-school Macintosh systems. Then again, I think I've used a Mac maybe a few times in my life :/
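
    On the portability question, a sketch of two ways to handle all three line endings in one place. The function names split_lines() and split_lines_normalised() are hypothetical; both are meant to return the same array for the same input.

```php
<?php
// Hypothetical helper: split on "\r\n", "\n" or a lone "\r".
// "\r\n" is listed first so a Windows line ending counts as one break.
function split_lines(string $text): array
{
    return preg_split('/\r\n|\n|\r/', $text);
}

// Alternative without a regex: normalise every ending to "\n", then explode.
// str_replace() handles the search array in order, so "\r\n" is replaced
// before any remaining lone "\r".
function split_lines_normalised(string $text): array
{
    $text = str_replace(["\r\n", "\r"], "\n", $text);
    return explode("\n", $text);
}

$mixed = "a\r\nb\nc\rd";
print_r(split_lines($mixed));
print_r(split_lines_normalised($mixed));
```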

  14. #14
    SitePoint Wizard
    Join Date
    Jan 2005
    Location
    blahblahblah
    Posts
    1,447
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks again

    What would be your approach to align all the lines of the html source to the left, and remove empty lines? It's the first step I'm taking to working on a very simple class aiming to print html in a pleasant manner.

    I would like to avoid any resource-hungry choices I might make without realising it.
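
    The thread ends without an answer to this, so the following is only one possible sketch, not something a poster suggested. It assumes the whole document fits comfortably in memory, and the function name flatten_html() is made up. It would also strip indentation inside pre or textarea elements, which may not be what you want.

```php
<?php
// Hypothetical helper: left-align every line and drop the empty ones.
function flatten_html(string $html): string
{
    // Split on any of the three line endings discussed above.
    $lines = preg_split('/\r\n|\n|\r/', $html);

    // Remove leading and trailing whitespace from each line.
    $lines = array_map('trim', $lines);

    // Keep only non-empty lines. An explicit comparison is used so that
    // a line containing just "0" is not discarded.
    $lines = array_filter($lines, static fn(string $line): bool => $line !== '');

    return implode("\n", $lines);
}

echo flatten_html("\r<div>\r\r  <h1>\r  Hello</h1>\r    \r</div>"), "\n";
```

    This makes a single pass per step over an array of lines, which should be cheap for typical page sizes; for very large input, reading and trimming line by line would use less memory.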

