  1. #26
    SitePoint Guru
    Join Date
    Aug 2004
    Location
    Canada
    Posts
    730
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by Dan Grossman:
    file_get_contents() returns the contents of a file or URL as a string. If these book pages are on a drive somewhere, then they're files; if they're on the web somewhere, then they get downloaded before the contents can be returned.

    I'm still telling you that you don't need, or want, to use any of those DOM functions. This isn't JavaScript, they aren't easy nor do they save you time here.
    Ok. But I still don't know what to look for. How am I going to be able to extract div tags from another page?

    Do $_REQUEST, $_GET, or $_POST do that?
    Compare bible texts (and other tools):
    TheWheelofGod

  2. #27
    Dan Grossman
    Join Date
    Aug 2000
    Location
    Philadelphia, PA
    Posts
    20,578
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    The HTML of the page is in the result from the file_get_contents() call. You would use a regular expression to get the text from between div tags in that string.
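
To make the suggestion above concrete, here is a minimal sketch of pulling the text from between div tags with a regular expression. The `$html` sample is invented for illustration and stands in for whatever file_get_contents() returns; the pattern assumes the divs are not nested inside one another.

```php
<?php
// Hypothetical sample standing in for the string returned by file_get_contents().
$html = '<DIV CLASS="a">First block</DIV>' . "\n" . '<DIV CLASS="b">Second block</DIV>';

// Modifiers: i ignores case, s lets . match newlines, and U makes the
// quantifiers lazy so each match stops at the nearest closing tag.
preg_match_all('|<div[^>]*>(.*)</div>|isU', $html, $matches);

// $matches[1] holds the text captured from between each pair of tags.
$texts = $matches[1];

echo implode("\n", $texts), "\n";
// prints:
// First block
// Second block
```

A pattern like this will cut a match short if one div contains another, so it is only a reasonable fit for flat markup such as the pages discussed here.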

  3. #28
    SitePoint Guru
    Join Date
    Aug 2004
    Location
    Canada
    Posts
    730
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by Dan Grossman:
    The HTML of the page is in the result from the file_get_contents() call. You would use a regular expression to get the text from between div tags in that string.
    What do you mean by regular expression?

  4. #29
    Dan Grossman
    Join Date
    Aug 2000
    Location
    Philadelphia, PA
    Posts
    20,578
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    http://www.google.com/search?q=defin...ary_definition

    The functions you want are preg_match and preg_match_all.
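
For anyone unsure how the two functions differ, the following is a rough sketch rather than anything specific to the book pages. The `$text` string is made up for the example.

```php
<?php
// Hypothetical input string, invented for illustration.
$text = 'TOP:456;LEFT:71 and TOP:480;LEFT:90';

// preg_match stops after the first match and returns 1 (found) or 0 (not found).
// $first[0] is the whole match, $first[1] is the first capture group.
$found = preg_match('/TOP:(\d+)/', $text, $first);

// preg_match_all finds every match and returns how many there were.
// With the default PREG_PATTERN_ORDER, $all[1] lists each value captured by group 1.
$count = preg_match_all('/TOP:(\d+)/', $text, $all);

echo $first[1], "\n";             // prints 456
echo implode(',', $all[1]), "\n"; // prints 456,480
```

In short, preg_match answers "is it there, and what is the first one?", while preg_match_all collects every occurrence in the string.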

  5. #30
    SitePoint Guru
    Join Date
    Aug 2004
    Location
    Canada
    Posts
    730
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by Dan Grossman:
    http://www.google.com/search?q=defin...ary_definition

    The functions you want are preg_match and preg_match_all.
    The first part where the div tags are removed worked with the preg_match_all. But if I want to extract the style attributes:
    Code:
    <DIV STYLE="POSITION:ABSOLUTE;TOP:456;LEFT:71" CLASS="APFont00000">...</DIV>
    How do I do that?

  6. #31
    SitePoint Guru
    Join Date
    Aug 2004
    Location
    Canada
    Posts
    730
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by gilgalbiblewheel:
    The first part where the div tags are removed worked with the preg_match_all. But if I want to extract the style attributes:
    Code:
    <DIV STYLE="POSITION:ABSOLUTE;TOP:456;LEFT:71" CLASS="APFont00000">...</DIV>
    How do I do that?
    Ok I figured it out. Thanks Dan because it was all in what you said: preg_match_all

    All I had to do is play around with '|TOP:(.*);|U', '|LEFT:(.*)"|U', '|CLASS="(.*)">|U' and add $topout, $leftout, $classout:
    PHP Code:
    preg_match_all('|TOP:(.*);|U', $contents_of_page, $topout, PREG_SET_ORDER);
    preg_match_all('|LEFT:(.*)"|U', $contents_of_page, $leftout, PREG_SET_ORDER);
    preg_match_all('|CLASS="(.*)">|U', $contents_of_page, $classout, PREG_SET_ORDER);
    PHP Code:
    <?php
    $num_pages = 292;
    for ($i = 1; $i <= $num_pages; $i++) {
        // Zero-pad the page number to five digits, e.g. 1 becomes 00001.
        $num = $i;
        while (strlen($num) < 5) {
            $num = '0' . $num;
        }
        $contents_of_page = file_get_contents('.../new/page' . $num . '.htm');

        // The U modifier makes (.*) lazy, so each capture stops at the first delimiter.
        preg_match_all('|TOP:(.*);|U', $contents_of_page, $topout, PREG_SET_ORDER);
        preg_match_all('|LEFT:(.*)"|U', $contents_of_page, $leftout, PREG_SET_ORDER);
        preg_match_all('|CLASS="(.*)">|U', $contents_of_page, $classout, PREG_SET_ORDER);

        $totaldivs = 2;
        // PREG_SET_ORDER results are indexed from 0, so $j = 1 starts at the second match.
        for ($j = 1; $j <= $totaldivs; $j++) {
            $div[$j] = $topout[$j][1] . " " . $leftout[$j][1] . " " . $classout[$j][1];
            //if(mysql_real_escape_string($div[$j])!=''){
                //$sql = "INSERT INTO boti_pages (page_num, content) VALUES (" . $i . ", '" . mysql_real_escape_string($div[$j]) . "')";
            //mysql_query($sql);
            echo $div[$j] . "<br />\n";
        }
    }
    ?>
    The output gives the TOP and LEFT values from the STYLE attribute, plus the CLASS attribute, of my DIV tags.
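
One possible variation, offered as a sketch and not as the method used in this thread: a single pattern can capture TOP, LEFT and CLASS from the same tag, so the three values cannot drift out of step if one DIV lacks an attribute. It assumes every DIV lists its attributes in the same order as the sample; the second DIV's values below are invented.

```php
<?php
// Sample markup modelled on the DIV from the thread; the second DIV is invented.
$contents_of_page =
    '<DIV STYLE="POSITION:ABSOLUTE;TOP:456;LEFT:71" CLASS="APFont00000">...</DIV>' . "\n" .
    '<DIV STYLE="POSITION:ABSOLUTE;TOP:480;LEFT:90" CLASS="APFont00001">...</DIV>';

// One pattern, three capture groups: TOP, LEFT and CLASS from the same tag.
$pattern = '|TOP:(\d+);LEFT:(\d+)"\s+CLASS="([^"]*)"|';
preg_match_all($pattern, $contents_of_page, $rows, PREG_SET_ORDER);

$lines = [];
foreach ($rows as $row) {
    // $row[1] = TOP, $row[2] = LEFT, $row[3] = CLASS
    $lines[] = $row[1] . ' ' . $row[2] . ' ' . $row[3];
}

echo implode("<br />\n", $lines), "\n";
```

Iterating with foreach also avoids hard-coding the number of DIVs per page.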

