  1. #1
    SitePoint Member SPTony's Avatar
    Join Date
    Sep 2005
    Posts
    1
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    A little help...

    Here's what I'm trying to do...

    I'm trying to get something out of a site's source code, in between specific tags.

    Take myspace.com, for example.

    I want to get the text in between <title>asdasf</title>, so it will output
    asdasf

    I know I have to use preg_match, but I'm not sure where to start.

  2. #2
    ✯✯✯ silver trophybronze trophy php_daemon's Avatar
    Join Date
    Mar 2006
    Posts
    5,284
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    PHP Code:
    preg_match("#<title>(.+?)</title>#i",$data,$matches);
    print_r($matches); 
    Saul
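
    For anyone unsure what print_r($matches) will show: a minimal sketch, assuming $data already holds the page source (the sample string below is made up for illustration). $matches[0] is the whole match including the tags, and $matches[1] is the first capture group, i.e. the text between them.

```php
<?php
// Hypothetical sample input standing in for a downloaded page.
$data = '<html><head><title>asdasf</title></head><body>Hello</body></html>';

// preg_match() returns 1 on a match, 0 on no match, and false on error.
if (preg_match('#<title>(.+?)</title>#i', $data, $matches) === 1) {
    $title = $matches[1]; // first capture group: the text between the tags
    echo $title, "\n";
} else {
    $title = null;
    echo "No title found\n";
}
```

    Checking the return value before reading $matches[1] avoids an undefined-index notice when the page has no title.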

  3. #3
    An average geek earl-grey's Avatar
    Join Date
    Mar 2005
    Location
    Ukraine
    Posts
    1,403
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Have a look at Snoopy and libcurl.

  4. #4
    SitePoint Addict AfroNinja's Avatar
    Join Date
    Oct 2006
    Posts
    246
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    preg_match("!<title>([^>]*)</title>!i",$text,$matches);
    echo $matches[1];

    where $text contains the source code of your page. $matches[1] will contain your title.
    The Flash Gaming Network
    Editorial reviews for the latest flash games!
    Afro Ninja Productions
    Original flash games and content from a guy with an afro

  5. #5
    SitePoint Evangelist
    Join Date
    May 2006
    Location
    Austin
    Posts
    401
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    To get the page, you can use file_get_contents.

    PHP Code:
    $text = file_get_contents('http://www.mypage.com');
    Merchant Equipment Store - Merchant Services, POS, Equipment, and supplies.
    Merchant Account Blog | Ecommerce Blog
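
    Putting the download and the regex together could look roughly like this. It is only a sketch: extract_title() and fetch_title() are hypothetical helper names, the nullable return types assume PHP 7.1 or later, and fetching an http:// URL with file_get_contents() assumes allow_url_fopen is enabled.

```php
<?php
// Hypothetical helper: returns the trimmed <title> text, or null if none is found.
function extract_title(string $html): ?string
{
    // i = case-insensitive, s = let "." match newlines inside the title.
    if (preg_match('#<title[^>]*>(.*?)</title>#is', $html, $matches) === 1) {
        return trim($matches[1]);
    }
    return null;
}

// Hypothetical helper: downloads $url and extracts its title.
// file_get_contents() returns false on failure, so that case is checked first.
function fetch_title(string $url): ?string
{
    $html = @file_get_contents($url);
    if ($html === false) {
        return null;
    }
    return extract_title($html);
}
```

    Usage would be something like echo fetch_title('http://www.mypage.com'); with a null result meaning either the download failed or no title was found.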

  6. #6
    SitePoint Addict AfroNinja's Avatar
    Join Date
    Oct 2006
    Posts
    246
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by php_daemon View Post
    PHP Code:
    preg_match("#<title>(.+?)</title>#i",$data,$matches);
    print_r($matches); 
    OK... here's what I don't get. The question mark makes the .+ non-greedy, so it matches as little as possible. .+ means one or more of any character. Since the pattern is non-greedy, shouldn't it stop after the first character it matches? And if not, why does it stop when it hits < ?

  7. #7
    SitePoint Member SPTony's Avatar
    Join Date
    Sep 2005
    Posts
    1
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It kind of works... it shows the page, but it echoes the whole page, not just the title.

  8. #8
    SitePoint Addict AfroNinja's Avatar
    Join Date
    Oct 2006
    Posts
    246
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    What is the URL of the page you are trying to do this with?

  9. #9
    SitePoint Member SPTony's Avatar
    Join Date
    Sep 2005
    Posts
    1
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    myspace.com

  10. #10
    SitePoint Addict AfroNinja's Avatar
    Join Date
    Oct 2006
    Posts
    246
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    PHP Code:
    $text=file_get_contents("http://myspace.com");

    preg_match("!<title>([^>]*)</title>!is",$text,$matches);
    echo $matches[1];
    The 's' modifier needed to be added to recognize newlines. Anyway, in this case the script will echo 'Myspace'.

    www.afro-ninja.com/pregtest.php
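
    A caveat on the 's' modifier: in PCRE it only changes what the dot matches. A negated class like [^>] already matches newlines, so with this pattern a multi-line title should match with or without 's'. The sketch below uses a made-up $text to compare the three cases.

```php
<?php
// Hypothetical page source whose title spans several lines.
$text = "<title>\nMyspace\n</title>";

// Without s, "." does not match a newline, so this finds nothing.
$dotNoS = preg_match('!<title>(.+?)</title>!i', $text, $m1);

// With s, "." matches newlines as well.
$dotWithS = preg_match('!<title>(.+?)</title>!is', $text, $m2);

// [^>] matches any character except ">", newlines included, even without s.
$classNoS = preg_match('!<title>([^>]*)</title>!i', $text, $m3);

var_dump($dotNoS, $dotWithS, $classNoS);
```

    In the two matching cases the captured text still carries the surrounding newlines, so running it through trim() before echoing is probably worthwhile.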

  11. #11
    SitePoint Member SPTony's Avatar
    Join Date
    Sep 2005
    Posts
    1
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It doesn't do that for me; it still outputs the whole page.

  12. #12
    SitePoint Addict AfroNinja's Avatar
    Join Date
    Oct 2006
    Posts
    246
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    hmm. show us your php code?

  13. #13
    SitePoint Member SPTony's Avatar
    Join Date
    Sep 2005
    Posts
    1
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    $text=file_get_contents("http://myspace.com");

    preg_match("!<title>([^>]*)</title>!is",$text,$matches);
    echo $matches[1];

  14. #14
    ✯✯✯ silver trophybronze trophy php_daemon's Avatar
    Join Date
    Mar 2006
    Posts
    5,284
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by AfroNinja View Post
    OK... here's what I don't get. The question mark makes the .+ non-greedy, so it matches as little as possible. .+ means one or more of any character. Since the pattern is non-greedy, shouldn't it stop after the first character it matches? And if not, why does it stop when it hits < ?
    It would match just the first character if nothing followed it in the pattern. Since </title> follows, it keeps taking characters until the rest of the pattern can match, which is at the first </title>.
    Saul
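
    The difference is easiest to see side by side. A small sketch, using a made-up string with two title elements so that the lazy and greedy results diverge:

```php
<?php
// Hypothetical input with two <title> elements to make the difference visible.
$data = '<title>first</title><p>x</p><title>second</title>';

// Lazy: grows one character at a time until the rest of the pattern (</title>) matches.
preg_match('#<title>(.+?)</title>#i', $data, $lazy);

// Greedy: takes as much as it can, then backtracks to the last </title>.
preg_match('#<title>(.+)</title>#i', $data, $greedy);

// Lazy with nothing after it: a single character already satisfies the pattern.
preg_match('#<title>(.+?)#i', $data, $bare);

echo $lazy[1], "\n";   // first
echo $greedy[1], "\n"; // first</title><p>x</p><title>second
echo $bare[1], "\n";   // f
```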

