  1. #1
    SitePoint Evangelist
    Join Date
    Oct 2000
    Posts
    430
    Hi,

    I was wondering if anyone has an idea of how "Anaconda" rewrites the URLs when it "fetches" remote headlines.

    As an example of what I mean have a go with this:
    http://www.anaconda.net/cgi-local/ap...clip_cust.html

    Enter a URL into the box on the right hand side and it fetches the page.

    When you look at the source of the fetched page, you see that all the URLs are correctly formatted, even though the original page may contain several different types of links:

    i.e. root-relative, absolute, relative, etc.

    Does anyone have any idea how they do this, and how to accomplish it in PHP?

    Cheers for the advice in advance

  2. #2
    Serial Publisher aspen
    Join Date
    Aug 1999
    Location
    East Lansing, MI USA
    Posts
    12,937
    Regular Expressions.

    They're basically used to do flexible find and replace functions. You should be able to find a good tutorial on regular expressions somewhere online - I'm not sure if Kevin covers them in his tutorial or not.
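
    As a rough illustration of the find-and-replace idea, here is a minimal sketch in Python (the same pattern-plus-callback approach should carry over to PHP's preg_replace_callback). The names HREF_RE and absolutize_links and the example.com URL are made up for this example. It only handles quoted href attributes, and it leans on the standard library's urljoin to do the actual URL resolution.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or href='...' and captures the prefix, the quote
# character, and the URL between the quotes.
HREF_RE = re.compile(r'''(href\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)


def absolutize_links(html, base_url):
    """Rewrite every quoted href in html to an absolute URL (hypothetical helper)."""
    def fix(match):
        prefix, quote, url = match.groups()
        # urljoin resolves relative, root-relative and absolute URLs alike.
        return prefix + quote + urljoin(base_url, url) + quote

    return HREF_RE.sub(fix, html)


if __name__ == "__main__":
    page = '<a href="/books.htm">Books</a> <a href="homes.htm">Homes</a>'
    print(absolutize_links(page, "http://www.example.com/news/index.htm"))
```

    A regex like this is fine for a quick fetch-and-rewrite script, but it will miss unquoted attributes and other tags such as img src, so treat it as a starting point only.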
    Chris Beasley - I publish content and ecommerce sites.
    Featured Article: Free Comprehensive SEO Guide
    My Guide to Building a Successful Website
    My Blog|My Webmaster Forums

  3. #3
    SitePoint Enthusiast nguip
    Join Date
    Apr 2001
    Location
    Malaysia
    Posts
    95
    Yes... Kevin covered a little about regular expressions in his famous PHP tutorial.

    You can also check this one:

    http://phpbuilder.com/columns/dario19990616.php3
    Ngu I.P.
    Web Developer

  4. #4
    SitePoint Evangelist
    Join Date
    Oct 2000
    Posts
    430
    Yeah, I believe regex plays a part, but I think something else may be needed as well (maybe something similar to basename(); I'm not sure).

    Say you had a page http://www.thesite.com/sport/football.htm

    Within the page there were links as follows:

    /books.htm
    ../../hotels.htm
    ../contact.htm
    http://www.anothersite.com
    homes.htm

    What sort of regex could you use to correctly format these links?

  5. #5
    Serial Publisher aspen
    Join Date
    Aug 1999
    Location
    East Lansing, MI USA
    Posts
    12,937
    Well, you'd need to use more than one statement. You'd make one statement to handle hard links (http....)

    one statement to handle relative links (home.htm)

    one statement to handle the upward links (../home.htm)

    But regular expressions are flexible enough that you can easily do this.

    You'd probably also need to use some arrays to work your way up and down the directory structure.
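
    To make the "one statement per link type, plus an array for the directories" idea concrete, here is a minimal sketch in Python (PHP's parse_url and explode could play the same roles). The function name resolve is made up for this example. It assumes plain http/https pages and ignores things like query strings, mailto: links and trailing slashes, so it is an illustration rather than a complete resolver.

```python
from urllib.parse import urlsplit


def resolve(base_url, link):
    """Resolve link against base_url by hand, one case per link type (hypothetical helper)."""
    base = urlsplit(base_url)
    root = base.scheme + "://" + base.netloc

    # Case 1: hard (absolute) links are left untouched.
    if link.startswith(("http://", "https://")):
        return link

    # Case 2: root-relative links attach directly to the site root.
    if link.startswith("/"):
        return root + link

    # Cases 3 and 4: relative and upward links. Keep the page's directories
    # in a list, dropping the file name (the last path segment).
    dirs = [part for part in base.path.split("/")[:-1] if part]
    for part in link.split("/"):
        if part == "..":
            # Step up one directory, but never above the site root.
            if dirs:
                dirs.pop()
        elif part and part != ".":
            dirs.append(part)
    return root + "/" + "/".join(dirs)


if __name__ == "__main__":
    page = "http://www.thesite.com/sport/football.htm"
    for link in ["/books.htm", "../../hotels.htm", "../contact.htm",
                 "http://www.anothersite.com", "homes.htm"]:
        print(link, "->", resolve(page, link))
```

    With the football.htm page from the earlier post, homes.htm comes out as http://www.thesite.com/sport/homes.htm and ../contact.htm as http://www.thesite.com/contact.htm. In practice a ready-made resolver (urljoin in Python, for instance) is usually safer than hand-rolled cases.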

