  1. #1
    SitePoint Evangelist
    Join Date
    Jun 2001
    Location
    Houston, Texas, USA
    Posts
    559

    curl and parsing suggestions needed

    I need some help trying to figure out how to do this with PHP.

    I want to create a PHP program that downloads a web page and parses out all of the colors used within that web page.

    I am able to download a page using cURL, but not the CSS files it links to. I guess I need to fetch those separately?

    Is there a parsing function I should use to extract the colors?

    Any ideas would be appreciated.

  2. #2
    boccio
    Join Date
    Oct 2003
    Location
    belgrade
    Posts
    354
    Interesting problem.

    As for the CSS, I suppose the easiest way is to look for each <link rel="stylesheet" type="text/css" (...) /> line and extract the CSS filename with preg_match(). Then use cURL in the next iteration to fetch the CSS file...
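
    A minimal sketch of that approach, assuming PHP 7 or later with the cURL extension loaded. The function names extract_stylesheet_urls() and fetch_url() are made up for illustration, and relative hrefs are returned as-is, so they would still need to be resolved against the page URL before fetching:

```php
<?php
// Return the href of every <link> tag whose rel attribute mentions "stylesheet".
// Attribute order and quoting style vary between pages, so each tag is
// inspected on its own instead of with one large pattern.
function extract_stylesheet_urls(string $html): array
{
    $urls = [];
    preg_match_all('/<link\b[^>]*>/i', $html, $tags);
    foreach ($tags[0] as $tag) {
        // Skip <link> tags that are not stylesheets (icons, feeds, ...).
        if (!preg_match('/\brel\s*=\s*["\']?[^"\'>]*\bstylesheet\b/i', $tag)) {
            continue;
        }
        // (?| ... ) is a branch-reset group: whichever quoting style
        // matches, the value always ends up in $m[1].
        if (preg_match('/\bhref\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s>]+))/i', $tag, $m)
            && $m[1] !== '') {
            $urls[] = $m[1];
        }
    }
    return $urls;
}

// Download a URL with cURL and return the body as a string (false on failure).
function fetch_url(string $url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up after 15 seconds
    return curl_exec($ch);
}
```

    Typical use would be to call fetch_url() on the page, pass the result to extract_stylesheet_urls(), and then call fetch_url() again for each stylesheet URL.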

    Now, colors can come in many flavors. Are you looking for text colors, background colors, or literally everything? Bottom line, you have to use a series of preg_match() calls to cover all the HTML / CSS variations and extract the colors...
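
    As a rough illustration of the "series of patterns" idea, here is a sketch that covers only two flavors: hex colors with a leading # (3 or 6 digits) and rgb()/rgba() values. The function name extract_css_colors() is hypothetical. Named colors such as "red", hsl() values, and 4 or 8 digit hex are not handled, and a selector like #facade would be picked up as a false positive:

```php
<?php
// Collect color values from a chunk of CSS or HTML using several patterns.
// Results are lowercased, stripped of whitespace, and de-duplicated.
function extract_css_colors(string $text): array
{
    $patterns = [
        // #rrggbb or #rgb, case-insensitive; \b stops partial matches
        '/#(?:[0-9a-f]{6}|[0-9a-f]{3})\b/i',
        // rgb(r, g, b) or rgba(r, g, b, a), with optional percent signs
        '/rgba?\(\s*\d{1,3}%?\s*,\s*\d{1,3}%?\s*,\s*\d{1,3}%?\s*(?:,\s*[\d.]+\s*)?\)/i',
    ];

    $found = [];
    foreach ($patterns as $pattern) {
        if (preg_match_all($pattern, $text, $m)) {
            foreach ($m[0] as $color) {
                // Normalize so "#FFF" and "#fff" count as the same color.
                $found[] = strtolower(preg_replace('/\s+/', '', $color));
            }
        }
    }
    return array_values(array_unique($found));
}
```

    More patterns can be appended to the $patterns array as new variations turn up.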

  3. #3
    SitePoint Evangelist
    Join Date
    Jun 2001
    Location
    Houston, Texas, USA
    Posts
    559
    Yes. I want to grab all colors. (except image colors, of course)

  4. #4
    SitePoint Evangelist
    Join Date
    Jun 2001
    Location
    Houston, Texas, USA
    Posts
    559
    Where's the best tutorial on preg_match()? Thanks!

  5. #5
    SitePoint Evangelist
    Join Date
    Jun 2001
    Location
    Houston, Texas, USA
    Posts
    559
    If anyone's interested, I wound up using:

    preg_match_all('/#[0-9a-f]{6}/', $buffer, $matches);

    which works ok if the hex has a pound sign in front of it.

    Now I'm trying to find the correct regex code to use to find something like bgcolor='336699' or bgcolor=336699 or bgcolor="336699".

    Any regex experts out there???
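
    One possible pattern for the bgcolor question, offered as a sketch and not a definitive answer. It captures an optional opening quote and requires the same quote (or none) to close the value, so it should handle bgcolor='336699', bgcolor=336699 and bgcolor="336699", with or without a leading #. The function name extract_bgcolor_values() is made up for illustration, and named colors such as bgcolor="red" are not matched:

```php
<?php
// Find hex colors in bgcolor attributes, whatever the quoting style.
// Returns unique values, lowercased and prefixed with "#".
function extract_bgcolor_values(string $html): array
{
    // (["\']?)  optional opening quote, remembered as group 1
    // #?        optional pound sign
    // ([0-9a-f]{6}|[0-9a-f]{3})  the hex digits, group 2
    // \b\1      end of the hex run, then the same quote (or nothing)
    $pattern = '/\bbgcolor\s*=\s*(["\']?)#?([0-9a-f]{6}|[0-9a-f]{3})\b\1/i';

    $colors = [];
    if (preg_match_all($pattern, $html, $m)) {
        foreach ($m[2] as $hex) {
            $colors[] = '#' . strtolower($hex);
        }
    }
    return array_values(array_unique($colors));
}
```

    The i modifier also makes the match case-insensitive, so uppercase values like FFCC00 are found too.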

