
Hello,
I'm trying to grab the Yahoo homepage and store each printed word (i.e. only the words the browser actually displays) as an element in an array.
Grabbing and parsing the page is no problem using:
// Host name only, no trailing slash; $errnr and $errstr are filled in by reference.
$fp = fsockopen("yahoo.com", 80, $errnr, $errstr, 5);
fputs($fp, "GET $whatever HTTP/1.0\r\n\r\n");
etc.
This produces a whole host of words that can then be split into an array (using " " as a delimiter).
The problem is that the elements in the array still contain lots of unwanted data (I only want the browser output).
So far I'm having to use strip_tags and str_replace a great deal, and I'm still not getting a clean result (i.e. unwanted data remains).
Do you know of an easy and effective way to get only the words the browser outputs, please?
Thanks,
Jason
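
One possible approach to the question above, as a minimal sketch rather than a definitive answer: remove the elements whose contents a browser never shows as text (script, style, comments) before stripping the remaining tags, then decode entities and split on whitespace. The function name visible_words is hypothetical, and the sketch assumes the HTTP headers have already been separated from the response body and that the markup is valid UTF-8. Regular expressions will not cope with every malformed page, so a real HTML parser may be the safer choice for messy markup.

```php
<?php
// Hypothetical helper: returns the words a browser would render for $html.
// Assumes $html is the response body only (HTTP headers already removed).
function visible_words(string $html): array
{
    // Drop elements whose contents are never rendered as text.
    $html = preg_replace('#<(script|style|noscript)\b[^>]*>.*?</\1>#is', ' ', $html);

    // Drop HTML comments.
    $html = preg_replace('/<!--.*?-->/s', ' ', $html);

    // Replace each remaining tag with a space so that words separated
    // only by markup (e.g. "</p><b>") do not fuse together.
    $html = preg_replace('/<[^>]+>/', ' ', $html);

    // Turn entities such as &amp; and &nbsp; into real characters.
    $text = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Split on runs of whitespace, including non-breaking spaces.
    // preg_split returns false on invalid UTF-8, so fall back to an empty array.
    return preg_split('/[\s\x{00A0}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];
}

// Usage: $words = visible_words($body);
```

Replacing tags with a space instead of calling strip_tags directly is a deliberate choice here: strip_tags removes the markup without leaving a separator, so text from neighbouring elements can run together into a single "word".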