This has worked fine until I hit a FeedBurner feed that is different. It uses <entry> and <content type="html" xml:lang="en-US" xml:base="http://rssdomain.com">. It also uses <feedburner:origLink> for the proper link.
How can I make my PHP accept this new format as well as the normal format most RSS feeds seem to use?
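For what it's worth, the feed described above looks like Atom (<entry>, <content>) rather than RSS 2.0 (<item>, <description>). Here is a minimal sketch of handling both with SimpleXML. The function name parse_feed_items and the returned array keys are made up for illustration, and it assumes the feed declares the feedburner prefix and that the first <link> in an Atom entry is the one you want.

```php
<?php
// Hypothetical helper (not from the original post): normalise RSS 2.0 <item>
// and Atom <entry> elements into the same array shape.
function parse_feed_items(string $xml): array
{
    $feed = simplexml_load_string($xml);
    if ($feed === false) {
        return []; // not well-formed XML
    }

    // Atom's root element is <feed> with <entry> children;
    // RSS 2.0 keeps its <item> elements under <channel>.
    $isAtom = $feed->getName() === 'feed';
    $nodes  = $isAtom ? $feed->entry : $feed->channel->item;

    $items = [];
    foreach ($nodes as $node) {
        // Fetch feedburner:* children by prefix (second argument = true).
        $fb = $node->children('feedburner', true);

        if (isset($fb->origLink)) {
            $link = (string) $fb->origLink;      // the "proper" link
        } elseif ($isAtom) {
            $link = (string) $node->link['href']; // Atom: URL is an attribute
        } else {
            $link = (string) $node->link;         // RSS: URL is the element text
        }

        if ($isAtom) {
            // Assumption: fall back to <summary> when <content> is absent.
            $body = isset($node->content) ? $node->content : $node->summary;
        } else {
            $body = $node->description;
        }

        $items[] = [
            'title'   => (string) $node->title,
            'link'    => $link,
            'content' => (string) $body,
        ];
    }
    return $items;
}
```

The rest of the script can then loop over the returned array without caring which format the feed was in.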
I don’t know about the question at hand, but I’ve written an RSS scraper in the past and it’s a royal PITA: the feeds are all different, and they all lie about all sorts of things, character encoding included.
For a newer project I’ve used Zend_Feed_Reader from the Zend Framework (which can be used as a standalone component), and in my experience it works like a charm. You might look into using that and save yourself a lot of headaches down the road.
Even now, when I think about downloading a feed with an HTTP header claiming it’s UTF-8 and an XML header claiming it’s ISO-8859-1, which then turns out to be CP-1252, I still get the shivers a bit.
At one point I’d even written a function that kept utf8_decode’ing until it introduced extra question marks, at which point it took the text from the previous step as the final text. Brrr.
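A less hair-raising take on the same problem, as a rough sketch only: ignore what the headers claim, check whether the bytes are valid UTF-8, and if they are not, guess Windows-1252 (CP-1252). This is not the function described above; the name to_utf8 is made up, it needs the mbstring extension, and the Windows-1252 fallback is an assumption that happens to fit the kind of feed mentioned here.

```php
<?php
// Hypothetical helper, not the function from the post: return $raw as UTF-8,
// guessing Windows-1252 when the bytes are not valid UTF-8.
function to_utf8(string $raw): string
{
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw; // already valid UTF-8, leave it untouched
    }
    // Assumption: anything that is not valid UTF-8 is treated as CP-1252.
    return mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');
}
```

It won’t rescue text that has already been double-encoded, but it avoids decoding in a loop.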