Hi all,
I'm looking for a script that reads an RSS feed and displays each item nicely as a title with the date and time below it.
Does anyone know of a script or a website for this?
Thanks.
Best regards,
Nick van Beers

This is the kind of simple job you can do with your own script using SimpleXML.
You'd likely fetch the RSS file once per hour and cache it locally first, then format the title/date as you want (or stick it in a database).
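A minimal sketch of that idea, written in Python rather than PHP purely for illustration (the same steps map onto SimpleXML). The sample feed, the function names and the one-hour cache age are all assumptions made up for this example, not anything from a real site:

```python
import os
import time
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up feed used only to demonstrate the parsing step.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example news</title>
    <item>
      <title>First headline</title>
      <pubDate>Mon, 04 Jan 2010 09:30:00 +0000</pubDate>
    </item>
    <item>
      <title>Second headline</title>
      <pubDate>Tue, 05 Jan 2010 14:05:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def cache_is_fresh(path, max_age_seconds=3600):
    """True if a cached copy exists and is younger than max_age_seconds."""
    return os.path.exists(path) and time.time() - os.path.getmtime(path) < max_age_seconds


def read_items(feed_xml):
    """Return a list of (title, formatted date) pairs from RSS 2.0 text."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        raw_date = item.findtext("pubDate")
        if raw_date:
            # pubDate uses the RFC 822 date format, which email.utils can parse.
            when = parsedate_to_datetime(raw_date).strftime("%d %b %Y, %H:%M")
        else:
            when = ""
        items.append((title, when))
    return items


def render(items):
    """Format each item as a title line with the date and time below it."""
    return "\n\n".join(f"{title}\n{when}" for title, when in items)
```

You would only re-download the feed when `cache_is_fresh()` returns False, and otherwise call `render(read_items(...))` on the cached copy.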
I need the feed for a website, to show the news from a different site.
Best regards,
Nick van Beers
Good luck with that.
I wrote about parsing rss feeds here:
http://webmaster.lampcms.com/p86494-..._feed_parsing/
and also in this post:
http://www.sitepoint.com/forums/showthread.php?t=653230
My project: Open source Q&A
(similar to StackOverflow)
powered by php+MongoDB
Source on github, collaborators welcome!

Nick, perhaps you could explain more about what you want to do?

Nick van Beers, you should try this one, MagpieRSS:
http://magpierss.sourceforge.net
Magpie is the lamest RSS parser of all. I actually used it a long time ago, but that was before I learned how to dress myself.

I used Carp, which is very nice, and then built one with SimpleXML.
What I lack in acuracy I make up for in misteaks
A list of good RSS parsers:
http://www.webresourcesdepot.com/php-rss-parsers/

Are you sure, lampcms.com? I know about MagpieRSS from the book "Building Findable Websites", written by Aarron Walter. Blame him, not me!
Yes, a long time ago Magpie was widely popular because it was the only one available. But since then many more have been written, and they are better. Magpie relies heavily on regular expressions to parse XML, which is not the best way to go. Good parsing must rely on standard technologies like DOM and XML.
As far as I know, Magpie does not treat each extracted item as its own DOMDocument, and does not even attempt to sanitize the input.
Think about it: you are normally very careful about filtering HTML that users can submit to your site via a form, yet you trust content taken from an external feed?
Lastly, a very important part of feed parsing is dealing with a charset encoding that is often reported incorrectly or not declared at all. This should be done in the very first stage of pre-parsing, and this is also where you can sanitize the string to remove ill-formed Unicode characters, which can even be malicious.
Magpie can parse feeds as long as they are totally well-formed, but the nature of feeds is that a feed that is well-formed today may not be well-formed the next day.
For example, the feed may have its charset declared as utf-8 but in fact be using ISO-8859-1. Because these charsets overlap, one can easily validate as the other, depending on which characters are included in the text. Also, one day the feed may include some unescaped HTML tag or ampersand.
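To make the charset point concrete, here is a rough sketch (in Python, for illustration only) of a pre-parsing step that ignores the declared charset, validates the bytes as UTF-8, and falls back to ISO-8859-1. The function name and the choice of ISO-8859-1 as the only fallback are my assumptions; a real feed might just as well be Windows-1252 or something else:

```python
import re

# Characters XML 1.0 does not allow: C0 controls other than tab, LF and CR.
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def decode_feed(raw_bytes):
    """Decode feed bytes without trusting the declared charset.

    Returns (text, charset_used).
    """
    try:
        # Strict decoding doubles as UTF-8 validation: it raises on ill-formed input.
        text = raw_bytes.decode("utf-8")
        used = "utf-8"
    except UnicodeDecodeError:
        # Assumed fallback. ISO-8859-1 maps every byte, so this never fails.
        text = raw_bytes.decode("iso-8859-1")
        used = "iso-8859-1"
    # Strip control characters that would make the XML parser choke.
    return _ILLEGAL_XML_CHARS.sub("", text), used
```

Note that plain ASCII text passes as UTF-8 either way, which is exactly why a mislabeled feed can look fine until the first accented character shows up.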
You just have to be ready for all possible problems with the feed. The most important thing: don't trust the publisher. Validate the charset, validate the XML, validate the HTML, sanitize the charset, sanitize the HTML, and repair the HTML (watch out for unclosed HTML tags, etc.).
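As a rough illustration of the "sanitize and repair HTML" part, here is a small allowlist filter sketched in Python with the standard `html.parser` module. The allowlist, the class name and the link rule are assumptions picked for the example; for production you would want a well-tested sanitizer rather than this sketch:

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"a", "b", "i", "em", "strong", "p", "br"}  # illustrative allowlist
VOID_TAGS = {"br"}                   # tags that never get a closing tag
SKIP_CONTENT = {"script", "style"}   # drop these tags and the text inside them


class FeedHtmlSanitizer(HTMLParser):
    """Keep allowlisted tags, escape all text, and remember which tags are open."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
            return
        if tag not in ALLOWED_TAGS or self.skip_depth:
            return
        kept = ""
        if tag == "a":
            for name, value in attrs:
                # Keep only plain web links; javascript: URLs and on* handlers go.
                if name == "href" and value and value.lower().startswith(("http://", "https://")):
                    kept = f' href="{escape(value)}"'
        self.out.append(f"<{tag}{kept}>")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in self.open_tags:
            # Close anything left open inside this tag, then the tag itself.
            while True:
                top = self.open_tags.pop()
                self.out.append(f"</{top}>")
                if top == tag:
                    break

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data))


def sanitize_html(fragment):
    """Return the fragment with unsafe markup removed and open tags closed."""
    parser = FeedHtmlSanitizer()
    parser.feed(fragment)
    parser.close()
    # Repair step: close whatever the publisher left open.
    closing = "".join(f"</{tag}>" for tag in reversed(parser.open_tags))
    return "".join(parser.out) + closing
```

For example, `sanitize_html('<p>Hello <b>world<script>alert(1)</script>')` drops the script and closes the dangling `<b>` and `<p>`.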
The very first thing is requesting and downloading the feed. This is when you have to be ready to deal with timeouts and with server HTTP codes other than 200 (like 404 if the feed has gone away for good).
Then there is the issue of asking for the feed the right way, respecting the Last-Modified and ETag headers (part of the HTTP 1.1 protocol). Feedburner will ban you from their best servers if you keep requesting feeds without using these headers, and will serve you from their secondary servers, which may not have the latest feeds.
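A hedged sketch of such a "polite" request, again in Python for illustration. The idea is to send back the ETag and Last-Modified values saved from the previous response as If-None-Match and If-Modified-Since, and to treat 304, 404 and timeouts as normal outcomes. The function names, the user agent string and the 10-second timeout are assumptions of this example:

```python
import urllib.error
import urllib.request


def build_request(url, etag=None, last_modified=None):
    """Build a conditional GET using validators saved from the last fetch."""
    headers = {"User-Agent": "example-feed-reader/0.1"}  # hypothetical agent string
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return urllib.request.Request(url, headers=headers)


def fetch_feed(url, etag=None, last_modified=None, timeout=10):
    """Return (status, body, etag, last_modified).

    body is None unless the server answered 200; status is None when no
    HTTP response was received at all (timeout, DNS failure, and so on).
    """
    request = build_request(url, etag, last_modified)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return (response.status, response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as err:
        # urllib raises for 304 Not Modified as well as 404, 410 and 5xx.
        # On 304, keep using the cached copy and the old validators.
        return err.code, None, etag, last_modified
    except OSError:
        # Timeouts and connection problems: no data this time, try again later.
        return None, None, etag, last_modified
```

Store the returned ETag and Last-Modified next to the cached feed so the next call can pass them back in.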
All in all, when you consider all of this, feed parsing is complicated.

Thanks for the information, lampcms.com. I always trust books more than websites,
but right now I'm on sitepoint.com, so I must trust the website more than the book.
I just downloaded the latest Magpie because I got curious whether the new version has new features.
Sure enough, it now has more features than when I first used it.
It now appears to support HTTP headers like Last-Modified and ETag and to deal with timeouts; however, they are using some lame library for that instead of something more standard like cURL or the HttpRequest class (PECL).
They also now attempt to deal with encoding issues, but right away I spotted that they get their encoding info from the value in the XML declaration, which means they trust the publisher. This is where you can get burned: by trusting the publisher to correctly identify their charset. Also, the XML standard recommends but does not require including the charset encoding in the XML declaration. The XML standard, surprisingly, does not even require the XML file to have an XML declaration in the first place, but the feed standards require it, so that's at least good news.
Also, they are using iconv and mbstring (if available), but not utf8_encode where it would be more appropriate, and even the iconv and mb_convert_encoding calls they did not get completely right.
There is no UTF-8 validation.
I am sure Magpie is very capable of parsing feeds, otherwise it would not be a popular choice. I'm just saying that they are not doing the XML parsing the right way, and there will be many situations where Magpie fails to parse a feed that could still be parsed if done better.

You can also use Zend_Feed for this purpose: http://framework.zend.com/manual/en/zend.feed.html
I used it on a couple of websites and it works great!