Thread: Rss

  1. #1
    Nick van Beers (SitePoint Enthusiast)
    Join Date: Nov 2009
    Posts: 33

    Rss

    Hi all,

    I'm looking for a script that reads an RSS feed and displays each item nicely as a title with the date and time below it.

    Does anyone know of a script or a website for this?

    Thanks.
    Best regards,

    Nick van Beers

  2. #2
    Cups (SitePoint Wizard)
    Join Date: Oct 2006
    Location: France, deep rural.
    Posts: 6,869
    This is the kind of simple job you can do with your own script using SimpleXML.

    You'd likely fetch the RSS file once per hour and cache it locally first, then format the title/date as you want (or store it in a database).
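
    A minimal sketch of that approach in PHP, assuming an RSS 2.0 feed with `channel/item/title` and `pubDate` elements. The helper names `fetchCached` and `renderItems`, the cache file, the one-hour lifetime, and the output markup are illustrative assumptions, not part of any library.

```php
<?php
// Hypothetical helper: return the cached copy if it is younger than $maxAge
// seconds, otherwise download the feed again and refresh the cache.
function fetchCached(string $url, string $cacheFile, int $maxAge = 3600): string
{
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $maxAge) {
        return file_get_contents($cacheFile);
    }
    $xml = file_get_contents($url);   // http URLs need allow_url_fopen
    if ($xml === false) {
        throw new RuntimeException("Could not fetch $url");
    }
    file_put_contents($cacheFile, $xml);
    return $xml;
}

// Hypothetical helper: turn RSS 2.0 items into "title, then date" HTML.
function renderItems(string $xml, string $dateFormat = 'j M Y, H:i'): string
{
    libxml_use_internal_errors(true);  // collect parse errors instead of warnings
    $feed = simplexml_load_string($xml);
    if ($feed === false) {
        throw new RuntimeException('Feed is not well-formed XML');
    }
    $html = '';
    foreach ($feed->channel->item as $item) {
        // Escape the title: feed content is untrusted input.
        $title = htmlspecialchars((string) $item->title, ENT_QUOTES, 'UTF-8');
        $stamp = strtotime((string) $item->pubDate);
        $date  = $stamp === false ? '' : gmdate($dateFormat, $stamp);
        $html .= "<h3>$title</h3>\n<p class=\"date\">$date</p>\n";
    }
    return $html;
}
```

    Typical use would be something like `echo renderItems(fetchCached($feedUrl, '/tmp/feed.xml'));`, where `$feedUrl` and the cache path are placeholders. Dates are printed in UTC here; swapping `gmdate` for `date` would use the server's time zone instead.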

  3. #3
    Nick van Beers (SitePoint Enthusiast)
    I need to feed a website with the news from a different site.

  4. #4
    lampcms.com (PHP Guru)
    Join Date: Jan 2009
    Posts: 921
    Good luck with that.
    I wrote about parsing rss feeds here:
    http://webmaster.lampcms.com/p86494-..._feed_parsing/

    and also in this post:
    http://www.sitepoint.com/forums/showthread.php?t=653230
    My project: Open source Q&A
    (similar to StackOverflow)
    powered by php+MongoDB
    Source on github, collaborators welcome!

  5. #5
    Cups (SitePoint Wizard)
    Nick, perhaps you could explain more about what you want to do?

  6. #6
    revivalx (SitePoint Zealot)
    Join Date: Dec 2009
    Location: Kuala Lumpur, Malaysia
    Posts: 138
    Nick van Beers, you should try this one: MagpieRSS
    http://magpierss.sourceforge.net


    Web advertising solution

    Business is not about money, it is about trust..

  7. #7
    lampcms.com (PHP Guru)
    Magpie is the lamest RSS parser of all. I actually used it a long time ago, but that was before I learned how to dress myself.

  8. #8
    lorenw (SitePoint Wizard)
    Join Date: Feb 2005
    Location: was rainy Oregon now sunny Florida
    Posts: 1,101
    I used Carp, which is very nice, and then built one with SimpleXML.
    What I lack in acuracy I make up for in misteaks

  9. #9
    PHPycho (SitePoint Wizard)
    Join Date: Dec 2005
    Posts: 1,201

  10. #10
    revivalx (SitePoint Zealot)
    Are you sure, lampcms.com? I know about MagpieRSS from the book "Building Findable Websites", written by Aarron Walter. Blame him, not me.



  11. #11
    lampcms.com (PHP Guru)
    Yes, a long time ago Magpie was widely popular because it was the only one available. But since then many more parsers have been written, and they are better. Magpie relies heavily on regular expressions to parse XML, which is not the best way to go. Good parsing must rely on standard technologies like DOM and XML.
    As far as I know, Magpie does not treat each extracted item as its own DOMDocument and does not even attempt to sanitize the input.

    Think about it: you are normally very careful about filtering HTML that a user can submit to your site via a form, yet you trust content taken from an external feed?

    Lastly, a very important part of feed parsing is dealing with the often incorrectly reported, or not declared at all, charset encoding of the feed. This should be done in the very first stage of pre-parsing, and this is also where you can sanitize the string to remove ill-formed Unicode characters, which can even be malicious.
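
    One way to do that scrubbing step, sketched with the mbstring extension. The helper name `scrubUtf8` is hypothetical, and the choice to silently drop bad bytes (rather than replace them with a marker) is an assumption made for illustration.

```php
<?php
// Hypothetical helper: drop byte sequences that are not valid UTF-8, plus
// control characters that XML 1.0 does not allow.
function scrubUtf8(string $s): string
{
    // Re-encoding UTF-8 to UTF-8 makes mbstring substitute every ill-formed
    // sequence; with the substitute set to "none" those bytes are removed.
    $previous = mb_substitute_character();
    mb_substitute_character('none');
    $clean = mb_convert_encoding($s, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);   // restore the global setting

    // Strip C0 control characters other than tab, LF and CR.
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $clean);
}
```

    This only guarantees that the result is valid UTF-8. It does not tell you whether the bytes were meant to be UTF-8 in the first place.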

    Magpie can parse feeds as long as they are totally well-formed, but the nature of feeds is that a feed that is well-formed today may not be well-formed the next day.
    For example, the feed may declare its charset as utf-8 but in fact be using ISO-8859-1. Because these charsets are similar (they agree on the ASCII range), one can easily validate as the other, depending on which characters appear in the text. Also, one day the feed may include an unescaped HTML tag or ampersand.
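
    A rough sketch of checking the bytes instead of trusting the declaration, again with mbstring. The helper name `toUtf8` and the ISO-8859-1 fallback are assumptions; a fuller parser might first consult the HTTP Content-Type charset or try several candidate encodings.

```php
<?php
// Hypothetical helper: return the feed as UTF-8 whatever the XML declaration
// claims, and make the declaration agree with the actual bytes.
function toUtf8(string $raw, string $fallback = 'ISO-8859-1'): string
{
    // Trust the bytes, not the label: keep them only if they really are UTF-8.
    $utf8 = mb_check_encoding($raw, 'UTF-8')
        ? $raw
        : mb_convert_encoding($raw, 'UTF-8', $fallback);

    // Rewrite the encoding attribute of the XML declaration, if there is one.
    $fixed = preg_replace(
        '/^(<\?xml[^>]*encoding=["\'])[^"\']+/i',
        '${1}UTF-8',
        $utf8,
        1
    );
    return $fixed ?? $utf8;
}
```

    The check is a heuristic. Text that is pure ASCII passes either way, and a few ISO-8859-1 byte runs also happen to be valid UTF-8, so this catches the common mismatch rather than every one.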

    You just have to be ready for all possible problems with the feed. The most important thing: don't trust the publisher. Validate the charset, validate the XML, validate the HTML, sanitize the charset, sanitize the HTML, and repair the HTML (watch out for unclosed HTML tags, etc.).
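
    For the HTML part, one possible shape is to let DOMDocument repair the fragment and then keep only an allowlist of tags. This is a sketch under assumptions: the helper name `cleanFragment`, the allowed tag list, and the policy of dropping disallowed elements together with their content are all illustrative choices, and a production filter would need more care.

```php
<?php
// Hypothetical helper: repair an HTML fragment from a feed item and keep
// only allowlisted tags, with no attributes except http(s) links.
function cleanFragment(
    string $html,
    array $allowed = ['p', 'a', 'b', 'i', 'em', 'strong', 'br']
): string {
    libxml_use_internal_errors(true);   // tolerate sloppy markup quietly
    $doc = new DOMDocument();
    // The wrapper gives one known root; the meta tag tells libxml the input is UTF-8.
    $doc->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head>'
        . '<body><div>' . $html . '</div></body></html>'
    );
    libxml_clear_errors();
    $root = $doc->getElementsByTagName('div')->item(0);

    // Snapshot the element list first, because the tree changes while we walk it.
    foreach (iterator_to_array($root->getElementsByTagName('*'), false) as $el) {
        if (!in_array($el->nodeName, $allowed, true)) {
            if ($el->parentNode !== null) {
                $el->parentNode->removeChild($el);   // drop element and content
            }
            continue;
        }
        foreach (iterator_to_array($el->attributes, false) as $attr) {
            $keep = $el->nodeName === 'a'
                && $attr->nodeName === 'href'
                && preg_match('#^https?://#i', $attr->nodeValue) === 1;
            if (!$keep) {
                $el->removeAttributeNode($attr);     // e.g. onclick, style
            }
        }
    }

    // Serialize the repaired children; unclosed tags come back closed.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

    Parsing into a DOM and serializing again is what does the repair: the parser closes whatever the publisher left open.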

    The very first step is requesting and downloading the feed. This is where you have to be ready to deal with timeouts and with server HTTP codes other than 200 (like 404 if the feed has gone away for good).
    Then there is the issue of requesting the feed the right way, respecting the Last-Modified and ETag headers (part of the HTTP 1.1 protocol). Feedburner will ban you from their best servers if you keep requesting a feed without using these headers, and will serve you from their secondary servers, which may not have the latest feeds.
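
    A sketch of such a conditional request with the cURL extension. The helper names `conditionalHeaders` and `fetchFeed`, the timeout values, and the shape of the returned array are assumptions for illustration. The caller is expected to store the returned validators and pass them back on the next run.

```php
<?php
// Hypothetical helper: build conditional-request headers from the validators
// saved after the previous successful fetch (either may be missing).
function conditionalHeaders(?string $etag, ?string $lastModified): array
{
    $headers = [];
    if ($etag !== null && $etag !== '') {
        $headers[] = 'If-None-Match: ' . $etag;
    }
    if ($lastModified !== null && $lastModified !== '') {
        $headers[] = 'If-Modified-Since: ' . $lastModified;
    }
    return $headers;
}

// Hypothetical helper: fetch a feed and report status, body and new validators.
function fetchFeed(string $url, ?string $etag = null, ?string $lastModified = null): array
{
    $seen = [];
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => 5,    // seconds, arbitrary example values
        CURLOPT_TIMEOUT        => 15,
        CURLOPT_HTTPHEADER     => conditionalHeaders($etag, $lastModified),
        // Collect response headers so ETag and Last-Modified can be saved.
        CURLOPT_HEADERFUNCTION => function ($ch, string $line) use (&$seen): int {
            $parts = explode(':', $line, 2);
            if (count($parts) === 2) {
                $seen[strtolower(trim($parts[0]))] = trim($parts[1]);
            }
            return strlen($line);
        },
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Fetch failed: ' . curl_error($ch));
    }
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return [
        'status'       => $status,   // 200 new content, 304 unchanged, 404 missing
        'body'         => $status === 200 ? $body : null,
        'etag'         => $seen['etag'] ?? $etag,
        'lastModified' => $seen['last-modified'] ?? $lastModified,
    ];
}
```

    On a 304 the server sends no body, so the cached copy stays in use and only the status needs handling.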

    All in all, when you consider all of this, feed parsing is complicated.

  12. #12
    revivalx (SitePoint Zealot)
    Tagged
    0 Thread(s)
    Thanks for the information, lampcms.com. I always trust books more than websites,
    but right now I'm on sitepoint.com, so I must trust the website more than the book.



  13. #13
    lampcms.com (PHP Guru)
    I just downloaded the latest Magpie because I was curious whether the new version has new features.
    Sure enough, it now has more features than when I first used it.
    It now appears to support HTTP headers like Last-Modified and ETag and to deal with timeouts; however, they are using some lame library for that instead of something more standard like cURL or the HttpRequest class (PECL).

    They also now attempt to deal with encoding issues, but right away I spotted that they take their encoding info from the value in the XML declaration, which means they trust the publisher. This is where you can get burned: by trusting the publisher to correctly identify their charset. Also, the XML standard recommends but does not require including the charset encoding in the XML declaration. The XML standard, surprisingly, does not even require an XML file to have an XML declaration in the first place, but the feed standards require it, so that's at least some good news.

    Also, they are using iconv and mbstring (if available), but not utf8_encode where it would be more appropriate, and even the iconv and mb_convert_encoding calls they did not get completely right.

    There is no utf8 validation.

    I am sure Magpie is very capable of parsing feeds, otherwise it would not be a popular choice. I am just saying that they are not doing the XML parsing the right way, and there will be many situations where Magpie fails to parse a feed that could still be parsed if it were done better.

  14. #14
    risoknop (SitePoint Guru)
    Join Date: Feb 2008
    Location: end($world)
    Posts: 834
    You can also use Zend_Feed for this purpose: http://framework.zend.com/manual/en/zend.feed.html

    I have used it on a couple of websites and it works great.

