How to limit the SimplePie cache size?

Hello, does anyone know SimplePie?
I subscribed to many RSS feeds from many sources (50+). I just want to show one RSS item from each source, but the cache folder is already over 15MB. I opened one cache file with Dreamweaver and noticed that the cache stores all the RSS items from each source, and it causes ‘Maximum execution time of 30 seconds’ in simplepie.inc… I always have to refresh 3 or 4 times before the whole home page shows.
How can I limit the cache size and avoid the ‘Maximum execution time of 30 seconds’ error?
Thanks.

What I mean is: use simplepie.inc to fetch the items. Then use the functions SimplePie offers to get the separate items from the RSS feed. Then store those items in the database. If you only want to store the first item of each feed, then that’s fine too.

Then when you show the website, don’t use SimplePie at all, but show the items from the RSS feeds you’ve stored in the database.

So you only use SimplePie when you need to download items. Once they are downloaded you stick them in the database and take it from there.

Hope that makes sense :slight_smile:

Yes, access to a database is way faster than access to a file like the SimplePie caching mechanism uses.

I wouldn’t store the cache in the database, but rather get the separate items from SimplePie and store those items in the database.
Thus, only use SimplePie for the retrieval of the items, and don’t use any of its caching features, but cache the items yourself in the database.
When you need to display items, just query the database.

Does that make sense?
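
To make the idea concrete, here is a minimal sketch of the database side. It assumes PDO instead of the old mysql_* functions, and the table name, column names and function names (feed_items, store_latest_item, load_items) are made up for illustration. The fetching script calls store_latest_item() once per feed, and the home page only ever calls load_items().

```php
<?php
// Sketch only: table, column and function names are assumptions.

// Create the table that holds the newest item of each feed.
function create_items_table(PDO $pdo): void
{
    $pdo->exec(
        'CREATE TABLE IF NOT EXISTS feed_items (
            feed_url VARCHAR(255) PRIMARY KEY,
            title    TEXT,
            link     TEXT,
            pub_date VARCHAR(32)
        )'
    );
}

// Store (or overwrite) the newest item of one feed.
// REPLACE INTO is accepted by both MySQL and SQLite.
function store_latest_item(PDO $pdo, string $feedUrl, array $item): void
{
    $stmt = $pdo->prepare(
        'REPLACE INTO feed_items (feed_url, title, link, pub_date)
         VALUES (:feed_url, :title, :link, :pub_date)'
    );
    $stmt->execute([
        ':feed_url' => $feedUrl,
        ':title'    => $item['title'],
        ':link'     => $item['link'],
        ':pub_date' => $item['date'],
    ]);
}

// What the home page calls: one query, no SimplePie involved.
function load_items(PDO $pdo): array
{
    return $pdo
        ->query('SELECT feed_url, title, link, pub_date FROM feed_items ORDER BY feed_url')
        ->fetchAll(PDO::FETCH_ASSOC);
}

// In the fetching script, with $feed an initialised SimplePie object:
//   $item = $feed->get_item(0);
//   if ($item) {
//       store_latest_item($pdo, $url, [
//           'title' => $item->get_title(),
//           'link'  => $item->get_link(),
//           'date'  => $item->get_date('Y-m-d H:i:s'),
//       ]);
//   }
```

Because feed_url is the primary key, each feed keeps exactly one row, so the table stays at 50-odd rows no matter how many items the feeds contain.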

What makes you think it’s the cache size that’s causing the timeout? IMHO it isn’t. It seems more likely to me your script can’t connect to the RSS source to fetch new items and that’s causing the timeout.

If you insist the cache is the problem, you can try to decrease the cache duration using the set_cache_duration function, and if that fails you can always override the Cache object SimplePie should use with the [url=http://simplepie.org/wiki/reference/simplepie/set_cache_class]set_cache_class[/url] function and write a [url=http://simplepie.org/wiki/reference/simplepie_cache/start]SimplePie Cache Class[/url] of your own that limits the number of items stored.
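
As a rough sketch of the first option: both settings have to be applied before init() is called. The helper name configure_cache and the 600-second value are only examples; enable_cache() and set_cache_duration() are the SimplePie methods.

```php
<?php
// Sketch: apply cache settings to a SimplePie object before calling init().
// The function name and the default of 600 seconds are assumptions.
function configure_cache($feed, bool $useCache, int $seconds = 600): void
{
    // enable_cache(false) stops SimplePie from reading or writing cache files.
    $feed->enable_cache($useCache);

    if ($useCache) {
        // Cached copies older than $seconds are fetched again
        // (SimplePie's default is 3600 seconds).
        $feed->set_cache_duration($seconds);
    }
}

// Usage, assuming simplepie.inc has been loaded:
//   $feed = new SimplePie();
//   $feed->set_feed_url($url);
//   configure_cache($feed, true, 600);
//   $feed->init();
```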

Ah… so a database is faster to access? Maybe I will follow your suggestion.
By the way: will storing all the caches in the database put a lot of load on it?

I wouldn’t use the SimplePie cache but a database for faster access, but you got the basic idea right, yes :slight_smile:

You mean they use cron to simulate visitors and refresh the cache at a timed interval? So the server always scrapes the new RSS first and stores it in the cache, and other visitors just get the data from the cache without having to trigger a scrape of the RSS feeds themselves?

I suppose they scrape the RSS feeds server side at a timed interval using cron or similar. I know that’s how I would do it :slight_smile:
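
A cron-driven fetch script could be sketched roughly like this. The file name, the crontab path and the function name refresh_feeds are placeholders, and the feed factory and the storage step are passed in as callbacks so the loop itself does not depend on anything else.

```php
<?php
// fetch_feeds.php: run from cron, never from a page request.
// Example crontab line (the path is a placeholder):
//   */15 * * * * php /path/to/fetch_feeds.php

// Fetch the newest item of every feed and hand it to $store.
// $makeFeed builds a ready-to-init feed object for a URL.
// Returns the number of feeds that delivered an item.
function refresh_feeds(array $urls, callable $makeFeed, callable $store): int
{
    $stored = 0;
    foreach ($urls as $url) {
        $feed = $makeFeed($url);
        $feed->init();

        $item = $feed->get_item(0);
        if (!$item) {
            continue; // empty or unreachable feed: skip it, keep going
        }

        $store($url, $item);
        $stored++;
    }
    return $stored;
}

// With SimplePie the factory could look like this:
//   $makeFeed = function ($url) {
//       $feed = new SimplePie();
//       $feed->set_feed_url($url);
//       $feed->enable_cache(false);
//       return $feed;
//   };
```

Visitors then never wait for a remote feed; they only see whatever the last cron run stored.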

Yes, I scrape more than 50 feeds… and I only need the first item of each one.
I noticed some other RSS news sites. They also scrape many feeds from many RSS sources, but their home pages open rapidly (in less than 1 second).
So I think my method is wrong…

You’re welcome :slight_smile:

Oh, now I understand it all.
Thank you very much, ScallioXTX, for your kind guidance.
:-}

Hello ScallioXTX, what is the most likely reason my script can’t connect to the RSS source? Having more than 50 links? The network environment? Or something else? How can I solve it?
Thanks.

The most common cause is that the site where you fetch the RSS from is slow to respond. With 50 did you mean you scrape 50 feeds? If so, that’s quite a lot to ask in 30 seconds, since that means PHP only has 0.6 seconds (on average) to fetch each feed. If only a few of them take longer than that the script will surely time out.
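
The arithmetic, as a small sketch (the function names are made up). The second function shows the worst case when every feed runs into SimplePie’s per-feed timeout, which is 10 seconds unless changed with set_timeout().

```php
<?php
// Average time each feed may take if all feeds share one request.
function seconds_per_feed(float $limitSeconds, int $feedCount): float
{
    return $limitSeconds / $feedCount;
}

// Worst case: every feed hangs until its per-feed timeout expires.
function worst_case_seconds(int $feedCount, int $timeoutPerFeed): int
{
    return $feedCount * $timeoutPerFeed;
}

echo seconds_per_feed(30, 50), "\n";   // 0.6
echo worst_case_seconds(50, 10), "\n"; // 500

// Two knobs that exist for this:
//   $feed->set_timeout(5);  // SimplePie: give up on a slow feed after 5 seconds
//   set_time_limit(0);      // PHP: lift the execution limit, e.g. in a cron script
```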

Excuse my poor English. Do you mean you use simplepie.inc separately and store each item in the database? Or that you take just the first few items you need from the cache and store them in the database, so the page only loads the needed items rather than all of the items from one cache?

Hi, ScallioXTX. Are you still there?
I ran into a problem. I’m now trying to insert parsed RSS items into the database.
But SimplePie needs a few seconds to get the RSS feed from its source, while the insert happens in an instant, so it always stores only the first few rows of RSS items, not all of them.
How can I delay the INSERT so that all of the RSS items get stored in the MySQL database? Thanks.

I’m afraid I’m not really sure what you mean.
Could you post the code you have so far scraping / inserting in the DB ?

Sorry for my poor English. I use the code below.
When I run this PHP script and check MySQL, sometimes it has inserted just 8 items of 10, sometimes only 6 items of 10.
So I think the INSERT is faster than SimplePie can get all the RSS items.
How can I delay the INSERT so that all of the items get inserted?


<?php
require_once ('condatabase.php'); 
require_once ('simplepie.inc');

$url = 'http://news.google.com/news?pz=1&cf=all&ned=jp&hl=jp&topic=w&output=rss';
 
$feed = new SimplePie();
$feed->set_feed_url($url);
$feed->init();
 
// default starting item
$start = 0;
 
// default number of items to display. 0 = all
$length = 10; 
 
// if single item, set start to item number and length to 1
if(isset($_GET['item']))
{
        $start = $_GET['item'];
        $length = 1;
}
 
// set item link to script uri
$link = $_SERVER['REQUEST_URI'];
 
// loop through items
foreach($feed->get_items($start,$length) as $key=>$item)
{
 
        // set query string to item number
        $queryString = '?item=' . $key;
 
        // if we're displaying a single item, set item link to itself and set query string to nothing
        if(isset($_GET['item']))
        {
                $link = $item->get_link();
                $queryString = '';        
        }
 
        // insert item into database

 mysql_query("INSERT INTO rss (link, title, date, content) VALUES ('".$item->get_link()."', '".$item->get_title()."', '".$item->get_date()."', '".$item->get_content()."')");

}
 
?>

No, that is not the case. The line


$feed->init();

is loading the RSS items from the external feed, and the script will not continue processing the rest of the script until simplepie has downloaded all the items.

Since the problem is not that MySQL is too fast, the only possibility is that there are no more than 6 items. I mean, if you have 6 pieces of candy, and I say to you “Give me 10 pieces of candy”, the best you can do is to give me 6, since you don’t have any more.

Does that make sense?
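
The same thing in code, using a plain array as a stand-in for a feed that only contains 6 items:

```php
<?php
// Stand-in for a feed that only contains 6 items.
$available = ['a', 'b', 'c', 'd', 'e', 'f'];

// Ask for 10 items starting at 0, like get_items(0, 10) does.
$requested = array_slice($available, 0, 10);

echo count($requested), "\n"; // 6
```

To see how many items SimplePie actually found, print $feed->get_item_quantity() right after $feed->init().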

BTW. In order to avoid problems with SQL Injection, take a look at [fphp]mysql_real_escape_string[/fphp]
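
An alternative to escaping every value by hand is a prepared statement. This is only a sketch and assumes PDO rather than the mysql_* functions used in the script above; the function name insert_rss_item is made up, while the table and columns are the ones from that script.

```php
<?php
// Sketch: the INSERT from the script above as a PDO prepared statement.
function insert_rss_item(PDO $pdo, string $link, string $title, string $date, string $content): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO rss (link, title, date, content) VALUES (?, ?, ?, ?)'
    );
    // The values are bound as parameters, so quotes inside a title or
    // content cannot break the SQL statement.
    $stmt->execute([$link, $title, $date, $content]);
}

// Usage inside the foreach loop, assuming $pdo is an open PDO connection:
//   insert_rss_item($pdo, $item->get_link(), $item->get_title(),
//                   $item->get_date(), $item->get_content());
```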

Okay, since you also started a new thread on this here I’ll just go ahead and close this one :slight_smile: