Sitemap. Help please!

I have spent so much time on this sitemap creation now…
I have some numbers for you guys; please help me out.
My PHP program crashes without giving a single error while generating this, and I don't know whether it's really worth it.

There are 152,106 records in total, and we are going to generate 1,000 URLs for each record.
So that's 152,106 × 1,000 = 152,106,000 URLs.

We have to split all these URLs across separate sitemap files, and Google's terms say each sitemap cannot be more than 10 MB.
For 50,000 URLs it comes to about 9.5 MB, so say we put 50,000 URLs in each sitemap.
That's 152,106,000 / 50,000 ≈ 3,043 sitemaps.
And each sitemap will be around 9 MB, so 3,043 × 9 ≈ 27,387 MB ≈ 26.7 GB.
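A quick sanity check of that arithmetic in PHP (the 9 MB per file is just my own estimate from the 9.5 MB measurement above):

    <?php
    // Back-of-the-envelope check of the numbers above.
    $records    = 152106;
    $urlsPerRec = 1000;
    $totalUrls  = $records * $urlsPerRec;            // 152,106,000 URLs
    $perFile    = 50000;                             // protocol limit per sitemap file
    $files      = (int) ceil($totalUrls / $perFile); // 3,043 sitemap files
    $mbPerFile  = 9;                                 // my measured ~9 MB per 50k-URL file
    printf("%d files, ~%.1f GB\n", $files, $files * $mbPerFile / 1024);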

26.7 GB for sitemaps, plus one sitemap index file… Is this worth it?

Previously I created a sitemap like this for another site, and Google indexed approximately 83,000 URLs in a month; that's the only reason pushing me to go for this sitemap again.

And one more thing (this is not the right place for it, but still): can a PHP script do all of this in a single run? Note that all those records are in the DB. A rough sketch of what I mean follows.
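For what it's worth, here is a minimal sketch of the single-script approach I have in mind. The table name, column, and URL pattern are invented, and it streams rows with MYSQLI_USE_RESULT so the full record set never sits in memory:

    <?php
    // Sketch: stream records from the DB and rotate sitemap files every
    // 50,000 URLs. Connection details, table, and URL pattern are made up.
    $db = new mysqli('localhost', 'user', 'pass', 'mydb');

    $perFile = 50000; // sitemap protocol: max 50,000 URLs per file
    $count   = 0;
    $fileNo  = 0;
    $fh      = null;

    // MYSQLI_USE_RESULT streams rows instead of buffering all 152k in memory.
    $res = $db->query('SELECT id FROM records', MYSQLI_USE_RESULT);

    while ($row = $res->fetch_assoc()) {
        for ($i = 1; $i <= 1000; $i++) { // 1,000 URLs per record
            if ($count % $perFile === 0) {
                if ($fh) {
                    fwrite($fh, "</urlset>\n");
                    fclose($fh);
                }
                $fh = fopen(sprintf('sitemap-%04d.xml', ++$fileNo), 'w');
                fwrite($fh, "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                          . "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
            }
            fwrite($fh, "<url><loc>http://example.com/page/{$row['id']}/{$i}</loc></url>\n");
            $count++;
        }
    }
    $res->free();
    if ($fh) {
        fwrite($fh, "</urlset>\n");
        fclose($fh);
    }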

Saurabh,
100 URLs is nothing… A sitemap with just 100 URLs at around 100 KB is far too small. I'm quite sure Google's robots can do much better, and they now have their new indexing infrastructure, Caffeine, as well. So we can certainly go into the MBs. For your reference, the sitemap protocol allows 50,000 URLs and 10 MB per sitemap, and all the major search engines adhere to this protocol.
http://sitemaps.org/protocol.php#index
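For completeness, a minimal sketch of writing the index file itself; the file count and naming are assumed from the numbers above, and an index may reference at most 50,000 sitemaps, so 3,043 fits comfortably:

    <?php
    // Sketch: a sitemap index pointing at the generated files. File count
    // and naming are assumed from the arithmetic earlier in the thread.
    $fh = fopen('sitemap-index.xml', 'w');
    fwrite($fh, "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
              . "<sitemapindex xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
    for ($i = 1; $i <= 3043; $i++) {
        fwrite($fh, sprintf("<sitemap><loc>http://example.com/sitemap-%04d.xml</loc></sitemap>\n", $i));
    }
    fwrite($fh, "</sitemapindex>\n");
    fclose($fh);

Note that the protocol also allows the individual sitemap files to be gzipped, which would shrink that 26 GB considerably, since the XML is extremely repetitive.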

But is 26 GB of sitemaps worth it?

Hi everybody

The original reason we provided that recommendation is that Google used to index only about 100 kilobytes of a page. When we thought about how many links a page might reasonably have and still be under 100K, it seemed about right to recommend 100 links or so. If a page started to have more than that many links, there was a chance that the page would be so long that Google would truncate the page and wouldn’t index the entire page.

Thanks and regards,
Saurabh Dhawan

What do you mean by that? Did you mean the indexing of the URLs after I create this sitemap? If that is the case, then I already have a site with 80,600 URLs indexed, but I would like more information on the probability of such huge indexing.

Or are you talking about the PHP script that creates the sitemap? If that is the case, then all the URLs are just a reflection of the IDs in the DB… isn't that possible in a single script?

I think the problem stems from the number of links you want to create the sitemap for. Most ordinary servers (even dedicated ones) can't handle such a huge job without errors.

It doesn't give any error; I tried two different approaches:

  • Gathering all the records from the DB into a single array: in that case it dies right there. I have set the memory limit to 2048M, and I checked the size of the array with memory_get_usage(); it is far, far less than 2048M, but the script never gets past that loop.

  • Fetching one record at a time and looping through it before getting the next one: in this case it works fine, creates roughly 23 files, and then just dies. With this method I don't hold even a single array that could exhaust memory; I reuse the same variables, so this is not a memory problem. And I clear my buffers on every loop as well. (See the logging sketch after this list.)
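Here is the kind of logging I could add to method 2 to pinpoint where it stops; the loop body below is just a stand-in for the real per-file work:

    <?php
    // Sketch: log memory and progress after every sitemap file; the last
    // line in the log then shows exactly where the run stopped.
    function logProgress($fileNo)
    {
        static $log = null;
        if ($log === null) {
            $log = fopen('sitemap-progress.log', 'a');
        }
        fwrite($log, sprintf("[%s] file %d done, mem %.1f MB, peak %.1f MB\n",
            date('H:i:s'), $fileNo,
            memory_get_usage(true) / 1048576,
            memory_get_peak_usage(true) / 1048576));
        fflush($log);
    }

    for ($fileNo = 1; $fileNo <= 3043; $fileNo++) {
        // ... write one sitemap file here ...
        logProgress($fileNo);
    }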

I have set error_reporting to E_ALL in both cases and have explicitly turned display_errors on. Does that make sense?
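To be sure nothing is being silenced, I am also thinking of adding a shutdown handler like the sketch below at the top of the script; a fatal error bypasses normal handlers, but error_get_last() can still see it (the log file name is my own choice):

    <?php
    // Sketch: a shutdown handler so that even a fatal error leaves a trace
    // in a log file instead of the script just dying silently.
    error_reporting(E_ALL);
    ini_set('display_errors', '1');
    ini_set('log_errors', '1');
    ini_set('error_log', dirname(__FILE__) . '/sitemap-errors.log');

    function logFatalOnShutdown()
    {
        $e = error_get_last();
        if ($e !== null && in_array($e['type'], array(E_ERROR, E_PARSE, E_CORE_ERROR, E_COMPILE_ERROR))) {
            error_log(sprintf('FATAL: %s in %s on line %d', $e['message'], $e['file'], $e['line']));
        }
    }
    register_shutdown_function('logFatalOnShutdown');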

OK, to be precise, it's not crashing, or at least I don't actually know that it's crashing; it just dies without giving any error, and it doesn't finish running the whole script either.

I have set the time limit to 3000 seconds, but I'm sure it doesn't even run for 30 seconds…

What level is PHP's error reporting set to?

When you say

My PHP program crashes without giving a single error while generating this

what do you mean by that? Is Apache telling you that PHP has crashed, or is it Apache itself that is crashing?

What is the “max_execution_time” set to in php.ini?
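A quick way to confirm what the script actually sees at runtime is a check like this; run it through the same SAPI as the generator, since the CLI and Apache can load different php.ini files:

    <?php
    // Print the limits under discussion as the running script sees them.
    printf("max_execution_time: %s\n", ini_get('max_execution_time'));
    printf("memory_limit:       %s\n", ini_get('memory_limit'));
    printf("error_reporting:    %d\n", error_reporting());
    printf("display_errors:     %s\n", ini_get('display_errors'));
    printf("loaded php.ini:     %s\n", php_ini_loaded_file());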