If my website has more than 100k links, should I put all of them in sitemap.xml? What are the impacts of leaving some links out, and what about including too many?
I’ve looked into IMDb and other big content websites, and none of them has a sitemap.xml so far… weird.
Google can crawl a page in microseconds, but it won’t crawl your whole site at the same time.
My own website has just 75 to 80 pages and they were not all crawled at once; it crawled them 10 to 15 pages at a time. So be patient if it hasn’t been very long.
I have added a global translator plugin on my blog, and because of that it has created more than 45K pages…
One of my blogs has more than 45K links and I was also thinking of splitting its XML sitemap, but the problem is that even my XML sitemap itself has gained PageRank and I don’t want to lose that. How can I do this?
Out of interest, how were you able to write over 45k different blog entries?! I’ve worked on large corporate sites before that have just barely touched 40k.
You can split them into multiple sitemap files.
It’s good for internal linking and crawling to have all of them listed. If you miss something, I don’t think there would be a huge impact.
Unless things have changed since I last looked, the sitemap.xsd has:
<xsd:element name="urlset">
  <xsd:annotation>
    <xsd:documentation>
      Container for a set of up to 50,000 document elements.
      This is the root element of the XML file.
    </xsd:documentation>
  </xsd:annotation>
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="url" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
Even though it’s “unbounded”, I wouldn’t go over 50K.
If some of your URLs point to the same page, I would use a “canonical” link.
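For anyone unfamiliar with it, a canonical link is a single tag in the page’s head; the URL here is just a placeholder:

```html
<!-- In the <head> of each duplicate/variant URL; example.com is a placeholder -->
<link rel="canonical" href="http://example.com/preferred-page" />
```

Then only the canonical URL needs to appear in the sitemap.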
If you really do have that many unique pages, look into using a sitemap index file to point to the multiple sitemap.xml files.
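For reference, a sitemap index is itself a small XML file listing the child sitemaps; the domain and file names below are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://example.com/sitemap-1.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://example.com/sitemap-2.xml</loc>
  </sitemap>
</sitemapindex>
```

You submit the index file, and the crawler fetches each listed sitemap from it.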
If you have to choose, I would focus on adding all your important pages to your sitemap. For instance, if you run a forum, I would leave out “profile” pages and limit the sitemap to threads/topics.
If you have 100k pages, you need to create more sitemap files: limit the number of links per file, or divide them into categorized sitemaps. Then you can make a master sitemap (a sitemap index) that links to all the existing sitemap files; this makes it easy for crawlers to reach every link on your site.
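The splitting step is easy to script. Here is a minimal sketch in Python; the 50,000-per-file cap comes from the sitemaps.org protocol, while the file names and example.com URLs are made up for illustration:

```python
# Sketch: split a large URL list into multiple sitemap files plus a
# sitemap index, per the sitemaps.org protocol (max 50,000 URLs per file).
# File names and the base URL are placeholder assumptions.
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # protocol limit per sitemap file

def build_sitemaps(urls, base_url, max_urls=MAX_URLS_PER_FILE):
    """Return {filename: xml_string} for each chunk plus the index."""
    files = {}
    names = []
    for i in range(0, len(urls), max_urls):
        chunk = urls[i:i + max_urls]
        name = "sitemap-%d.xml" % (i // max_urls + 1)
        names.append(name)
        entries = "\n".join(
            "  <url><loc>%s</loc></url>" % escape(u) for u in chunk)
        files[name] = (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="%s">\n%s\n</urlset>' % (SITEMAP_NS, entries))
    index_entries = "\n".join(
        "  <sitemap><loc>%s/%s</loc></sitemap>" % (base_url, n)
        for n in names)
    files["sitemap-index.xml"] = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="%s">\n%s\n</sitemapindex>'
        % (SITEMAP_NS, index_entries))
    return files

# Tiny demo with a cap of 2 URLs per file instead of 50,000:
demo = build_sitemaps(
    ["http://example.com/a", "http://example.com/b", "http://example.com/c"],
    "http://example.com", max_urls=2)
# produces sitemap-1.xml, sitemap-2.xml and sitemap-index.xml
```

In practice you would write each value in the returned dict to disk and submit only sitemap-index.xml to the search engines.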
I have a different sitemap for each folder/subject. I have a non-blog site, so I add links manually each time I add a new page, keeping a separate sitemap-foldera.xml, sitemap-folderb.xml, etc.
This helps me keep control of my site and know what I am doing. I have a sitemap-index.xml in the root linking to the various folder sitemaps. It is a little help for the Googlebot as well.
I know this may scare some WordPress folk, (: but I like to know what is happening on my site, and updating a sitemap and manually changing the date in the sitemap index takes less than 30 seconds.
This reply took longer than that, to put the time in perspective.
A Google sitemap is there to inform Google about the site’s pages. Without a sitemap Google will still find the pages, but it may take some time, and there is a chance of pages being missed. So a sitemap is very important, and Google currently allows 50k links in a single sitemap file.
For 100K links you should use multiple XML sitemaps. Also remember that if you disallow a URL through robots.txt, you should not include that URL in your XML sitemap; otherwise it will affect the indexation of that particular URL. You can check this in Google Webmaster Tools.
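As a concrete example, if robots.txt blocks a section like this (the path is a placeholder), those URLs should be left out of the sitemap entirely:

```
User-agent: *
Disallow: /profile/
```

Listing a blocked URL in the sitemap just sends crawlers conflicting signals, which Webmaster Tools will report as a sitemap error.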
Create sitemap.xml with all the links, and if there are too many links for one file, split it (sitemap files can also be gzip-compressed). You can also generate a sitemap using www.xml-sitemaps.com.