I'm looking for sites that are interested in using and testing this service, both to work out any kinks and potential issues and to give feedback that we can pass along to amazon.com/a9.com.
I'm especially interested in making sure that all elements are clean, the data is good, and responses are quick. (Summaries compiled from spidered pages are sometimes hard to clean up, and that can be an issue with XML data.) We will be moving to a servlet and making some updates, such as requiring referrer IPs for validation.
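To illustrate the kind of summary cleanup mentioned above, here is a minimal Python sketch. It is not our actual code; the function name `clean_summary` and the exact rules (drop control characters that XML 1.0 does not allow, collapse whitespace, escape markup characters) are assumptions made for the example.

```python
import re
from xml.sax.saxutils import escape

# Characters that are not allowed in XML 1.0 text (assumed cleanup rule).
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")
_WHITESPACE = re.compile(r"\s+")


def clean_summary(text):
    """Return a spidered summary that is safe to place inside an XML element."""
    # Remove control characters that would make the XML document invalid.
    text = _INVALID_XML_CHARS.sub("", text)
    # Collapse runs of whitespace (newlines, tabs) left over from the page.
    text = _WHITESPACE.sub(" ", text).strip()
    # Escape &, < and > so leftover markup cannot break the XML.
    return escape(text)
```

For example, `clean_summary("Tom & Jerry\x00  <b>\n fun")` gives `Tom &amp; Jerry &lt;b&gt; fun`.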
No, they're spidered by our robot. We are working on indexing about a million sites a day right now. The index that is up right now will be refreshed over the weekend to fix some of the summary data and to roll in some tweaks we have been testing.
We seeded our database from dmoz.org data so we would have a good starting point to crawl from. We generally follow up to 100 outbound links per page, so we are moving across sites quickly.
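As a rough sketch of the "up to 100 outbound links per page" rule, something like the following Python would do it. This is only an illustration, not our robot's code: the names `LinkCollector` and `outbound_links` are made up, and it assumes "outbound" means an http/https link pointing at a different host than the page it was found on.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

MAX_OUTBOUND_LINKS = 100  # the per-page cap mentioned in the post


class LinkCollector(HTMLParser):
    """Collect unique links to other hosts, stopping at a fixed limit."""

    def __init__(self, base_url, limit=MAX_OUTBOUND_LINKS):
        super().__init__()
        self.base_url = base_url
        self.base_host = urlparse(base_url).netloc
        self.limit = limit
        self.links = []
        self._seen = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a" or len(self.links) >= self.limit:
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        parts = urlparse(absolute)
        # Assumption: only http(s) links to a different host count as outbound.
        if parts.scheme not in ("http", "https"):
            return
        if parts.netloc == self.base_host or absolute in self._seen:
            return
        self._seen.add(absolute)
        self.links.append(absolute)


def outbound_links(html, base_url, limit=MAX_OUTBOUND_LINKS):
    """Return at most `limit` outbound links found in `html`, in page order."""
    collector = LinkCollector(base_url, limit)
    collector.feed(html)
    return collector.links
```

A crawl seeded from a directory listing would then queue whatever `outbound_links(page_html, page_url)` returns for each fetched page.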