[This post is split from another thread so please excuse the short intro - Ted S]
A client of mine recently contracted me to develop a Google scraper. Now, before I go any further: this is NOT a scraper that violates Google's TOS, in that it does not access Google directly. It works off pages that a user has saved to their hard drive. It's not really a Google scraper so much as an HTML scraper, in that it parses HTML files (in this case Google SERP pages) and pulls out links, but only those links that point to a given domain.
The script is intended to build a list of URLs that are indexed at Google for a given site.
So once I have the SERP pages for a given site saved to a local hard drive, I run the first of several scripts on those pages.
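That first pass is basically just link extraction. A minimal sketch of what it might look like, using only Python's standard library (the class and function names here are my own, purely for illustration, not the actual script):

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class DomainLinkParser(HTMLParser):
    """Collect hrefs from <a> tags that point at one target domain."""
    def __init__(self, domain):
        super().__init__()
        self.domain = domain
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Keep only links whose host is the target domain (or a subdomain of it)
        host = urlparse(href).netloc
        if host == self.domain or host.endswith("." + self.domain):
            self.links.append(href)

def indexed_urls(html_text, domain):
    """Return every link in a saved SERP page that points at `domain`."""
    parser = DomainLinkParser(domain)
    parser.feed(html_text)
    return parser.links
```

Run over each saved SERP file in turn and the combined output is the list of indexed URLs for the site.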
The end result is a Google Docs spreadsheet that has proven very useful to my SEO client and that might prove useful enough to others to be worth marketing somehow.
I suppose you could call it an On-Site SEO Website Audit.
It analyzes a website for On-Site SEO factors like…
- Is the HTML title present? What is it?
- Is there a meta description? What is it and is it too long?
- Are there Alt tags?
- Is the URL SEO and user friendly?
And other On-Site SEO factors.
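Most of those checks come straight out of the page's HTML. A rough sketch of how a script could gather them, again with Python's standard html.parser — the 160-character cutoff for "too long" descriptions is my own assumption for illustration, not an official limit:

```python
from html.parser import HTMLParser

class OnSiteAudit(HTMLParser):
    """Pull the on-site factors listed above out of one page's HTML."""
    def __init__(self):
        super().__init__()
        self.title = None
        self.meta_description = None
        self.images_missing_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content") or ""
        elif tag == "img" and not attrs.get("alt"):
            self.images_missing_alt += 1  # image with a missing or empty alt

    def handle_data(self, data):
        if self._in_title:
            self.title = (self.title or "") + data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

def audit(html_text, max_description_len=160):
    """Summarize one page's on-site factors as a dict."""
    a = OnSiteAudit()
    a.feed(html_text)
    return {
        "title": a.title,
        "meta_description": a.meta_description,
        "description_too_long": bool(a.meta_description)
                                and len(a.meta_description) > max_description_len,
        "images_missing_alt": a.images_missing_alt,
    }
```

The URL-friendliness check is more of a judgment call, so that part stays manual.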
Another script goes to each link found and accesses it with a HEAD request to determine the HTTP status code (i.e. 200 OK, 301 Redirect, etc.). If a redirect is found, it catalogues the page being redirected to and adds that to the CSV file that I will use to populate the Google Docs spreadsheet.
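A bare-bones version of that HEAD check could look like this, using only the standard library. The trick is to stop urllib from silently following redirects so the 301/302 and its Location header can be recorded; the CSV row layout is my own illustration, not necessarily how the real script is laid out:

```python
import urllib.request
import urllib.error

class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so we can record them instead."""
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # causes the 3xx response to surface as an HTTPError

def head_status(url):
    """Return (status_code, redirect_target_or_None) for one URL."""
    opener = urllib.request.build_opener(NoRedirect)
    req = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(req) as resp:
            return resp.status, None
    except urllib.error.HTTPError as e:
        # 3xx (and 4xx/5xx) land here; Location is set for redirects
        return e.code, e.headers.get("Location")

def status_row(url, code, location):
    """One CSV row: the URL, its status code, and the redirect target if any."""
    return [url, str(code), location or ""]
```

Each `status_row` result then goes straight into the CSV that feeds the spreadsheet.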
Other scripts will pull out every link on each page, along with its anchor text, whether it is follow or nofollow, what the Alt tags are, and other metrics and information that might prove useful.
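Gathering links with their anchor text and follow/nofollow status might be sketched like this (class name and dict keys are illustrative; for image links I treat the alt text as the effective anchor text):

```python
from html.parser import HTMLParser

class LinkInventory(HTMLParser):
    """Record every link on a page: href, anchor text, follow/nofollow."""
    def __init__(self):
        super().__init__()
        self.links = []       # one dict per <a> tag, in page order
        self._current = None  # the link being built while inside <a>...</a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            rel = attrs.get("rel") or ""
            self._current = {
                "href": attrs.get("href") or "",
                "anchor_text": "",
                "nofollow": "nofollow" in rel.lower().split(),
            }
        elif tag == "img" and self._current is not None:
            # Image link: use the image's alt text as the anchor text
            self._current["anchor_text"] += attrs.get("alt") or ""

    def handle_data(self, data):
        if self._current is not None:
            self._current["anchor_text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["anchor_text"] = self._current["anchor_text"].strip()
            self.links.append(self._current)
            self._current = None
```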
The whole goal here is to clean up a site and bring it up to Google standards regarding On-Site SEO.
It's not all automated, either. Some of it will entail me personally going to pages, reviewing them, and giving input on what could be improved SEO-wise.
So that's the product: a series of scripts that go out and automatically analyze the pages that are indexed in Google, plus my personal analysis on top of what the scripts find. All conveniently laid out in a Google Docs spreadsheet, or provided in CSV format for easy import into Excel, OpenOffice, etc.
Now, I can market this as a service and never give up the scripts' source code (which will keep the code from being reworked and stolen), or I can market the scripts themselves and make them available to others, either for their own internal use in servicing their own clients or so they can sell the output themselves.
How I market this is still up in the air but I am leaning toward keeping the scripts private and just marketing the end reports and spreadsheets myself.
800 links takes me about 4 hours to analyze and write up. Given that I want to make a minimum of $25 per hour, that's $100 I must make for this to be worthwhile as a service. I don't know if that is doable, or if there is enough value in this product to charge that much, but…well…that's why I'm trying to get input. Some input I got on another forum has led me to believe that I am way underpricing this…which is possible I suppose, though it's tough for me to see paying even $100 for this kind of detailed analysis. I mean, $100 IS peanuts if the detailed analysis actually helps a website owner clean up their site and improve their rankings, which I think it will, but psychologically I have to get over the fact that I would not pay that myself…I would just write the code and do it for free LOL.
Any and all input is most welcome if anyone has anything to say about all this.
Still investigating whether this is even doable and profitable to market beyond using it to provide useful information to my own, very limited set of clients.