Web application for document plagiarism detection using a Google API
I am trying to create a web-based application in PHP that detects plagiarism in documents.
With this application, a user can upload a Word file (containing an article or a technical report). The application will use a Google web service to check the document's content against online documents and indexed websites, to see whether any part of it was copied and pasted from them. Such copied text is what I mean by document plagiarism.
I need the name of this Google service, and PHP sample code that does this.
You could search for the first line or two of the document in Google, scrape the links of the first 10 results, load each page, take a certain amount of text (without HTML tags) from the <body> (maybe from the 200th character to the 100th-from-last character), then use levenshtein() to compare it with your document. The threshold for flagging plagiarism would need to be quite high, though, and you will need to review the results by hand at first.
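The comparison step could be sketched roughly as follows. This is only a minimal illustration under assumptions, not a finished detector: the function names extractBodyText() and similarityRatio(), the 255-byte cap, and the 0.8 threshold are all made up for this example, and fetching the search results is left out. The cap is there because levenshtein() in PHP versions before 8.0 returns -1 when either string is longer than 255 characters.

```php
<?php
// Hypothetical helper: reduce an HTML page to a slice of its visible body text.
function extractBodyText(string $html, int $skipStart = 200, int $skipEnd = 100): string
{
    // Keep only the contents of <body>, if the page has one.
    if (preg_match('/<body[^>]*>(.*)<\/body>/is', $html, $m)) {
        $html = $m[1];
    }
    // Remove script and style blocks; strip_tags() alone would keep their contents.
    $html = preg_replace('/<(script|style)\b[^>]*>.*?<\/\1>/is', ' ', $html) ?? $html;

    // Strip the remaining tags, decode entities, and collapse whitespace.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    $text = trim(preg_replace('/\s+/', ' ', $text) ?? $text);

    // Skip the first $skipStart and last $skipEnd bytes (likely menus and footers),
    // but only when the text is long enough for that to leave something.
    if (strlen($text) > $skipStart + $skipEnd) {
        $text = substr($text, $skipStart, strlen($text) - $skipStart - $skipEnd);
    }
    return $text;
}

// Hypothetical helper: similarity between 0.0 (unrelated) and 1.0 (identical),
// based on the Levenshtein distance of the first 255 bytes of each string.
// Note: levenshtein() works on bytes, so multibyte text is only approximated.
function similarityRatio(string $a, string $b): float
{
    $a = substr($a, 0, 255);
    $b = substr($b, 0, 255);
    $maxLen = max(strlen($a), strlen($b));
    if ($maxLen === 0) {
        return 1.0; // two empty strings count as identical
    }
    return 1.0 - levenshtein($a, $b) / $maxLen;
}

// Example usage with inline strings standing in for a fetched page and an upload.
$page = '<html><body><p>The quick brown fox jumps over the lazy dog.</p></body></html>';
$documentText = 'The quick brown fox jumped over the lazy dog.';

$pageText = extractBodyText($page, 0, 0);
$score = similarityRatio($documentText, $pageText);

// 0.8 is an arbitrary starting threshold; tune it against real documents.
echo $score >= 0.8 ? "possible copy\n" : "probably original\n";
```

Because only a short slice of each text is compared, a real application would probably need to compare many slices of the document against each page rather than just one.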