I want to write a PHP script that shows which pages of a site are not indexed in Google. The script would get the indexed links from Google, compare them with the links in the site's sitemap, compute the difference, and report which pages are not indexed. I am looking for a tutorial or any help with the Google API (fetching results and parsing them).
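The comparison step described here (sitemap links minus indexed links) could be sketched roughly as below. This is only a minimal sketch, assuming PHP 7.4+ with the DOM extension and a standard sitemap whose URLs sit in `<loc>` elements. The helper names `normalizeUrl`, `sitemapUrls` and `notIndexed` are made up for illustration, and the list of indexed URLs is hard-coded because how you obtain it from Google is a separate problem.

```php
<?php
// Hypothetical helper: normalise a URL so that trivially different
// spellings (host case, trailing slash) compare as equal.
function normalizeUrl(string $url): string
{
    $url = trim($url);
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        // Not an absolute URL; fall back to a plain lowercase comparison.
        return rtrim(strtolower($url), '/');
    }
    $scheme = strtolower($parts['scheme'] ?? 'http');
    $host   = strtolower($parts['host']);
    $path   = rtrim($parts['path'] ?? '', '/');
    $query  = isset($parts['query']) ? '?' . $parts['query'] : '';
    return $scheme . '://' . $host . $path . $query;
}

// Hypothetical helper: extract every <loc> value from a sitemap XML string.
function sitemapUrls(string $xml): array
{
    libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return [];
    }
    $urls = [];
    foreach ($doc->getElementsByTagName('loc') as $loc) {
        $urls[] = normalizeUrl($loc->textContent);
    }
    return array_values(array_unique($urls));
}

// Hypothetical helper: sitemap URLs that do not appear in the indexed list.
function notIndexed(array $sitemapUrls, array $indexedUrls): array
{
    $indexed = array_map('normalizeUrl', $indexedUrls);
    return array_values(array_diff($sitemapUrls, $indexed));
}

// Sample data (assumed, for illustration only).
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/contact/</loc></url>
</urlset>
XML;

// In a real script this list would come from whatever source reports indexed pages.
$indexed = ['https://example.com', 'https://EXAMPLE.com/about/'];

print_r(notIndexed(sitemapUrls($xml), $indexed));
```

With the sample data above, only the contact page is left over. The normalisation rules are a guess at what usually matters; you may need to adjust them if your site treats trailing slashes or query strings as distinct pages.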
I have a method that just scrapes normal Google results on a cron schedule (no API). It gets pretty sketchy if you have a site with hundreds of thousands of pages, but it will eventually finish checking all of them. If you're too lazy to write your own script, I may be willing to part with mine if the price is right.
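For anyone wondering how a cron-driven check can work through a large URL list a little at a time, here is a rough sketch. It is not the poster's script, just one possible shape, assuming PHP 7.4+. The names `nextBatch` and `runOnce`, the state file, and the `$isIndexed` callback are all hypothetical; the callback stands in for whatever lookup you actually use.

```php
<?php
// Hypothetical helper: pick the next slice of URLs and the offset to resume from.
// The offset wraps back to 0 once the end of the list has been reached.
function nextBatch(array $urls, int $offset, int $batchSize): array
{
    $total = count($urls);
    if ($total === 0) {
        return [[], 0];
    }
    if ($offset >= $total) {
        $offset = 0;
    }
    $batch = array_slice($urls, $offset, $batchSize);
    $newOffset = $offset + count($batch);
    return [$batch, $newOffset >= $total ? 0 : $newOffset];
}

// Hypothetical helper: one cron run. Reads the saved offset, checks one batch
// with the supplied callback, saves the new offset, and returns the URLs the
// callback reported as not indexed.
function runOnce(
    array $urls,
    string $stateFile,
    int $batchSize,
    callable $isIndexed,
    int $pauseSeconds = 0
): array {
    $offset = is_file($stateFile) ? (int) file_get_contents($stateFile) : 0;
    [$batch, $newOffset] = nextBatch($urls, $offset, $batchSize);

    $missing = [];
    foreach ($batch as $i => $url) {
        if ($i > 0 && $pauseSeconds > 0) {
            sleep($pauseSeconds); // space out the lookups within a run
        }
        if (!$isIndexed($url)) {
            $missing[] = $url;
        }
    }

    file_put_contents($stateFile, (string) $newOffset);
    return $missing;
}
```

A cron entry would then call a script that loads the URL list and invokes `runOnce` with a small batch size, so each run only makes a handful of requests and the full list is covered over many runs.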
I don't have a very big site, and I want to make my own tools, so I am looking for a tutorial or something similar.
Google is very hard to script against. If you are going to do this, you can expect your IP address to get banned by Google. This has happened to me several times.
Well, I was a little more aggressive than those scripts (i.e., indexing every site result for keyword "x", then indexing link:X, then site:X). I had about 15 instances hitting Google and chewing away on this, and got temp-banned in about 35 minutes, so I stopped. I found other sources for my data (Yahoo, mostly).