I've just started using a sitemap generated with this tool and submitted it to Google Sitemaps two days ago. So far I've only seen crawl errors. How can we tell which pages have been crawled by Googlebot?
A lot of log analysers will tell you which pages have been requested by the various spiders, and how many times they were retrieved.
Use this tool: http://net-promoter.com/robots-txt/. It can be a bit time-consuming if you have a lot of crawled pages (>100,000).
Check with your hosting provider to see if you can get the raw access log for your website. Everything coming from IP addresses in the 66.249.*.* range is Google.
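For what it's worth, here's a rough Python sketch of how you could pull the Googlebot requests out of a raw access log yourself. It assumes the common Apache/Nginx combined log format and a file named access.log, so adjust the regex and filename for whatever your host gives you:

```python
import re
from collections import Counter

# Combined log format: IP - - [date] "METHOD /path HTTP/1.x" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"'
)

crawled = Counter()
with open("access.log") as log:  # assumed filename
    for line in log:
        m = LINE_RE.match(line)
        if not m:
            continue
        ip, method, path, status, agent = m.groups()
        # Googlebot requests come from 66.249.*.* and identify themselves in the user agent
        if ip.startswith("66.249.") or "Googlebot" in agent:
            crawled[path] += 1

# Print each crawled path with its hit count, most-requested first
for path, hits in crawled.most_common():
    print(f"{hits:5d}  {path}")
```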
I was looking at my logs recently and got all excited that a Google IP had been requesting pages from my site. I think maybe it just had to do with AdSense, since it tends to happen when I'm browsing my own site. Do the AdSense scripts trigger a page fetch from Google?
While on this topic, is there any way to get Google to crawl a site again? I understand that if I had any sort of rank I'd be crawled more frequently. But alas, I have no rank at all yet, and the pages Google currently has indexed don't even exist anymore. Not that it matters, since I don't show up in any searches.
Yes, Google will fetch the same pages as your human visitors to work out which ads are relevant to each page. You can usually tell this by looking at the user agent in your log files. The AdSense crawler will usually show up as MediaPartners-Google/2.1 or something like that.
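If you want to separate the AdSense crawler from the regular index crawler in your logs, a quick user-agent check is usually enough. A rough sketch (the exact agent strings below are just examples of what tends to appear in logs):

```python
def classify_google_agent(user_agent: str) -> str:
    """Roughly classify a Google user-agent string from an access log."""
    ua = user_agent.lower()
    if "mediapartners-google" in ua:
        return "AdSense crawler"           # fetches pages to target ads, not for the index
    if "googlebot" in ua:
        return "Googlebot (index crawler)"
    return "not Google"

# Example user-agent strings (assumed, taken from typical log entries)
print(classify_google_agent("MediaPartners-Google/2.1"))
print(classify_google_agent("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
```

So a hit from a Google IP with the MediaPartners agent is just ad targeting, not your pages being crawled for the index.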