1. OK, so let's say that I create a sitemap and it's an onsite sitemap. Now let's say that I create an XML sitemap for Google's eyes only. Let's pretend that I have 30 links in that XML sitemap, and that 3 of those links are listed in the robots.txt file as Disallow. I'm starting to think that this confuses Google. Should I leave links that are disallowed in robots.txt out of my XML sitemap?

2. This is a stupid one, I'm afraid - is it common practice to update the XML sitemap every time a new page is added to a site and resubmit it to Google? That's what I've been doing. Is this the normal way to inform Google of new pages?
If you want to disallow a page, then don't include it in your XML sitemap - only pages you want crawled should be in there. You should add any new pages to your XML sitemap, but you don't need to "resubmit". If you are hosting your XML sitemap on your server and referencing it from your robots.txt file, then it will be picked up automatically by Google, Yahoo, and MSN.
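For reference, a bare-bones XML sitemap looks something like this (example.com and the dates are just placeholders - swap in your own URLs):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2009-06-15</lastmod>
  </url>
  <url>
    <loc>http://www.example.com/new-page.html</loc>
    <lastmod>2009-06-20</lastmod>
  </url>
</urlset>

When you add a new page, just add another <url> entry and update its <lastmod> date - the crawlers will notice the change on their next visit. Hope that helps!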
Right - the XML sitemap should contain only the pages you want Google to crawl and index. Once you have a sitemap and tell Google where to find it in Webmaster Tools, Google will fetch it every time they come to your site, whether that's once a day, week, or month. So resubmitting is a waste of your time.
Thanks guys - so I should reference it in my robots file as well? Like Allow: sitemap.xml? I did not know that, I never even thought about it. I feel so stupid now, but oh well - thanks for the great advice, everyone's always so helpful. Take care.
To reference your sitemap from your robots.txt file, put this line in it (there's no Allow: directive for this - Sitemap: is the correct syntax):

Sitemap: http://www.example.com/sitemap.xml
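Tying it back to your first question, a complete robots.txt might look something like this (the Disallow paths are just examples) - the disallowed pages are simply left out of sitemap.xml rather than listed in both places:

User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
Disallow: /tmp/

Sitemap: http://www.example.com/sitemap.xml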