Hello, I finished coding my website about three days ago and uploaded it. I used Google Webmaster Tools to submit my sitemap. I have fresh links from about four sites, such as aboutus.org, Digg, and so on. My robots.txt file blocks pages like the privacy policy, terms of use, and a few other small things that I don't want in the search engine. I looked today and Google has just started indexing exactly the pages I don't want indexed. It hasn't even added my main domain yet. How could this happen? Why index pages like mysite.com/blahblah/terms-of-use but not mysite.com itself? The other strange thing is that I didn't even include these pages in my sitemap.xml file. Any help/advice would be much appreciated. My old and new robots.txt files are below.

////This one I uploaded to Google when I submitted my sitemap////

User-Agent: *
Disallow: /folder-like-this/privacy-policy/index.html
Disallow: /folder-like-this/terms-of-use/index.html
Sitemap: http://www.mysite.com/sitemap.xml

////This one is a more organized robots.txt I uploaded today. I saw the terms-of-use page indexed hours later////

# robots.txt file
# Wed, 01 Apr 2009 06:36:55 +0000
# Exclude files from all robots:
User-agent: *
Disallow: /folder-like-this/privacy-policy/index.html
Disallow: /folder-like-this/terms-of-use/index.html
Disallow: /missing.html
Disallow: /welcome.html
Sitemap: http://www.mysite.com/sitemap.xml
# End robots.txt file

Thanks for your time,
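One thing worth checking here: robots.txt Disallow rules match by URL-path prefix, so blocking `/folder-like-this/terms-of-use/index.html` does not necessarily block the directory URL `/folder-like-this/terms-of-use/` that Google may have discovered through the external links. A quick local sanity check of the rules above is possible with Python's standard-library `urllib.robotparser` (the mysite.com URLs are placeholders copied from the file; this only tests the rule matching, not how Googlebot itself behaves):

```python
from urllib.robotparser import RobotFileParser

# The rules from the robots.txt above (Sitemap line omitted; it doesn't affect matching)
rules = """User-agent: *
Disallow: /folder-like-this/privacy-policy/index.html
Disallow: /folder-like-this/terms-of-use/index.html
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# The exact index.html URL is blocked, as intended:
print(rp.can_fetch("*", "http://www.mysite.com/folder-like-this/terms-of-use/index.html"))  # False

# But the directory URL without index.html is NOT blocked,
# because it is not prefixed by either Disallow path:
print(rp.can_fetch("*", "http://www.mysite.com/folder-like-this/terms-of-use/"))  # True
```

If the indexed URLs are the trailing-slash versions, blocking the whole directory (e.g. `Disallow: /folder-like-this/terms-of-use/`) would be the tighter rule.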
Thanks for the reply. The old file and the new one are basically the same, except that I added two more pages for the crawlers not to index. I uploaded the robots.txt file before I even submitted my site.