Google Indexing My robots.txt Disallow Files (WTF)

Discussion in 'Google' started by tryingtolearn, Apr 1, 2009.

  1. #1
    Hello,

    I finished coding my website about 3 days ago and uploaded it. I used Google Webmaster Tools to submit my Sitemap. I have "fresh" links on about four sites like aboutus.org, digg, and so on. My robots.txt file blocks things like the privacy policy, terms of use, and some other little things that I don't want in the search engine. I looked today and Google has just started indexing the pages I don't want in the search engine.

    They haven't even indexed my main .com domain yet. How could this happen? Why index pages like mysite.com/blahblah/terms-of-use and not mysite.com itself? The other weird thing is that I didn't even include these pages in my Sitemap.xml file.

    Any help/advice would be much appreciated. I have my old/new robots.txt files below.

    ////This one I uploaded to Google when I submitted my Sitemap////

    User-Agent: *
    Disallow: /folder-like-this/privacy-policy/index.html
    Disallow: /folder-like-this/terms-of-use/index.html


    Sitemap: http://www.mysite.com/sitemap.xml



    ////This one is a more organized robots.txt I uploaded today. I saw the terms-of-use page indexed hours later////



    # robots.txt file Edit
    # Wed, 01 Apr 2009 06:36:55 +0000

    # Exclude Files From All Robots:

    User-agent: *
    Disallow: /folder-like-this/privacy-policy/index.html
    Disallow: /folder-like-this/terms-of-use/index.html
    Disallow: /missing.html
    Disallow: /welcome.html

    Sitemap: http://www.mysite.com/sitemap.xml

    # End robots.txt file
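
One way to sanity-check what these rules actually cover is to run them through a parser. Below is a minimal sketch using Python's standard `urllib.robotparser`, with the rules copied from the second file above. The `is_blocked` helper and the test paths are illustrative assumptions, not the real site's URLs. The standard-library parser uses plain prefix matching, so it may not agree with Googlebot in every edge case, but it does show that a rule ending in `index.html` does not cover the bare directory URL.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the second robots.txt in the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /folder-like-this/privacy-policy/index.html
Disallow: /folder-like-this/terms-of-use/index.html
Disallow: /missing.html
Disallow: /welcome.html

Sitemap: http://www.mysite.com/sitemap.xml
"""


def is_blocked(path, user_agent="Googlebot"):
    """Return True if the rules above disallow `path` for `user_agent`.

    `is_blocked` is a hypothetical helper written for this example.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, "http://www.mysite.com" + path)


if __name__ == "__main__":
    # Illustrative paths: the exact file, then the same page reached
    # through its directory URL with and without a trailing slash.
    for path in [
        "/folder-like-this/terms-of-use/index.html",
        "/folder-like-this/terms-of-use/",
        "/folder-like-this/terms-of-use",
        "/missing.html",
    ]:
        print(path, "->", "blocked" if is_blocked(path) else "allowed")
```

If the page that showed up in the index is the directory form (as in mysite.com/blahblah/terms-of-use), a rule like `Disallow: /folder-like-this/terms-of-use/` would cover both that URL and the `index.html` beneath it under prefix matching.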


    Thanks for your time,
     
    tryingtolearn, Apr 1, 2009
  2. xc06

    #2
    Maybe Googlebot crawled them before you updated your robots.txt? Just my guess.
     
    xc06, Apr 1, 2009
  3. tryingtolearn

    #3
    Thanks for the reply. The old one and the new one are basically the same, except I added two more pages for them not to index. I uploaded the robots.txt file before I even submitted my site.
     
    tryingtolearn, Apr 1, 2009
  4. tryingtolearn

    #4
    Any other help would be appreciated.
     
    tryingtolearn, Apr 1, 2009