Google indexed my website :( What did I do wrong?

Discussion in 'robots.txt' started by alarik, Jan 1, 2007.

  1. #1
    I was working on a website and had all kinds of stupid things in there, like everybody uploads when they are working directly over FTP, as I like to do.
    I used robots.txt, a file placed in the site root; here is the exact content of my robots.txt:
    Why did this happen? This is really sad :(
     
    alarik, Jan 1, 2007
  2. fisher42uk

    #2
    You should have used

    User-agent: Googlebot
    Disallow: / # This website is temporary

    I believe, as you need to specify exactly who you are disallowing.
     
    fisher42uk, Jan 1, 2007
  3. TechEvangelist

    #3
    The original command would normally block Google as well.

    Google no longer reads the robots.txt file every time it visits. It only re-reads the file periodically. It's in one of Matt Cutts' blog posts.

    None of the search engines have been real good at obeying the robots.txt file. I've seen all three of the big guys blow right past blocked subdirectories.

    The noindex meta tag on each page tends to work better if you want to block spiders.

    Google seems to be getting better at hitting new sites rather quickly. They started to index a couple of my sites within a day or two of setting them up and before I set up any links to the sites. That's not always a good thing, because I've also seen Google hit an incomplete site and then toss it into the supplemental index right away.

    Registrars must have a way of knowing when DNS changes are made to new sites. I suspect that triggers a spider visit. Google is a registrar. Either that, or they are tapped into the propagation updates for the DNS servers.
     
    TechEvangelist, Jan 1, 2007
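
The noindex tag mentioned in #3 is a `<meta name="robots" content="noindex">` element in each page's `<head>`. Below is a minimal Python sketch, using only the standard library, for checking whether a page's HTML carries it. The names `RobotsMetaFinder` and `has_noindex` are made up for this illustration, and the sample markup is an assumption, not taken from the original poster's site.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr_map.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with 'noindex'."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


if __name__ == "__main__":
    # Hypothetical sample page for illustration only.
    sample = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(has_noindex(sample))
```

Worth keeping in mind: a crawler has to be able to fetch a page to see the tag at all, so a page that is also blocked in robots.txt never gets its noindex read.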
  4. alarik

    #4
    Okay, this is not good; I wasn't expecting this. So let's say that my website was about 25% complete, and I just abandoned it because it's a very big website and I can't work on it right now because of personal issues, and I want to get back to it in 2 months. My hosting will expire, so there will not even be a website for this. What are the implications of this? How bad is this situation exactly?
     
    alarik, Jan 2, 2007
  5. eTIME

    #5
    Use User-agent: * and then disallow all spiders and bots from indexing your site.
     
    eTIME, Jan 29, 2007
  6. jonbt

    #6

    By doing:

    User-agent: *
    Disallow: /
     
    jonbt, Jan 30, 2007
  7. alarik

    #7
    This is getting confusing. That's exactly what I've done.
     
    alarik, Jan 31, 2007
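
For anyone wanting to check the rules themselves, the two-line file quoted in #6 and the Googlebot-only variant from #2 can be tested locally with Python's standard `urllib.robotparser`. This is only a sketch: the helper name `is_allowed` and the `example.com` URL are placeholders, and it shows how a parser that follows the rules reads the file, not what any search engine actually did with the original site.

```python
from urllib.robotparser import RobotFileParser

# The wildcard rules quoted in #6.
WILDCARD_RULES = """\
User-agent: *
Disallow: /
"""

# The Googlebot-only rules suggested in #2.
GOOGLEBOT_RULES = """\
User-agent: Googlebot
Disallow: / # This website is temporary
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://example.com/index.html"  # placeholder URL
    for label, rules in (("wildcard", WILDCARD_RULES), ("googlebot-only", GOOGLEBOT_RULES)):
        for agent in ("Googlebot", "SomeOtherBot"):
            print(label, agent, is_allowed(rules, agent, url))
```

Under these rules the wildcard file disallows every agent, Googlebot included, which matches what #3 says; the Googlebot-only file leaves other crawlers free to fetch the site.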