What is this?

Discussion in 'Google' started by i_am_dhaval, Aug 8, 2009.

  1. #1
    When I open this URL, I see the result below:

    http://www.google.com.com/robots.txt

    User-agent: *
    Disallow: /click
    Disallow: /reference/Astro_(satellite_TV)
    Disallow: /reference/Astro_(Satellite_TV)
    Disallow: /reference/Satellite_TV
    Disallow: /reference/Satellite_tv


    For any URL I type, if I add an extra .com at the end, I see this.

    Is it only me who sees this, or do you all see it too?
     
    i_am_dhaval, Aug 8, 2009 IP
  2. DoDo Me

    DoDo Me Peon

    #2
    It belongs to the domain www.com.com, not to Google.
     
    DoDo Me, Aug 8, 2009 IP
  3. Abhik

    Abhik ..:: The ONE ::..

    #3
    It's a subdomain of com.com, and they want to block those URLs from spiders.
     
    Abhik, Aug 9, 2009 IP
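To illustrate the subdomain point above, here is a minimal Python sketch that lists the parent domains of a hostname. Both helper names (`parent_domains`, `naive_registered_domain`) are hypothetical, and the "last two labels" rule is a simplifying assumption that only holds for single-label suffixes such as `.com`; a real tool would consult the Public Suffix List instead.

```python
def parent_domains(hostname):
    """Return every parent domain of hostname, from nearest to the TLD."""
    labels = hostname.lower().rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(1, len(labels))]


def naive_registered_domain(hostname):
    """Guess the registered domain as the last two labels.

    Assumption: the suffix is a single label such as "com". This is
    wrong for suffixes like "co.uk".
    """
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


if __name__ == "__main__":
    for host in ("www.google.com", "www.google.com.com"):
        print(host, "->", naive_registered_domain(host), parent_domains(host))
```

Under that assumption, `www.google.com` resolves to `google.com`, while `www.google.com.com` resolves to `com.com`, so appending an extra `.com` lands on a different owner's domain.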
  4. grangonzo

    grangonzo Peon

    #4
    It's using robots.txt to block bots from crawling those paths on the site.
     
    grangonzo, Aug 9, 2009 IP
  5. redhits

    redhits Notable Member

    #5
    That robots.txt file tells spiders what they are not allowed to crawl.
    In this case, every crawler (Yahoo, MSN, Google, and so on) is "restricted" from crawling those "sub"directories (sub-pages).
     
    redhits, Aug 9, 2009 IP
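The rules quoted in the first post can be checked with Python's standard `urllib.robotparser` module. This is only a sketch: the rules are pasted in as a string rather than fetched, and the helper names (`build_parser`, `is_allowed`), the `ExampleBot` user agent, and the test paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the thread, copied as-is.
ROBOTS_TXT = """\
User-agent: *
Disallow: /click
Disallow: /reference/Astro_(satellite_TV)
Disallow: /reference/Astro_(Satellite_TV)
Disallow: /reference/Satellite_TV
Disallow: /reference/Satellite_tv
"""


def build_parser(text):
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, path, agent="ExampleBot"):
    """Return True if the (hypothetical) agent may fetch the given path."""
    return parser.can_fetch(agent, "http://www.google.com.com" + path)


parser = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    for path in ("/click", "/reference/Satellite_TV",
                 "/reference/satellite_tv", "/index.html"):
        print(path, "allowed" if is_allowed(parser, path) else "blocked")
```

Because `User-agent: *` applies to every crawler, any agent name gives the same answer. Matching is a case-sensitive prefix match, which is presumably why the file lists both `Satellite_TV` and `Satellite_tv`. An all-lowercase path is not covered by these rules.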