Does the robots.txt file work only for Google?

Discussion in 'Search Engine Optimization' started by Raj Prajapati, Jul 17, 2009.

  1. #1
    Hi DP Members,

    Today I visited this site and found that it has a robots.txt file - springflex.com/robots.txt

    Because of this, the Google spider cannot reach the site, while the site is cached by Yahoo and Bing.

    Does the robots.txt file work only for Google? If yes, why? And if not, why not?

    Please help me solve this problem.

    Thanks
     
    Raj Prajapati, Jul 17, 2009 IP
  2. Raj Prajapati

    Raj Prajapati Active Member

    Messages:
    231
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    73
    #2
    I think nobody knows about this. So sad!
     
    Raj Prajapati, Jul 17, 2009 IP
  3. googsmaster

    googsmaster Guest

    Messages:
    594
    Likes Received:
    28
    Best Answers:
    0
    Trophy Points:
    0
    #3
    googsmaster, Jul 17, 2009 IP
  4. affiliates4seo

    affiliates4seo Peon

    Messages:
    248
    Likes Received:
    4
    Best Answers:
    0
    Trophy Points:
    0
    #4
    Robots.txt is for all bots, not just a single search engine.

    Using it, you can restrict or allow any single bot, or multiple bots, as you wish.

    Find full information about robots.txt at: http://www.robotstxt.org/robotstxt.html

    Have a look at that reference.

    If you still have questions after going through it, post them here in the forum.
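
    For instance, a file that restricts one named bot while leaving all others unrestricted might look like this (the blocked path is illustrative):

    ```
    # Rules for Google's crawler only: keep it out of /private/
    User-agent: Googlebot
    Disallow: /private/

    # Rules for every other crawler: nothing is disallowed
    User-agent: *
    Disallow:
    ```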
     
    affiliates4seo, Jul 17, 2009 IP
  5. stephen082

    stephen082 Active Member

    Messages:
    843
    Likes Received:
    81
    Best Answers:
    0
    Trophy Points:
    95
    #5
    The robots.txt file is used by all bots - Google, Bing, Yahoo, Ask, AltaVista, as well as local search engines. Whenever a spider or crawler (from any search engine) comes to a site, it first tries to load the robots.txt file to find the rules given there.

    A robots.txt file has the following format:

    User-agent: *
    Disallow:

    Here "User-agent" specifies which bots must follow the rule. "User-agent: *" means the rule is common to every search engine bot. To write rules for one specific search engine, give the name of its bot instead. For example, a group of rules aimed only at Google's crawler would start with:

    "User-agent: Googlebot"

    (followed by the Disallow: lines you want Googlebot, and only Googlebot, to obey). So the robots.txt file applies to every search engine.
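
    The per-bot matching described above can be checked with Python's standard-library robots.txt parser - a quick sketch, with a hypothetical file that blocks only Googlebot:

    ```python
    from urllib.robotparser import RobotFileParser

    # Hypothetical file: block only Googlebot, leave every other bot unrestricted.
    rules = [
        "User-agent: Googlebot",
        "Disallow: /",
        "",
        "User-agent: *",
        "Disallow:",
    ]

    parser = RobotFileParser()
    parser.parse(rules)

    # Googlebot is denied; any other user agent falls through to the "*" group.
    print(parser.can_fetch("Googlebot", "/page.html"))  # False
    print(parser.can_fetch("Bingbot", "/page.html"))    # True
    ```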
     
    stephen082, Jul 17, 2009 IP
  6. Raj Prajapati

    Raj Prajapati Active Member

    Messages:
    231
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    73
    #6

    Thanks so much. Your advice was very useful to me.
     
    Raj Prajapati, Jul 17, 2009 IP
  7. Raj Prajapati

    Raj Prajapati Active Member

    Messages:
    231
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    73
    #7

    Hi,

    Thanks for the information. But this site uses this file:

    User-Agent: *
    Disallow:

    Does this format disallow only Google? If not, why is the site being crawled by Bing and Yahoo but not by Google?
     
    Raj Prajapati, Jul 17, 2009 IP
  8. Canonical

    Canonical Well-Known Member

    Messages:
    2,223
    Likes Received:
    141
    Best Answers:
    0
    Trophy Points:
    110
    #8
    The robots.txt file:

    User-Agent: *
    Disallow:

    says, for all user agents: disallow nothing! In other words, this robots.txt file tells all of the search engines that they can index any page on the site. It ALLOWs everything to be indexed. It does NOT restrict the bots at all.
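
    You can confirm this with Python's standard-library parser - a quick sketch using the exact file quoted in this thread:

    ```python
    from urllib.robotparser import RobotFileParser

    # The file quoted earlier in the thread: one group, nothing disallowed.
    parser = RobotFileParser()
    parser.parse(["User-Agent: *", "Disallow:"])

    # Every crawler, Googlebot included, is allowed to fetch any page.
    for bot in ("Googlebot", "Bingbot", "Slurp"):
        print(bot, parser.can_fetch(bot, "/any/page.html"))  # all True
    ```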

    If you want to block an entire site then you would use:

    User-agent: *
    Disallow: /

    If the site is not indexed at Google, then 1) Google just hasn't crawled it yet, 2) Google crawled it but decided not to index it, or 3) it could be banned.

    I can tell you that the robots.txt file on this site is TOTALLY invalid. They have their User-agent: and Disallow: directives on the same line, which is invalid. All kinds of error text shows up when I access it (it looks like a failed attempt to call a PHP program that builds a sitemap from the robots.txt).

    I would highly suggest fixing the robots.txt. Since the file is totally screwed up, Google may have no clue what you are trying to block, so it could be erring on the safe side and not indexing anything.

    The User-agent: directive should be on one line, the Disallow: directive on the next line, followed by a blank line and then your Sitemap: directive. Instead, I get the following when I access their robots.txt:


     
    Canonical, Jul 17, 2009 IP
  9. Raj Prajapati

    Raj Prajapati Active Member

    Messages:
    231
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    73
    #9
    Hi Canonical,

    Thanks for the information. You solved my problem.

    Thanks a lot again.


     
    Raj Prajapati, Jul 17, 2009 IP