Hi DP Members, today I visited this site and found it has a robots.txt file - springflex.com/robots.txt. Because of this, the Google spider cannot reach the site, yet the site is cached by Yahoo and Bing. Does the robots.txt file work only for Google? If yes, why? And if no, why is only Google blocked? Please help me solve this problem. I am waiting for your solutions. Thanks
The robots.txt file is used by many bots (Alexa, Yahoo, MSN, etc.), not just Google. Check out this page: http://www.mcanerin.com/en/search-engine/robots-txt.asp It's a robots.txt generator, an awesome tool that will also tell you which bots follow the file's rules.
Robots.txt is for all bots, not only for a single search engine. Using it, you can restrict or allow any single bot, or multiple bots, as you wish. You can find full information about robots.txt at: http://www.robotstxt.org/robotstxt.html Have a look at that reference. If you still have questions after going through it, post them here in the forum.
The robots.txt file is used by all bots, such as Google, Bing, Yahoo, Ask, and AltaVista, as well as local search engines. Whenever a spider or crawler (from any search engine) comes to a site, it first tries to load the robots.txt file to find the rules given there. A robots.txt file has the following format:

User-agent: *
Disallow:

Here "User-agent" specifies which bots must follow the rule. "User-agent: *" means the rule is common to every search engine bot. If you want to target a specific search engine, give the name of that bot instead. For example, to address only Google's crawler you would use "User-agent: Googlebot", followed by a "Disallow: /" line if you actually want to block it from indexing your site (a User-agent line with no Disallow rule blocks nothing). So the robots.txt file is applicable to every search engine.
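As a concrete illustration of the format described above, here is a sketch of a robots.txt that blocks only Googlebot while leaving every other crawler unrestricted (the site name and paths are just examples):

```
# Block only Google's crawler from the whole site
User-agent: Googlebot
Disallow: /

# Every other bot may crawl everything (empty Disallow = no restriction)
User-agent: *
Disallow:
```

Note that more specific User-agent groups take precedence, so Googlebot follows its own group and ignores the `*` group.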
Hi, thanks for the information. But this site uses this file:

User-agent: *
Disallow:

Does this format disallow only Google? If not, why is the site being crawled by Bing and Yahoo but not Google?
The robots.txt file says: for all user agents, disallow nothing! In other words, this robots.txt file tells all of the search engines that they can index any page on the site. It ALLOWs everything to be indexed. It does NOT restrict the bots at all.

If you wanted to block an entire site, you would use:

User-agent: *
Disallow: /

If the site is not indexed at Google, then 1) Google just hasn't crawled it yet, 2) Google crawled it but decided not to index it, or 3) it could be banned.

I can tell you that the robots.txt file on this site is TOTALLY invalid. They have their User-agent: and Disallow: directives on the same line, which is invalid. There is all kinds of error text showing up when I access it (it looks like an ATTEMPT to call a PHP program that builds a sitemap from their robots.txt). I would highly suggest fixing the robots.txt. It might be that, since the robots.txt is totally screwed up, Google has no clue what you are trying to block, so it is erring on the safe side and not indexing anything.

The User-agent: directive should be on one line, the Disallow: directive on the next line, followed by a blank line and then your Sitemap: directive. Instead, I get the following when I access their robots.txt:
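If you want to sanity-check this yourself, Python's standard library ships a robots.txt parser. Here is a minimal sketch (the page URLs are just examples) showing that a correctly formatted "allow everything" file, with each directive on its own line, permits every bot to fetch every page:

```python
# Check what a robots.txt actually permits, using Python's
# standard-library parser (urllib.robotparser).
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
# The correctly formatted rules: each directive on its own line.
rp.parse([
    "User-agent: *",
    "Disallow:",
])

# An empty Disallow blocks nothing, so every bot may fetch every page.
print(rp.can_fetch("Googlebot", "http://springflex.com/any-page.html"))  # True
print(rp.can_fetch("Bingbot", "http://springflex.com/"))                 # True
```

You can also point `RobotFileParser` at a live file with `set_url(...)` and `read()` to test a real site instead of a hand-written rule list.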