How can you tell a site is blocking the bots?

coopersPick Active Member

Messages:: 528

Likes Received:: 2

Best Answers:: 0

Trophy Points:: 55

#1

I am pretty sure it's robots.txt that stops a site from being indexed, but I wanted to make sure. If it is, how can I look at the source code to see whether they are blocking a bot from indexing, or is that not possible?

coopersPick, Jul 5, 2010 IP

magda Notable Member

Messages:: 5,197

Likes Received:: 315

Best Answers:: 0

Trophy Points:: 280

#2

Just look at their robots.txt: www.whatevertheirdomainnameis.com/robots.txt

Alternatively, they might have put it in as a meta tag. Look for "noindex" in the page source.
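To make the URL part concrete, here is a minimal Python sketch that turns any page address into the matching robots.txt address. The function name `robots_txt_url` and the sample URL are made up for illustration; the only assumption is that robots.txt sits at the root of the host, as described above.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # robots.txt lives at the root of the host, whatever page you start from,
    # so keep the scheme and host and drop the path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page URL, used only as an example.
print(robots_txt_url("http://www.example.com/some/page.html?id=1"))
# prints: http://www.example.com/robots.txt
```

You can paste the printed address into your browser to read the file.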

magda, Jul 5, 2010 IP

coopersPick Active Member

Messages:: 528

Likes Received:: 2

Best Answers:: 0

Trophy Points:: 55

#3

I don't get it. So it should be in the URL? Or should I be looking for something in the source code?

coopersPick, Jul 5, 2010 IP

SEMSpot Peon

Messages:: 513

Likes Received:: 25

Best Answers:: 0

Trophy Points:: 0

#4

Let's say you want to see the robots.txt file of widget.com. You would simply go to www.widget.com/robots.txt

If they are allowing everything there, then look at the meta tags in the source code (they should be near the top, inside the head section) and see whether they are blocking anything from there.
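If you would rather not read the source by eye, something like the following could pull out the robots meta tag for you. This is only a sketch using Python's standard `html.parser`; the class name, the function name and the sample HTML are all invented for the example, and it looks only at `<meta name="robots">`, not at bot-specific tags.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list such as "noindex, nofollow".
            for item in attr.get("content", "").split(","):
                if item.strip():
                    self.directives.append(item.strip().lower())


def robots_meta_directives(html: str) -> list:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page source, used only as an example.
sample = '<html><head><meta name="robots" content="noindex, nofollow" /></head><body></body></html>'
print(robots_meta_directives(sample))
# prints: ['noindex', 'nofollow']
```

An empty list means the page has no robots meta tag, so nothing is being blocked at that level.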

SEMSpot, Jul 5, 2010 IP

Grimm Peon

Messages:: 3,072

Likes Received:: 57

Best Answers:: 0

Trophy Points:: 0

#5

Check for nofollow on links as well. It tells robots not to follow the link, so the linked page will most likely not get indexed unless it has inbound links from other websites or pages that do not use nofollow.
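A rough way to spot those links in a page's source is sketched below. It assumes nofollow appears as a value of the link's `rel` attribute; the class name, function name and sample links are hypothetical.

```python
from html.parser import HTMLParser


class NofollowLinkParser(HTMLParser):
    """Collect the href of every <a> tag whose rel attribute contains nofollow."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values, e.g. "sponsored nofollow".
        rel_values = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_values and attr.get("href"):
            self.links.append(attr["href"])


def nofollow_links(html: str) -> list:
    parser = NofollowLinkParser()
    parser.feed(html)
    return parser.links


# Hypothetical page fragment, used only as an example.
sample = (
    '<a href="/about">About</a>'
    '<a href="http://www.example.com/partner" rel="nofollow">Partner</a>'
    '<a href="/ad" rel="sponsored nofollow">Ad</a>'
)
print(nofollow_links(sample))
# prints: ['http://www.example.com/partner', '/ad']
```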

Grimm, Jul 5, 2010 IP

coopersPick Active Member

Messages:: 528

Likes Received:: 2

Best Answers:: 0

Trophy Points:: 55

#6

I'm still a little confused. Can I pull up the source code and see a robots.txt in there, or no?

coopersPick, Jul 8, 2010 IP

Grimm Peon

Messages:: 3,072

Likes Received:: 57

Best Answers:: 0

Trophy Points:: 0

#7

coopersPick said: ↑

I'm still a little confused. Can I pull up the source code and see a robots.txt in there, or no?

You can also try typing /robots.txt after the domain name directly in your browser's address bar.

Example: http://www.example.com/robots.txt

Grimm, Jul 8, 2010 IP

coopersPick Active Member

Messages:: 528

Likes Received:: 2

Best Answers:: 0

Trophy Points:: 55

#8

And what do I need to look for once I add that to the domain?

coopersPick, Jul 8, 2010 IP

Grimm Peon

Messages:: 3,072

Likes Received:: 57

Best Answers:: 0

Trophy Points:: 0

#9

coopersPick said: ↑

And what do I need to look for once I add that to the domain?

This can help you a lot: check Google's webmaster support information.

Just watch out for this type of robots.txt file, as it is meant to block all crawlers from crawling the whole website:
User-agent: *
Disallow: /
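If you want to check what a set of rules means for a particular bot instead of reading them yourself, Python's standard `urllib.robotparser` can evaluate them. The sketch below feeds it the two lines above as text, so nothing is downloaded; the helper name `is_blocked`, the bot name and the URL are only illustrative.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above: every user agent, everything disallowed.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if these robots.txt rules forbid user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical bot name and URL, used only as an example.
print(is_blocked(BLOCK_ALL, "Googlebot", "http://www.example.com/any/page.html"))
# prints: True
```

For comparison, a file whose `Disallow:` line is left empty allows everything, and the same helper returns False for it.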

Grimm, Jul 8, 2010 IP

mvpsandeep Active Member

Messages:: 113

Likes Received:: 1

Best Answers:: 0

Trophy Points:: 53

#10

<meta name="robots" content="noindex, nofollow" />

mvpsandeep, Jul 8, 2010 IP

dorthyjoseph Guest

Messages:: 50

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#11

mvpsandeep said: ↑

<meta name="robots" content="noindex, nofollow" />

This is perfect.

dorthyjoseph, Jul 9, 2010 IP

earnincome Peon

Messages:: 724

Likes Received:: 6

Best Answers:: 0

Trophy Points:: 0

#12

magda said: ↑

Just look at their robots.txt: www.whatevertheirdomainnameis.com/robots.txt

Alternatively, they might have put it in as a meta tag. Look for "noindex" in the page source.

You are absolutely right. Opening that URL is the way to check a site's robots.txt and the content in it.

earnincome, Jul 9, 2010 IP

manish.chauhan Well-Known Member

Messages:: 1,682

Likes Received:: 35

Best Answers:: 0

Trophy Points:: 110

#13

As robots.txt is placed in the root folder, you can easily check your robots.txt file at yourdomain.com/robots.txt

manish.chauhan, Jul 9, 2010 IP
