Robots.txt vs <meta name="robots" content="noindex,nofollow">

gravy834 Peon

Messages:: 28

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#1

Hi,

If I specify within my robots.txt file to disallow specific pages do I still need to include <meta name="robots" content="noindex,nofollow"> on each of those pages?

Thanks

gravy834, Mar 2, 2009 IP

wp-themes Banned

Messages:: 230

Likes Received:: 9

Best Answers:: 0

Trophy Points:: 0

#2

Not really required, as the robots.txt rules for allowing or disallowing indexing are the most important ones.

However, make sure you use it wisely; otherwise you might get important pages or folders deindexed.

wp-themes, Mar 2, 2009 IP

Lpe04 Peon

Messages:: 579

Likes Received:: 15

Best Answers:: 0

Trophy Points:: 0

#3

You can, but it's probably not necessary (though it will definitely ensure that the page doesn't get indexed).

You could also try it if there is a page that you want removed from an index.

Lpe04, Mar 3, 2009 IP

manish.chauhan Well-Known Member

Messages:: 1,682

Likes Received:: 35

Best Answers:: 0

Trophy Points:: 110

#4

gravy834 said: ↑

Hi,

If I specify within my robots.txt file to disallow specific pages do I still need to include <meta name="robots" content="noindex,nofollow"> on each of those pages?

Thanks

No need to add it when you have already disallowed the pages in robots.txt.

manish.chauhan, Mar 4, 2009 IP

shailendra Peon

Messages:: 1,225

Likes Received:: 18

Best Answers:: 0

Trophy Points:: 0

#5

gravy834 said: ↑

Hi,

If I specify within my robots.txt file to disallow specific pages do I still need to include <meta name="robots" content="noindex,nofollow"> on each of those pages?

Thanks

It's better to use the robots.txt file to block the pages from being crawled. Moreover, you should always try to keep the code as lean as possible to prevent code bloat.

shailendra, Mar 6, 2009 IP

linkmonkey Peon

Messages:: 68

Likes Received:: 1

Best Answers:: 0

Trophy Points:: 0

#6

No need if it's already in robots.txt

linkmonkey, Mar 10, 2009 IP

meri0098 Peon

Messages:: 36

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#7

We are developing a portal, and our development team has created three or four sub-folders on the same server for backup and testing purposes. Google is treating these folders as sub-sites and indexing all of them.

Today I disallowed all of these folders (sub-sites) with a robots.txt file,

in which I used the following code:

User-agent: *
Disallow: /

User-agent: Googlebot
Noindex: /

This way, I think search engine crawlers will not index these sub-folders.

We are also using the meta tag <meta name="robots" content="index, follow" />
on the site. I can't change it in the sub-folders to disallow indexing, because the developers make all their changes in these folders and can upload them to the site.

My question is: I have disallowed the sub-folders with the robots.txt file, but the pages still carry the meta tag <meta name="robots" content="index, follow" />, which tells crawlers to index and follow the content.

Should I remove the "index, follow" meta tags from all of them?
One directive says to follow and the other disallows it. I am totally confused about what to do.

meri0098, Jan 15, 2011 IP
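
A minimal sketch of how the robots.txt side of a setup like the one above could be checked, using Python's standard urllib.robotparser. The folder names /backup/ and /test/, the example.com host, and the helper name is_crawl_allowed are assumptions for illustration, not details from the post. It only shows what a robots.txt-compliant crawler would be allowed to fetch; it says nothing about how any particular search engine treats the meta tag.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block two development folders for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /backup/
Disallow: /test/
"""


def is_crawl_allowed(robots_txt, user_agent, url):
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A page inside a disallowed folder.
    print(is_crawl_allowed(ROBOTS_TXT, "*", "http://www.example.com/backup/index.html"))  # False
    # A page outside the disallowed folders.
    print(is_crawl_allowed(ROBOTS_TXT, "*", "http://www.example.com/index.html"))  # True
```

A compliant crawler that is denied here does not download the page, so it has no opportunity to read a <meta name="robots"> tag inside it.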

tenners Peon

Messages:: 30

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#8

@gravy834 - Both the robots.txt file and the <meta name="robots"> tag are used to control the indexing and caching of your website's pages. If you have already stated in robots.txt that a page should not be indexed, it is not strictly necessary to repeat that on the page with the meta tag. However, keep in mind that not all spiders are created equal: they don't all read or follow your robots.txt directives, so in my humble opinion it is still good to use the <meta name="robots"> tag even though you have explicitly excluded the page in your robots.txt file. Consider also the scenario in which a spider reaches your page via a link that someone else pointed directly at it. Will the spider index that content? Who knows for sure. Besides, it's not so much code that you should be concerned about its "weight" on the page.

@meri0098, if you consider what I've said above, the setup you have seems like it could cause a problem for you. I would find a way to keep your directives in sync.

tenners, Jan 16, 2011 IP
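
To see which directives a page actually carries in its <meta name="robots"> tag, for example when checking that they are in sync with robots.txt, something like the following could be used. It is a rough sketch built on Python's standard html.parser; the class name RobotsMetaParser and the function meta_robots_directives are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None, so fall back to an empty string.
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def meta_robots_directives(html):
    """Return the set of robots directives found in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
    print(sorted(meta_robots_directives(page)))  # ['nofollow', 'noindex']
```

A page whose set contains "index" while its folder is disallowed in robots.txt is the kind of mismatch described above.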

Backlinkshub Peon

Messages:: 35

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#9

First of all, we must put "robots.txt" in the top-level directory of our web server.

Secondly:

When a robot looks for the "/robots.txt" file for any URL, it takes the path component of the URL (everything from the first single slash) and puts "/robots.txt" in its place.

For example, for "http://www.ABC.com/designs/index.html", it will remove "/designs/index.html", replace it with "/robots.txt", and end up with "http://www.ABC.com/robots.txt".

So I think there is no need to specify the robots tag again on every page, because whenever a spider comes to any page of our website, it first goes to "robots.txt" and only then to the particular page it requested.

Backlinkshub, Jan 17, 2011 IP
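
The path replacement described above can be sketched in a few lines of Python with the standard urllib.parse module. The function name robots_txt_url is an assumption made for this example; real crawlers may handle redirects, ports, and other edge cases differently.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Replace the path (and any query or fragment) of page_url with /robots.txt."""
    parts = urlsplit(page_url)
    # Keep scheme and host, drop everything from the first single slash onward.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_txt_url("http://www.ABC.com/designs/index.html"))
    # http://www.ABC.com/robots.txt
```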

calvin4u Peon

Messages:: 40

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#10

Thank you.
I will use both robots.txt and the robots meta tag on all inner pages of the site.

calvin4u, Jan 20, 2011 IP

hilhilginger Well-Known Member

Messages:: 322

Likes Received:: 1

Best Answers:: 0

Trophy Points:: 103

#11

shailendra said: ↑

It's better to use the robots.txt file to block the pages from being crawled. Moreover, you should always try to keep the code as lean as possible to prevent code bloat.

Well, thanks for the info. I heard that Bing takes site info from DMOZ and not from robots.txt. So if my site is listed in DMOZ, then there is no point in using robots.txt.

hilhilginger, Jan 20, 2011 IP

brad.smith4321 Peon

Messages:: 249

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#12

Use the robots.txt file to block the pages from being crawled. Furthermore, you should always try to keep the code as lean as possible to prevent code bloat.

brad.smith4321, Feb 1, 2011 IP

fsdnetwork Peon

Messages:: 20

Likes Received:: 0

Best Answers:: 0

Trophy Points:: 0

#13

You should remove the meta tag info (it's redundant). These pages and folders are currently blocked by robots.txt, so no robots.txt-compliant crawler can crawl them.

fsdnetwork, Feb 9, 2011 IP
