Robots.txt question

Moneyfolk Peon

Messages: 420

Likes Received: 11

Best Answers: 0

Trophy Points: 0

#1

I created a basic robots.txt file. Do I also need to include

<meta name="ROBOTS" content="ALL"> in the head section of my HTML pages, or is just having robots.txt in my root directory enough?

Thanks in advance.

Moneyfolk, Jan 12, 2006 IP

Smyrl Tomato Republic Staff

Messages: 13,740

Likes Received: 1,702

Best Answers: 78

Trophy Points: 510

#2

The robots.txt file is sufficient. Actually, if you had neither the robots.txt file nor the meta tag mentioned above, the entire site would still be available for indexing. I use robots.txt mostly to disallow certain folders for compliant robots. Rogue robots will not honor robots.txt.
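
As a rough illustration of the two cases described here (no restrictions at all versus one disallowed folder), the sketch below uses Python's standard urllib.robotparser module. The bot name ExampleBot, the www.example.com URLs, and the /private/ folder are made up for the example, and the answers only reflect what a compliant robot would do.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content supplied as a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# An empty file places no restrictions on any robot.
empty = build_parser("")

# Hypothetical rules that keep compliant robots out of one folder.
restricted = build_parser("User-agent: *\nDisallow: /private/\n")

PRIVATE_URL = "http://www.example.com/private/page.html"
PUBLIC_URL = "http://www.example.com/index.html"

empty_allows_private = empty.can_fetch("ExampleBot", PRIVATE_URL)
restricted_allows_private = restricted.can_fetch("ExampleBot", PRIVATE_URL)
restricted_allows_public = restricted.can_fetch("ExampleBot", PUBLIC_URL)

print(empty_allows_private, restricted_allows_private, restricted_allows_public)
```

A rogue robot simply never asks this question, so nothing here protects the disallowed folder from it.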

Shannon

Smyrl, Jan 12, 2006 IP

dcristo Illustrious Member

Messages: 19,800

Likes Received: 1,202

Best Answers: 7

Trophy Points: 470

Articles: 5

#3

Moneyfolk said:

I created a basic robots.txt file. Do I also need to include

<meta name="ROBOTS" content="ALL"> in the head section of my HTML pages, or is just having robots.txt in my root directory enough?

Thanks in advance.

It's only necessary to have the Meta Title, Keyword, and Description tags; the rest are not required.

dcristo, Jan 12, 2006 IP

Moneyfolk Peon

#4

Thank you, both. I hope having the file will be conducive to the big 3 indexing more of my pages. I understand that MSN likes robots.txt files.

Moneyfolk, Jan 12, 2006 IP

dcristo Illustrious Member

#5

Moneyfolk said:

Thank you, both. I hope having the file will be conducive to the big 3 indexing more of my pages. I understand that MSN likes robots.txt files.

Getting well indexed in the SE's is just a matter of getting more links to your site. The robots.txt is typically used to tell the SE's NOT to index parts of your site.

dcristo, Jan 12, 2006 IP

mdvaldosta Peon

Messages: 4,079

Likes Received: 362

Best Answers: 0

Trophy Points: 0

#6

dcristo said:

It's only necessary to have the Meta Title, Keyword, and Description tags; the rest are not required.

Actually, it's only NECESSARY to have the title tag. The meta description is optional (the SEs will skim your page and pull a description for you), but it's highly recommended that you write your own. The keywords tag is usually a waste of time, but I still use it anyway for good form.
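
For anyone who wants to check which of these tags a page actually carries, here is a small sketch using Python's standard html.parser module. The sample page and the class name HeadTagReader are invented for illustration; the code only reads the tags and says nothing about how any search engine weighs them.

```python
from html.parser import HTMLParser


class HeadTagReader(HTMLParser):
    """Collect the <title> text and the name/content pairs of <meta> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name:
            # Meta names are case-insensitive, so store them in lower case.
            self.meta[name.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Hypothetical page used only for this example.
SAMPLE_PAGE = """<html><head>
<title>Example page</title>
<meta name="description" content="A short summary of the page.">
<meta name="ROBOTS" content="ALL">
</head><body><p>Hello</p></body></html>"""

reader = HeadTagReader()
reader.feed(SAMPLE_PAGE)

print("title:", reader.title)
print("description:", reader.meta.get("description"))
print("keywords:", reader.meta.get("keywords"))  # None when the tag is absent
print("robots:", reader.meta.get("robots"))
```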

The robots.txt file is especially important for MSN, even if you upload a blank one. It also matters for AWStats, because hits on that file are one of the ways it recognizes bots.

mdvaldosta, Jan 12, 2006 IP

dcristo Illustrious Member

#7

To clarify, when I said they were necessary, I really meant that it's advisable to include them.

As for robots.txt and MSN, I haven't had any problems with any sites on MSN when leaving the file out.

dcristo, Jan 12, 2006 IP

seo_expert Well-Known Member

Messages: 475

Likes Received: 12

Best Answers: 0

Trophy Points: 123

#8

I'd like to add one thing here.

Be careful when exchanging links: some webmasters use their robots.txt file to disallow Google's spiders from crawling their link pages. Your link will be of no use then.

seo_expert, Jan 12, 2006 IP

Moneyfolk Peon

#9

So should you check the robots.txt file of websites that you want to link to?

Moneyfolk, Jan 13, 2006 IP

northpointaiki Guest

Messages: 6,876

Likes Received: 187

Best Answers: 0

Trophy Points: 0

#10

seo_expert said:

I'd like to add one thing here.

Be careful when exchanging links: some webmasters use their robots.txt file to disallow Google's spiders from crawling their link pages. Your link will be of no use then.

Hadn't thought of that. It's a good point. But unlike "nofollow," which can be detected, how would you know what's in their robots.txt?

northpointaiki, Jan 13, 2006 IP

BILZ Peon

Messages: 1,515

Likes Received: 62

Best Answers: 0

Trophy Points: 0

#11

Just look at the file: theirdomain.com/robots.txt

Without checking the source, how do you detect a nofollow?
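
Checking the source for nofollow can at least be automated. A minimal sketch, assuming the link page's HTML is already in a string; the class name NofollowFinder and the sample markup are hypothetical, and it looks only at rel attributes on <a> tags, not at a page-level robots meta tag.

```python
from html.parser import HTMLParser


class NofollowFinder(HTMLParser):
    """Sort the links on a page by whether they carry rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        # rel may hold several space-separated tokens, e.g. "external nofollow".
        rel_tokens = (attrs.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


# Hypothetical link page used only for this example.
SAMPLE_LINK_PAGE = """
<a href="http://www.example.com/">Plain link</a>
<a href="http://www.example.org/" rel="external nofollow">Nofollowed link</a>
"""

finder = NofollowFinder()
finder.feed(SAMPLE_LINK_PAGE)

print("followed:", finder.followed)
print("nofollowed:", finder.nofollowed)
```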

BILZ, Jan 13, 2006 IP

maldives Prominent Member

Messages: 7,187

Likes Received: 902

Best Answers: 0

Trophy Points: 310

#12

mdvaldosta said:

Actually, it's only NECESSARY to have the title tag. The meta description is optional (the SEs will skim your page and pull a description for you), but it's highly recommended that you write your own. The keywords tag is usually a waste of time, but I still use it anyway for good form.

The robots.txt file is especially important for MSN, even if you upload a blank one. It also matters for AWStats, because hits on that file are one of the ways it recognizes bots.

Excellent! In most cases I use robots.txt to disallow certain folders for robots. MSNBot complies with the robots.txt standard.

maldives, Jan 13, 2006 IP

Moneyfolk Peon

#13

You can go to sitename/robots.txt, so for www.w3.org it's:

http://www.w3.org/robots.txt

Moneyfolk, Jan 13, 2006 IP

northpointaiki Guest

#14

Yeah, just saw this today: add /robots.txt to the domain and then look for a Disallow on the directory or page your link is on:

User-agent: Googlebot
Disallow: /TheDirectoryorpageyou'reon/
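
The same lookup can be sketched in Python with the standard urllib.robotparser module. The /links/ directory and the www.example.com URLs below are placeholders standing in for the partner site's real link page, and the helper name is_crawlable is made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content from a link-exchange partner.
PARTNER_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /links/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


LINK_PAGE = "http://www.example.com/links/partners.html"

googlebot_ok = is_crawlable(PARTNER_ROBOTS_TXT, "Googlebot", LINK_PAGE)
other_bot_ok = is_crawlable(PARTNER_ROBOTS_TXT, "OtherBot", LINK_PAGE)

print("Googlebot may crawl the link page:", googlebot_ok)
print("OtherBot may crawl the link page:", other_bot_ok)
```

For a live site, RobotFileParser also offers set_url() and read() to download the file before calling can_fetch().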

Thanks.

northpointaiki, Jan 13, 2006 IP

Jean-Luc Peon

Messages: 601

Likes Received: 30

Best Answers: 0

Trophy Points: 0

#15

mdvaldosta said:

The robots.txt file is especially important for MSN, even if you upload a blank one.

Why would it be more important for MSN than for Google or Yahoo? If robots.txt is not present, they will all understand that they are permitted to visit all pages.

mdvaldosta said:

It also matters for AWStats, because hits on that file are one of the ways it recognizes bots.

When the file is not present, AWStats still sees the requests for the nonexistent robots.txt file. Those requests allow AWStats to recognize the bots.
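
To make the idea concrete, here is a rough sketch of the heuristic: scan an access log and collect the user agents that asked for /robots.txt, whether the server answered 200 or 404. It assumes the Apache combined log format and uses invented sample lines; it is not how AWStats itself is implemented.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" log format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def robots_txt_requesters(lines):
    """Count requests for /robots.txt per user agent, whatever the status code."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").split("?")[0] == "/robots.txt":
            hits[match.group("agent")] += 1
    return hits


# Invented log lines: the first request gets a 404 because the file is missing.
SAMPLE_LOG = [
    '192.0.2.10 - - [13/Jan/2006:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 404 209 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [13/Jan/2006:10:00:02 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '192.0.2.77 - - [13/Jan/2006:10:05:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

probable_bots = robots_txt_requesters(SAMPLE_LOG)
print(dict(probable_bots))
```

Only the agent that requested /robots.txt is flagged; the browser-like visitor that went straight to /index.html is not.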

Jean-Luc

Jean-Luc, Jan 13, 2006 IP

ServerUnion Peon

Messages: 3,611

Likes Received: 296

Best Answers: 0

Trophy Points: 0

#16

Jean-Luc said:

When the file is not present, AWStats still sees the requests for the nonexistent robots.txt file. Those requests allow AWStats to recognize the bots.

This is not the way to identify bots; all it will do is fill up your 404 error section. The SEs don't download robots.txt on every visit, as that would be a waste of resources, which would nullify your idea.

ServerUnion, Jan 13, 2006 IP

Jean-Luc Peon

#17

ServerUnion said:

This is not the way to identify bots; all it will do is fill up your 404 error section.

I agree that it is not the best way to identify bots and it should certainly not be the only way to do it, but it is used by AWStats and other stats software to discover new bots. AWStats reports them as:

Unknown robot (identified by hit on 'robots.txt')

Jean-Luc

Jean-Luc, Jan 13, 2006 IP

ServerUnion Peon

#18

No, that simply means that the bot does not have an official listing as a verified source. The stats program still knows it is a bot; it just doesn't know the name. There could be many reasons for this, and the robots.txt file has nothing to do with it.

I get these on sites where I have the file and on ones where I do not. Can you provide documentation on this theory? I would be interested to read more about it.

ServerUnion, Jan 13, 2006 IP

Jean-Luc Peon

#19

A line, titled "Unknown robot (identified by hit on 'robots.txt')", appears in the AWStats list of "Robots/Spiders visitors". That seems pretty clear to me.

Jean-Luc

Jean-Luc, Jan 13, 2006 IP

ServerUnion Peon

#20

As opposed to having hundreds of 404 errors on the file?

This may just be due to the fact that the stats programs aren't going to waste overhead by listing every little bot that stops by. Most likely they just list the larger sources.

ServerUnion, Jan 13, 2006 IP
