Hi, I have just found that the following shouldn't be used:

User-agent: *
Disallow:

It is a robots.txt file that allows everything, but apparently some search engines may misread it as banning every robot. Is this true? I have a robots.txt file like the above in a lot of my directories. I read about it here: http://www.seoconsultants.com/robots-text-file/#not-recommended

So do you think that instead of using a robots.txt file with "User-agent: * / Disallow:" in it, I might as well just not have a robots.txt file at all? Could some robots/search engines think I don't want them to crawl my site because of it? Please advise me on this as I really want to know. Thanks!
Correct. If you aren't going to disallow anything, just use a blank robots.txt (to avoid the 404) or none at all. No need to risk anything, as you say.
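For reference, the allow-all form in question and the block-all form differ by a single character, which is probably why some parsers get them confused; a minimal sketch:

```
# Allows everything (the form being discussed):
User-agent: *
Disallow:

# Blocks everything -- the only difference is the "/":
User-agent: *
Disallow: /
```

An empty (zero-byte) robots.txt is also treated as allow-all and still avoids the 404.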
I have just looked through Yahoo and MSN and it looks as if they have not crawled my site properly, especially Yahoo. A lot of the information in Yahoo is old. So using a blank robots.txt file is OK then? I prefer to have one so that I don't get 404 errors all the time.
Consider using Google Sitemaps and Yahoo Feeds. Robots.txt is primarily to tell them what NOT to do. Sitemaps can guide them to where you want them.
Here I go again! I have a product search engine where my main aim is to sell products for merchants, not to give their sites PR. I use a click.php script so that I can keep track of the clicks and also redirect to the merchant's site.

I have found that this click.php file is being indexed in the search engines for every product, which could be bad: as soon as someone clicks the listing it redirects straight to the merchant's site, which may make it look like a doorway page or something. I am probably also leaking a lot of PR to these merchant sites, as the search engines crawl the links and pass PR to the merchants.

I was thinking that if I use a robots.txt file to stop the search engines from crawling the click.php file, then it will not be listed in the search engines. Will it also mean that I will not lose any PR, since the robots will not follow the links to the merchants' sites?
There's a difference between not being crawled and not showing up in the listings. Blocking something in robots.txt doesn't mean the search engine will deny the existence of that link; it just won't use its content.
Is there any way I can block the existence of that link, then, to stop them gaining PR? I need to stop them gaining PR because I am promoting their products; it is not meant for them to gain PR, but for me to sell their products. What about using rel=nofollow?
Yes, nofollow can block PR, but the links will still show up. What you can do is 'cloak' by referrer. click.php should only ever be accessed from your own site, so in PHP or another scripting language you can check whether the request came from your domain and, if not, show a 404 or 301 to the homepage. That form of cloaking is allowed because you are not discriminating between end users and SE bots, only by referrer.
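A sketch of that referrer check, with example.com standing in for your domain (shown in Python for illustration; in PHP the same test would run on $_SERVER['HTTP_REFERER']):

```python
from urllib.parse import urlparse

def referer_allowed(referer, allowed_host="example.com"):
    """Return True only if the Referer header points at our own domain.

    Bots and direct visits send no referrer, so they never reach the
    redirect and get the 404/301 instead.
    """
    if not referer:
        return False
    host = urlparse(referer).hostname or ""
    return host == allowed_host or host.endswith("." + allowed_host)
```

Note that hostname matching is done on the parsed URL, not with a substring search, so a foreign URL that merely mentions your domain in its query string does not pass.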
Do you think I could get banned from the search engines or penalized in any way for allowing the click.php script to get listed in Google? It is listed there for each product, and when someone clicks on the link it redirects straight to the merchant's site without ever going to mine.
Yes, I notice that a lot of sites no longer use, or don't even know to have, a robots.txt file, which racks up "unfair" 404s against your site. Many bots ignore the file but still register a 404 if they cannot locate one; the same is true for the favicon.ico file. Better to have one than not, IMO.
I doubt it. But the fact is, it's a useless link, so it's in their interest for it to be removed. It might be easier to control the situation if you put click.php in a subfolder and block that folder. But I'd go with the referrer checks. You can also add a token as a parameter with a simple script and only redirect valid tokens that are, say, under 60 seconds old; if a token is not valid, redirect to the homepage. Quite a few options for you.
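The token idea can be sketched like this (Python for illustration; SECRET and the function names are made up for the example): sign the product id together with a timestamp, and only honour signatures less than 60 seconds old.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"  # assumption: a private key known only to your server

def make_token(product_id, now=None):
    """Issue a click token: a timestamp plus an HMAC over id and timestamp."""
    ts = int(now if now is not None else time.time())
    sig = hmac.new(SECRET, f"{product_id}:{ts}".encode(), hashlib.sha256).hexdigest()
    return f"{ts}.{sig}"

def token_valid(product_id, token, max_age=60, now=None):
    """Redirect only if the signature matches and the token is fresh enough."""
    try:
        ts_str, sig = token.split(".", 1)
        ts = int(ts_str)
    except ValueError:
        return False
    expected = hmac.new(SECRET, f"{product_id}:{ts}".encode(), hashlib.sha256).hexdigest()
    age = (now if now is not None else time.time()) - ts
    return hmac.compare_digest(sig, expected) and 0 <= age <= max_age
```

Because the token is signed, a search engine that stores the crawled URL ends up with an expired link that just bounces to your homepage.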
The thing is, it actually ranks higher than some of my main pages, and I have got some sales this way just because Google referred the traffic to the click.php script and it redirected straight away to the merchant's site. I have just read that passing PR to these merchant sites doesn't make my site or pages lose any PR, so I don't have to worry about that. But I am still worried about all these click.php listings being in Google; I don't want to get banned.

So couldn't I just use the robots.txt file and have:

Disallow: /click.php

Wouldn't that stop them listing the click.php page, or does it have to be in a folder of its own with a disallow on that folder/directory?

Also, do you think it is necessary to put your include files into the robots.txt file, or isn't it really needed? The includes I am talking about are the PHP files for connecting to the database and so on.
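For what it's worth, a Disallow line only takes effect underneath a User-agent line, so the complete file would be:

```
User-agent: *
Disallow: /click.php
```

Disallow rules match by path prefix, so this also covers URLs with query strings such as /click.php?id=123.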
In my experience, just blocking that file will not get the pages out of the SERPs. It made my titles and snippets go away, but the links remained.
I have just gone and blocked that file on all my sites now. Let's just see how it goes. I doubt it will disappear, just as you said.
What happens if spiders don't obey the file? And how do you test whether your robots.txt is working properly?
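Only polite robots obey the file at all; misbehaving spiders have to be blocked server-side (by user-agent or IP). To test the rules themselves, one option is the robots.txt parser in Python's standard library; a sketch using the Disallow: /click.php rule discussed in this thread:

```python
from urllib.robotparser import RobotFileParser

# Feed the rules in directly instead of fetching them, so this runs offline.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /click.php",
])

# can_fetch(useragent, url) answers: "would a polite bot crawl this URL?"
print(rp.can_fetch("*", "http://example.com/click.php?id=5"))  # False
print(rp.can_fetch("*", "http://example.com/products.html"))   # True
```

To test against the live file, `rp.set_url("http://yoursite.com/robots.txt")` followed by `rp.read()` fetches and parses it the same way.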
This is not true. There is no need to change it: your robots.txt is perfect. It allows all robots and it will be understood by all polite robots. Jean-Luc
I prefer to just leave the robots.txt file blank. If a robot doesn't understand it properly, I could get de-indexed, which would mean less traffic, or none at all, from some search engines.