About the search engine robots.txt file and SEO!

Discussion in 'robots.txt' started by john269, Aug 12, 2006.

  1. #1
    Hi,

    I have just read that the following shouldn't be used:

    User-agent: *
    Disallow:

    It is a robots.txt file that allows everything, but apparently some search engines may misread it as banning every robot.

    Is this true? I have a robots.txt file like the one above in a lot of my directories.

    I read about it here: http://www.seoconsultants.com/robots-text-file/#not-recommended

    So do you think that instead of using a robots.txt file with

    User-agent: *
    Disallow:

    in it, I might as well not have a robots.txt file at all? Could some robots or search engines think I don't want them to crawl my site if I use

    User-agent: *
    Disallow:

    Please advise me on this as I really want to know.

    Thanks!
     
    john269, Aug 12, 2006 IP
  2. T0PS3O

    T0PS3O Feel Good PLC

    Messages:
    13,219
    Likes Received:
    777
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Correct. If you aren't going to disallow anything, just use a blank robots.txt (to avoid 404) or none at all. No need to risk anything as you say.
     
    T0PS3O, Aug 12, 2006 IP
  3. john269

    john269 Notable Member

    Messages:
    6,229
    Likes Received:
    116
    Best Answers:
    0
    Trophy Points:
    235
    #3
    I have just looked through Yahoo and MSN and it looks as if they have not crawled my site properly, especially Yahoo.

    It looks as if a lot of the information in Yahoo is old stuff.

    So using a blank robots.txt file is OK then. I prefer to have something there so that I don't get the 404 error all the time.
     
    john269, Aug 12, 2006 IP
  4. T0PS3O

    T0PS3O Feel Good PLC

    #4
    Consider using Google Sitemaps and Yahoo Feeds. Robots.txt is primarily to tell them what NOT to do. Sitemaps can guide them to where you want them.
     
    T0PS3O, Aug 12, 2006 IP
  5. john269

    john269 Notable Member

    #5
    Here I go again!

    I have a product search engine where my main aim is to sell products for merchants, not to give their sites PR.

    I use a click.php script so that I can keep track of the clicks and then redirect to the merchant's site. I have found that this click.php URL is being indexed in the search engines for every product, which could be bad: as soon as someone clicks the listing, it redirects straight to the merchant's site, which may make it look like a doorway page or something.

    Anyway, I am probably also leaking a lot of PR to these merchant sites, as the search engines are crawling it and passing PR to the merchants.

    I was thinking that if I use a robots.txt file to stop the search engines from crawling click.php, then it will not be listed in the search engines. Will it also mean that I will not lose any PR, as the robots will not follow the links to the merchant's site?
     
    john269, Aug 12, 2006 IP
  6. T0PS3O

    T0PS3O Feel Good PLC

    #6
    There's a difference between not being crawled and not showing up in the listings. Blocking something in robots.txt doesn't mean the SE will deny the existence of that link. It just won't use its content.
     
    T0PS3O, Aug 12, 2006 IP
  7. john269

    john269 Notable Member

    #7
    Is there any way I can block the existence of that link, then, to stop them gaining PR? I need to stop them gaining PR, as I am promoting their products and it is not meant for them to gain PR, but for me to sell their products.

    What about using rel=nofollow?
     
    john269, Aug 12, 2006 IP
  8. T0PS3O

    T0PS3O Feel Good PLC

    #8
    Yes, nofollow can block PR but the links will still show up.

    What you can do is 'cloak' by referrer. click.php should only be accessed from your site, so in PHP or another scripting language you can check whether the visitor came from your domain and, if not, return a 404 or a 301 to the homepage.

    That form of cloaking is allowed because you are not discriminating between end users and SE bots, only by referrer.
     
    T0PS3O, Aug 12, 2006 IP
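
The referrer check suggested in post #8 could be sketched roughly as follows. This is a Python illustration rather than the PHP the thread has in mind, and `OWN_HOST`, `HOMEPAGE`, and `decide_redirect` are hypothetical names that do not come from the thread.

```python
from urllib.parse import urlparse

# Hypothetical values for illustration; neither appears in the thread.
OWN_HOST = "www.example.com"
HOMEPAGE = "https://www.example.com/"


def decide_redirect(referer, merchant_url):
    """Return (status, location) for a click.php-style request.

    Clicks whose Referer header names our own host are sent on to the
    merchant. Anything else (search results, direct hits, a missing
    header) gets a permanent redirect to the homepage.
    """
    host = urlparse(referer or "").hostname
    if host == OWN_HOST:
        return 302, merchant_url
    return 301, HOMEPAGE
```

Browsers and proxies sometimes strip the Referer header, so under this scheme some genuine clicks from your own pages would also land on the homepage.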
  9. john269

    john269 Notable Member

    #9
    Do you think I could get banned from the search engine or penalized in any way for allowing the click.php script to get listed in Google? It is listed there for each product, and when someone clicks on the link it redirects straight to the merchant's site and doesn't go to mine.
     
    john269, Aug 12, 2006 IP
  10. wrmineo

    wrmineo Peon

    Messages:
    3,087
    Likes Received:
    379
    Best Answers:
    0
    Trophy Points:
    0
    #10
    Yes, I notice that a lot of sites no longer use a robots.txt file, or don't even know to have one, which will rack up "unfair" 404s against your site. Many bots ignore the file but still register a 404 if they cannot locate one. The same is true for the favicon file; better to have than not, IMO.
     
    wrmineo, Aug 12, 2006 IP
  11. T0PS3O

    T0PS3O Feel Good PLC

    #11
    I doubt it. But the fact is, it's a useless link, so it's to their benefit for it to be removed. It might be easier to control the situation in that regard if you put click.php in a subfolder and block that folder. But I'd go with the referrer checks. You can also add a token as a parameter with a simple script and only redirect for valid tokens that are, say, under 60 seconds old. If the token is not valid, redirect to the homepage.

    Quite a few options for you.
     
    T0PS3O, Aug 12, 2006 IP
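
One possible shape for the short-lived token idea in post #11 is a timestamp signed with a secret key. The sketch below is only an assumption about how such a token might be built: `SECRET_KEY`, `make_token`, and `is_valid_token` are hypothetical names, and the HMAC signing is a choice made here, not something the thread specifies.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical secret; not from the thread
MAX_AGE_SECONDS = 60       # the "under 60 seconds old" window from the post


def make_token(product_id, now=None):
    """Build a 'timestamp:signature' token for one product link."""
    ts = str(int(time.time() if now is None else now))
    msg = f"{product_id}:{ts}".encode()
    sig = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()
    return f"{ts}:{sig}"


def is_valid_token(product_id, token, now=None):
    """Return True only if the signature matches and the token is fresh."""
    try:
        ts, sig = token.split(":", 1)
        issued = int(ts)
    except ValueError:
        return False  # malformed token
    msg = f"{product_id}:{ts}".encode()
    expected = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()
    current = time.time() if now is None else now
    fresh = 0 <= current - issued < MAX_AGE_SECONDS
    # Compare as bytes so non-ASCII input cannot raise a TypeError.
    return fresh and hmac.compare_digest(sig.encode(), expected.encode())
```

The product page would embed `make_token(product_id)` in each click.php link, and the script would redirect to the merchant only when `is_valid_token` returns True. A link that sat in a search index for days would fail the freshness check and go to the homepage instead.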
  12. john269

    john269 Notable Member

    #12
    The thing is, it actually ranks higher than some of my main pages, and I have got some sales this way just because Google referred the traffic to the click.php script and it then redirected straight away to the merchant's site.

    I have just read that passing PR to these merchants' sites doesn't make me lose any of my own site's or pages' PR. So I don't have to worry about losing PR.

    But anyway, I am still worried about all these click.php listings being in Google. I don't want to get banned. So couldn't I just use the robots.txt file and have:

    Disallow: /click.php

    Wouldn't that just stop them listing the click.php page, or does it have to be in a folder of its own with a disallow on that folder/directory?

    Also, do you think it is necessary to put your include files into the robots.txt file, or isn't it really needed? The includes I mean are the PHP files for connecting to the database and things.
     
    john269, Aug 12, 2006 IP
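
Whether a file-level rule like the one in post #12 covers the click.php URLs can be checked offline with Python's standard `urllib.robotparser`. This is a minimal sketch that assumes the Disallow line sits under a `User-agent: *` group; the example.com URLs are placeholders. It only shows how one parser reads the rule, which affects crawling, not whether the URLs stay listed.

```python
from urllib.robotparser import RobotFileParser

# The rule from the post, placed under a User-agent group (an assumption).
rules = """\
User-agent: *
Disallow: /click.php
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Disallow matches by path prefix, so query-string variants are covered too.
click_ok = parser.can_fetch("*", "http://www.example.com/click.php?id=123")
products_ok = parser.can_fetch("*", "http://www.example.com/products.php")

# A Disallow line with no User-agent line above it is ignored by this parser.
bare = RobotFileParser()
bare.parse(["Disallow: /click.php"])
bare_ok = bare.can_fetch("*", "http://www.example.com/click.php?id=123")

print(click_ok, products_ok, bare_ok)
```

Because the match is a prefix match on the path, click.php does not need a folder of its own for the rule to apply to it.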
  13. T0PS3O

    T0PS3O Feel Good PLC

    #13
    In my experience, just blocking that file will not get them out of the SERPs. It made my title and snippets go away but the links remain.
     
    T0PS3O, Aug 12, 2006 IP
  14. john269

    john269 Notable Member

    #14
    I have just gone and blocked that file on all my sites now. Well, let's just see how it goes. I doubt the links will go away, just like you said.
     
    john269, Aug 12, 2006 IP
  15. MoneyElite.com

    MoneyElite.com Peon

    Messages:
    11
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #15
    What happens if spiders don't obey the file?

    And how do you test whether your robots.txt is working properly?
     
    MoneyElite.com, Aug 14, 2006 IP
  16. ewc21

    ewc21 Peon

    Messages:
    455
    Likes Received:
    18
    Best Answers:
    0
    Trophy Points:
    0
    #16
    You will encounter unexpected results.
     
    ewc21, Aug 15, 2006 IP
  17. MoneyElite.com

    MoneyElite.com Peon

    #17
    What are the unexpected results?

     
    MoneyElite.com, Aug 18, 2006 IP
  18. jdevalk

    jdevalk Active Member

    Messages:
    417
    Likes Received:
    25
    Best Answers:
    0
    Trophy Points:
    68
    #18
    Files will be indexed that you banned from indexing :)
     
    jdevalk, Aug 20, 2006 IP
  19. Jean-Luc

    Jean-Luc Peon

    Messages:
    601
    Likes Received:
    30
    Best Answers:
    0
    Trophy Points:
    0
    #19
    This is not true.

    There is no need to change it: your robots.txt is perfect. It allows all robots and it will be understood by all polite robots.

    Jean-Luc
     
    Jean-Luc, Aug 21, 2006 IP
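
The claim in post #19 can be checked against one standards-following parser, Python's built-in `urllib.robotparser`. The sketch below compares the empty `Disallow:` file from post #1 with a block-everything file and a blank file; the helper `allows` and the example.com URL are hypothetical. It says nothing about how any particular search engine's crawler behaves.

```python
from urllib.robotparser import RobotFileParser


def allows(robots_txt, url, agent="*"):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


ALLOW_ALL = "User-agent: *\nDisallow:\n"    # the file from post #1
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # what it might be mistaken for
BLANK = ""                                  # the blank-file alternative

url = "http://www.example.com/any/page.html"
print(allows(ALLOW_ALL, url))  # True: an empty Disallow permits everything
print(allows(BLOCK_ALL, url))  # False: "/" matches every path
print(allows(BLANK, url))      # True: no rules at all
```

For this parser, the empty `Disallow:` file and a blank file behave identically, and only the single `/` separates allowing everything from blocking everything.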
  20. john269

    john269 Notable Member

    #20
    I prefer to just leave the robots.txt file blank. If there are robots that aren't polite enough to understand it, I could get de-indexed, which would mean less traffic, or none, from some search engines.
     
    john269, Aug 21, 2006 IP