Have robot.txt, but always 404

Discussion in 'robots.txt' started by j3r0m3, Mar 23, 2006.

  1. #1
    Could you please suggest how I can get rid of a 404 error whenever a spider bot comes to this site? As far as I know, I already have a robot.txt, but I cannot seem to make any headway. The robots.txt section of the forum says a lot about creating the file, but not much about whether I should change its permissions or do something else.

    I am constantly getting errors similar to this:

    Date: 03-21-2006[00:19:56]
    Robot request for: http://www.linguagymnastics.com/robots.txt was not found!
    IP address: 72.30.97.225
    Browser: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
    Referred by: n/a

    Error Code: 404

    Thanks to all who read up to this point, really appreciate your know-how and comments
     
    j3r0m3, Mar 23, 2006 IP
  2. ryan_uk

    ryan_uk Illustrious Member

    Messages:
    3,983
    Likes Received:
    1,022
    Best Answers:
    33
    Trophy Points:
    465
    #2
    ryan_uk, Mar 23, 2006 IP
  3. j3r0m3

    j3r0m3 Peon

    Messages:
    161
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #3
    So after clearing the .htaccess, my problem should be gone?
     
    j3r0m3, Mar 23, 2006 IP
  4. Sem-Advance

    Sem-Advance Notable Member

    Messages:
    6,179
    Likes Received:
    296
    Best Answers:
    0
    Trophy Points:
    230
    #4
    Hi

    No, the problem won't be gone.

    I wrote about the need for websites to use robots.txt in order to hopefully rank well in the search engines.

    http://www.seochat.com/c/a/Search-Engine-Optimization-Help/Write-a-Robotstxt-File/

    The 404 comes from robots hitting your site and asking for your robots.txt file. Since they don't find one, they probably end up leaving.

    Install one and all pages are usually indexed quickly and fully.

    Hope this helps
     
    Sem-Advance, Mar 23, 2006 IP
  5. ryan_uk

    #5
    He has one, but due to a redirect it's not being found. You might be able to place it under yourdomain.com/blog; otherwise, a rule can be written to redirect everything except requests for robots.txt. Post your .htaccess and maybe I or someone else can help.
     
    ryan_uk, Mar 23, 2006 IP
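The idea in post #5 of redirecting everything except the request for robots.txt can be sketched as follows. The thread never shows the actual .htaccess, so this is only an illustration of the decision logic in Python, not an Apache rule; the names `EXEMPT_PATHS` and `redirect_target` and the `/blog` prefix are assumptions made for the example.

```python
from typing import Optional

# Hypothetical: paths that must be served from the site root and never redirected.
EXEMPT_PATHS = {"/robots.txt"}


def redirect_target(path: str, prefix: str = "/blog") -> Optional[str]:
    """Return the path to redirect to, or None if the request is served as-is."""
    if path in EXEMPT_PATHS:
        return None  # let crawlers fetch robots.txt directly from the root
    if path == prefix or path.startswith(prefix + "/"):
        return None  # already under the blog, nothing to redirect
    return prefix + path


print(redirect_target("/robots.txt"))
print(redirect_target("/about"))
```

With an exemption like this in place, a crawler asking for /robots.txt gets the file itself, while other root-level requests are still sent on to the blog.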
  6. mcfox

    mcfox Wind Maker

    Messages:
    7,526
    Likes Received:
    716
    Best Answers:
    0
    Trophy Points:
    360
    #6
    It's easy to solve. Rename your file to robots.txt. Currently it is called robot.txt -- missing the letter 's' -- should be plural, not singular.
     
    mcfox, Mar 23, 2006 IP
    minstrel and ryan_uk like this.
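To illustrate post #6: crawlers ask for one fixed path, /robots.txt at the root of the host, so a file saved as robot.txt is never found. The sketch below builds that address from any page URL and checks a file name; the helper names `robots_url` and `is_valid_robots_name` are made up for the example, and the page URL under /blog is a placeholder.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Build the address a crawler requests: /robots.txt at the root of the host."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def is_valid_robots_name(filename: str) -> bool:
    """The file must be named exactly robots.txt (plural, all lowercase)."""
    return filename == "robots.txt"


print(robots_url("http://www.linguagymnastics.com/blog/some-post"))
print(is_valid_robots_name("robot.txt"))
```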
  7. j3r0m3

    #7
    The .htaccess at root level is blank now.

    The .htaccess at /blog level only contains code for my WordPress permalinks.
     
    j3r0m3, Mar 23, 2006 IP
  8. Sem-Advance

    #8
    Sem-Advance, Mar 23, 2006 IP
  9. j3r0m3

    #9
    thank you all
     
    j3r0m3, Mar 23, 2006 IP
  10. minstrel

    minstrel Illustrious Member

    Messages:
    15,082
    Likes Received:
    1,243
    Best Answers:
    0
    Trophy Points:
    480
    #10
    No, they won't. That's simply not correct.

    (Didn't anyone mention that at seochat?)
     
    minstrel, Apr 1, 2006 IP
  11. Sem-Advance

    #11
    Check your logs, minstrel. I see robots requesting the file. If it's not there, it is a 404 error, and some bots will leave.

    Also I think you missed a word in my quote.

     
    Sem-Advance, Apr 2, 2006 IP
  12. ryan_uk

    #12
    minstrel is right...

    robots.txt is not required and won't help ranking. It's just a set of guidelines telling crawlers what not to check (and what to check). Some respect it, others don't. They won't leave if one doesn't exist. They might leave depending on what's in robots.txt and/or robots meta tags.
     
    ryan_uk, Apr 2, 2006 IP
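As a small illustration of post #12, here is how a well-behaved crawler might read such guidelines, using Python's standard `urllib.robotparser`. The rules, the host www.example.com, and the agent name ExampleBot are placeholders, not anything taken from the site in this thread, and real search engine bots may interpret rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: keep crawlers out of one directory only.
rules = """\
User-agent: *
Disallow: /private/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# A crawler that respects the file checks each URL before fetching it.
blog_allowed = parser.can_fetch("ExampleBot", "http://www.example.com/blog/")
private_allowed = parser.can_fetch("ExampleBot", "http://www.example.com/private/notes.html")

print(blog_allowed, private_allowed)
```

Nothing forces a crawler to run a check like this, which is why the file is only a guideline.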
  13. Sem-Advance

    #13
    If you were every robot, you would be a good source to dispute this, but as you're not, I believe your thinking is somewhat flawed.

    Research the subject some.

    You will be surprised what you learn when you look past what you think you know.

    Also, I would recommend researching data mining scripts rather than SEO-related issues.

    Thanks for the input....
     
    Sem-Advance, Apr 2, 2006 IP
  14. Sem-Advance

    #14
    Since I have a hard time getting my points across for whatever reason..

    I direct you to the Google engineer whom, it seems, everyone will believe.

    You can finish reading the rest at this link as I don't need to repost the whole thing here

    http://www.mattcutts.com/blog/new-robotstxt-tool/

    If the directives do not match what the spider was programmed for... then it will most certainly leave.

    Spider bots are very intense and when they hammer on your server they can make it crash...do not underestimate their abilities.
     
    Sem-Advance, Apr 2, 2006 IP
  15. minstrel

    #15
    You have totally misunderstood what the comments you have cited are saying, Sem-Advance.

    Nowhere in there does it say that a bot will leave, or even probably leave, if you don't have a robots.txt file. What they are saying is that if you mess up your robots.txt file, you may create a problem for spiders on your site.

    In other words, a bad robots.txt file is a problem; NO robots.txt file is not - unless you have files or directories you do not want indexed.

    Let me help by rewording your comment:

     
    minstrel, Apr 2, 2006 IP
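The point in post #15 that a missing robots.txt is not a problem can be seen in at least one real client: Python's standard `urllib.robotparser` treats a 404 for robots.txt as "everything is allowed". The sketch below simulates the 404 with a mock so it runs without network access; the host www.example.com and the agent name ExampleBot are placeholders, and this shows only this parser's behaviour, not that of any particular search engine.

```python
import io
from unittest import mock
from urllib.error import HTTPError
from urllib.robotparser import RobotFileParser

ROBOTS_URL = "http://www.example.com/robots.txt"  # placeholder host


def fake_urlopen(url, *args, **kwargs):
    # Simulate a server that has no robots.txt, without touching the network.
    raise HTTPError(url, 404, "Not Found", None, io.BytesIO(b""))


parser = RobotFileParser(ROBOTS_URL)
with mock.patch("urllib.request.urlopen", fake_urlopen):
    parser.read()  # the 404 is handled inside read() and recorded as "allow all"

allowed_without_file = parser.can_fetch("ExampleBot", "http://www.example.com/any/page.html")
print(allowed_without_file)
```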
  16. Sem-Advance

    #16
    Dear minstrel,

    Why would you reword my comment? :mad:

    I stick by what I post. I type perfectly fine, as you can see, since this post is following yours.

    How would you like it if I reworded comments that you feel are correct and then posted them around the internet?

    I doubt you would, so show me the same courtesy!

    Next, I cited one source, not all that I have read. You have cited none.

    Do me a favor and look in your log file. Tell me, how many robots crawl your site? Any idea why more do not?

    Now, for those of you who have websites listed on only one or two of the three majors and do not have a robots.txt file: install one and your site will soon show on all three (barring any spam or coding issues on your pages).
     
    Sem-Advance, Apr 2, 2006 IP
  17. EGS

    EGS Notable Member

    Messages:
    6,078
    Likes Received:
    438
    Best Answers:
    0
    Trophy Points:
    290
    #17
    Your file is saved as robot.txt ... you need to rename to robots.txt
    See if that is the problem! :D
     
    EGS, Apr 2, 2006 IP
  18. minstrel

    #18
    Sem-Advance, you are completely and utterly wrong about this issue. I suggest you give it up.
     
    minstrel, Apr 2, 2006 IP
    ryan_uk likes this.
  19. ryan_uk

    #19
    Sem-Advance, I suggest you start checking some of the major sites indexed by Google, MSN, Yahoo or any other SE and look for a robots.txt. Many don't have one. robots.txt is just a suggestion, not a standard. For example, www.cnn.com is in all those search engines and more, but lo and behold, no robots.txt. It's by no means essential, though it can be helpful, especially to ensure that folders and pages you don't want indexed aren't. And if it's written incorrectly, it might stop robots indexing pages that you do want indexed. However, a lack of robots.txt doesn't matter whatsoever.
     
    ryan_uk, Apr 2, 2006 IP
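To make the "written incorrectly" case from post #19 concrete, the sketch below compares two robots.txt files that differ by a single character, again using Python's standard `urllib.robotparser`. The helper name `allowed`, the page URL, and the agent name ExampleBot are placeholders for the example.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, url, agent="ExampleBot"):
    """Parse the given robots.txt lines and report whether the agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


page = "http://www.example.com/articles/index.html"

# "Disallow: /" shuts compliant crawlers out of the whole site.
blocks_everything = ["User-agent: *", "Disallow: /"]
# "Disallow:" with an empty value disallows nothing.
blocks_nothing = ["User-agent: *", "Disallow:"]

print(allowed(blocks_everything, page))
print(allowed(blocks_nothing, page))
```

A stray slash like the one in the first file is the kind of mistake that can keep wanted pages out of an index, whereas having no file at all imposes no restrictions.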
  20. Sem-Advance
