Could you please suggest how I can get rid of a 404 error whenever a spider bot comes to this site? As far as I know, I already have a robot.txt, but I cannot seem to make any headway. The robot.txt section in the forum says a lot about creating the file, but not much on whether I should change its permissions or something else. I am constantly getting errors similar to this:

Date: 03-21-2006[00:19:56]
Robot request for: http://www.linguagymnastics.com/robots.txt was not found!
IP address: 72.30.97.225
Browser: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)
Referred by: n/a
Error Code: 404

Thanks to all who read up to this point. I really appreciate your know-how and comments.
You have an auto-redirect to http://www.linguagymnastics.com/ and then to http://www.linguagymnastics.com/blog/. I guess this is your problem. Check your .htaccess.
Hi. No, the problem won't be gone. I wrote about the need for websites to use robots.txt in order to hopefully rank well in the search engines: http://www.seochat.com/c/a/Search-Engine-Optimization-Help/Write-a-Robotstxt-File/ The 404 is the robots hitting your site and asking for your robots.txt file. Since they don't find one, they probably end up leaving. Install one and all pages are usually indexed quickly and fully. Hope this helps.
He has one, but due to a redirect it's not being found. You might be able to place it under yourdomain.com/blog, or otherwise a rule can be written to redirect everything except requests for robots.txt. Post your .htaccess and maybe I or someone else can help.
It's easy to solve. Rename your file to robots.txt. Currently it is called robot.txt -- missing the letter 's' -- should be plural, not singular.
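For anyone with shell or script access to the web root, the rename can be sketched in Python. This is only an illustration: the function name `fix_robots_filename` and the list of likely misspellings are made up for this example, and the web root path depends on your host.

```python
from pathlib import Path

# Hypothetical list of misspellings to look for; extend it as needed.
LIKELY_MISSPELLINGS = ("robot.txt", "robots.text", "robot.text")


def fix_robots_filename(web_root):
    """Rename a misspelled robots file in web_root to robots.txt.

    Returns the path of the correctly named file, or None if neither
    robots.txt nor a known misspelling exists. An existing robots.txt
    is left alone and never overwritten.
    """
    root = Path(web_root)
    target = root / "robots.txt"
    if target.exists():
        return target
    for name in LIKELY_MISSPELLINGS:
        candidate = root / name
        if candidate.exists():
            candidate.rename(target)
            return target
    return None
```

Calling `fix_robots_filename("/path/to/public_html")` (the path is a placeholder) would turn a stray robot.txt into robots.txt, which is the name crawlers actually request.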
The .htaccess at root level is blank now. The .htaccess at /blog level only contains code for my WordPress permalinks.
Ahhh, McFox is right: http://www.linguagymnastics.com/robot.txt Your file name is misspelled. Add an "s" after "robot".
Check your logs, Minstrel... I see robots requesting the file. If it's not there, it is a 404 error, and some bots will leave. Also, I think you missed a word in my quote.
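If you want to check the logs yourself, here is a rough Python sketch that counts 404 responses for /robots.txt per user agent. It assumes the server writes Apache "combined" format access logs, which may not match your host; the pattern `LOG_LINE` and the function `robots_txt_misses` are names invented for this example. It only shows which bots asked for the file and got a 404. It cannot tell you whether a bot left afterwards.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format. Adjust for other formats.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "[^"]*" "(?P<agent>[^"]*)")?'
)


def robots_txt_misses(lines):
    """Count 404 responses for /robots.txt, grouped by user agent."""
    misses = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not look like access log entries
        if match.group("path") == "/robots.txt" and match.group("status") == "404":
            misses[match.group("agent") or "unknown"] += 1
    return misses
```

Typical use would be `robots_txt_misses(open("access.log"))`, where the file name is a placeholder for wherever your host keeps its logs.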
minstrel is right... robots.txt is not required and won't help ranking. It's just a set of guidelines for crawlers about what not to check (and what to check). Some respect it, others don't. They won't leave if one doesn't exist. They might leave depending on what's in robots.txt and/or the robots meta tags.
Now, if only you were every robot, you would be a good source to dispute such... but as you're not, I believe your thinking is somewhat flawed. Research the subject some. You will be surprised what you learn when you look past what you think you know. Also, I would recommend researching data mining scripts and not SEO-related issues. Thanks for the input...
Since I have a hard time getting my points across for whatever reason, I direct you to the Google engineer whom it seems everyone will believe. You can read the rest at this link, as I don't need to repost the whole thing here: http://www.mattcutts.com/blog/new-robotstxt-tool/ If the directives do not match what the spider was programmed for, then it will most certainly leave. Spider bots are very intense, and when they hammer on your server they can make it crash... do not underestimate their abilities.
You have totally misunderstood what the comments you have cited are saying, Sem-Advance. Nowhere in there does it say that bots will leave, or even probably leave, if you don't have a robots.txt file. What they are saying is that if you mess up your robots.txt file, you may create a problem for spiders on your site. In other words, a bad robots.txt file is a problem; NO robots.txt file is not, unless you have files or directories you do not want indexed. Let me help by rewording your comment:
Dear Minstrel, why would you reword my comment? I stand by what I post. I type perfectly fine, as you can see, since this post is following yours. How would you like it if I reworded comments you make that you feel are correct and then posted them around the internet? I doubt you would, so show me the same courtesy! Next, I cited one source, not all that I have read. You have cited none. Do me a favor and look in your log file. Tell me how many robots crawl your site. Any idea why more do not? Now, for those of you who have websites listed on only one or two of the three majors and do not have a robots.txt file: install one and your site will soon show on all three (barring any spam or coding issues on your pages).
Sem-Advance, I suggest you start checking some of the major sites indexed by google, msn, yahoo or any other SE and look for a robots.txt. Many don't have one. robots.txt is just a suggestion, not a standard. For example, www.cnn.com is in all those search engines and more ... but lo-and-behold no robots.txt. It's by no means essential at all, it can be helpful, especially to ensure folders and pages that you don't want indexing aren't. And if it's written incorrectly it might stop robots indexing pages that you do want indexing. However, a lack of robots.txt doesn't matter whatsoever.
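The difference between no rules and bad rules can be tried out with Python's standard `urllib.robotparser` module. This is just a sketch: the helper name `allowed`, the three sample rule sets, and the example.com URLs are made up for illustration, and an empty file is used as a stand-in for a missing one. Real crawlers may apply their own logic.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt_text, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt_text.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative rule sets, not taken from any real site.
NO_RULES = ""  # stand-in for a missing or empty robots.txt
BLOCK_PRIVATE = "User-agent: *\nDisallow: /private/\n"
BLOCK_EVERYTHING = "User-agent: *\nDisallow: /\n"  # an easy mistake to make

if __name__ == "__main__":
    page = "http://www.example.com/blog/post.html"
    for label, rules in [
        ("no rules", NO_RULES),
        ("block /private/", BLOCK_PRIVATE),
        ("block everything", BLOCK_EVERYTHING),
    ]:
        print(label, "->", allowed(rules, "Slurp", page))
```

With no rules at all the parser permits every URL, while a single stray `Disallow: /` shuts a compliant crawler out of the whole site, which is the "written incorrectly" case described above.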
More places to study:
http://www.robotstxt.org/wc/faq.html
http://www.robotstxt.org/wc/eval.html
http://www.robotstxt.org/wc/threat-or-treat.html
http://www.w3.org/robots.txt