Hello. Is a robots.txt file important, or is it an outdated idea because search engines may ignore it? Would it increase the site's visibility to search engines, or would it make no difference?
You should have one. Some bots won't crawl without a robots.txt, and it's worth it anyway because stats programs (like AWStats) identify some search bots by their hits on the robots file. Just upload a blank one; it's good practice.
I can verify that the robots.txt file is extremely important. It's the first thing most search engine bots look for. If they don't find it, they leave and come back later to crawl the site... maybe. I recently made the mistake of failing to upload the robots.txt file to one of my larger sites during a redesign. On top of that, our web hosting company had a server configuration issue where unknown pages were not returning the 404 status header. So the bot looks for robots.txt, it's not there, and it ONLY assumes it can continue crawling if it gets the 404 status header. If not, it comes back later. In my case, the robot kept coming back to look for the file, never found it, never got a 404 either, and never crawled my site. I first dropped from MSN and had no idea why. Then I dropped from Google, and it wasn't until I looked at this more carefully that I noticed Googlebot and MSNbot had not been crawling the site, and I discovered and fixed the 404 issue. After the issues were resolved, I saw both bots in my logs, and within a few days I was re-indexed.
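To make the status-code point above concrete, here is a minimal Python sketch. It assumes the behaviour described in RFC 9309 (a 4xx answer means "no rules, crawl freely", a 5xx answer means "assume everything is disallowed for now"); individual crawlers may differ in the details. The function names are made up for illustration, and the post does not say which status the misconfigured server actually sent.

```python
import urllib.error
import urllib.request


def fetch_robots_status(base_url):
    """Return the HTTP status code the server sends for <base_url>/robots.txt.

    Connection failures (urllib.error.URLError) are not handled here and
    will propagate to the caller.
    """
    url = base_url.rstrip("/") + "/robots.txt"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions; keep the code.
        return err.code


def crawler_reaction(status_code):
    """Rough mapping from robots.txt status to crawler behaviour (per RFC 9309)."""
    if 200 <= status_code < 300:
        return "parse rules"
    if 300 <= status_code < 400:
        return "follow redirect"
    if 400 <= status_code < 500:
        # The file is "unavailable": the crawler may access anything.
        return "crawl without restrictions"
    if 500 <= status_code < 600:
        # The file is "unreachable": the crawler must assume a full disallow.
        return "assume disallowed, retry later"
    return "undefined"
```

Calling `crawler_reaction(fetch_robots_status("https://example.com"))` against your own domain (example.com is a placeholder) shows which case your server falls into when the file is missing.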
I agree with the above. If you don't want to limit access, at least upload a blank one. Also, without the blank file you will see a lot of 404 errors in the logs when the bots look for your robots.txt file. Good luck.
Thanks for the info, I've had 766 hits of 404 this month alone! So if there was nothing I did not want indexed I would just place the following code in the robots file? Is there anything else you put in there, e.g. to block email collection bots?
Your double negatives make it hard to answer your question, but... if you want to give full access, place an empty file.
Like people say, it's better to have one that is well configured, but this doesn't mean robots won't index your site without that file.
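As a quick check of the "empty file gives full access" advice, the sketch below uses Python's standard `urllib.robotparser`. The bot name and URL are placeholders, and the parser only models how a well-behaved crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# An empty robots.txt has no lines at all.
empty_rules = RobotFileParser()
empty_rules.parse([])

# The explicit "allow everything" form: an empty Disallow value blocks nothing.
explicit_rules = RobotFileParser()
explicit_rules.parse(["User-agent: *", "Disallow:"])

page = "https://example.com/any/page.html"  # placeholder URL
empty_allows = empty_rules.can_fetch("AnyBot", page)
explicit_allows = explicit_rules.can_fetch("AnyBot", page)
print(empty_allows, explicit_allows)
```

Both forms leave every URL open to every bot, so the choice between them is a matter of taste.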
Great points on the 404 issue! Also, if you have folders that you use for your own purposes only, or as a staging ground before you launch and move things up to the root directory, you're likely to get even more 404 errors. I recently experienced this first hand. Besides excluding folders you don't want accessed, you can also exclude some known "spam" bots or those not important to you.
Is anyone else having a problem with those bloody ghost bots crawling the site and eating up bandwidth? I've got some entries in my log like: Unknown robot (identified by 'spider'): 4070 pages crawled, 106 MB crawled. Unknown robot (identified by 'crawl'): 1806 pages crawled, 50 MB crawled. Does anyone have any idea about that?
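A sketch of what excluding a folder and a specific bot could look like, checked with Python's standard `urllib.robotparser`. The folder name `/staging/` and the bot name `BadBot` are invented for illustration, and only bots that choose to honour robots.txt will follow these rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "BadBot" and "/staging/" are made-up names.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /staging/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# BadBot is shut out of the whole site.
badbot_home = parser.can_fetch("BadBot", "https://example.com/index.html")
# Every other bot may crawl normal pages but not the staging folder.
other_home = parser.can_fetch("Googlebot", "https://example.com/index.html")
other_staging = parser.can_fetch("Googlebot", "https://example.com/staging/new.html")
print(badbot_home, other_home, other_staging)
```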
A robots.txt is good if you want to limit the bots and what they index. I always disallow my images so the bots don't waste time indexing them all. Another good use for a robots file is to specify where your sitemap is located. If you don't want to limit the bots, you don't really need a robots.txt.
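The image and sitemap ideas above might look like the following, again read back with Python's `urllib.robotparser`. The `/images/` path and the sitemap URL are assumptions for the example, and `site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the path and sitemap URL are placeholders.
ROBOTS_WITH_SITEMAP = """\
User-agent: *
Disallow: /images/

Sitemap: https://example.com/sitemap.xml
"""

rules = RobotFileParser()
rules.parse(ROBOTS_WITH_SITEMAP.splitlines())

# Images are off limits, ordinary pages are not.
image_allowed = rules.can_fetch("*", "https://example.com/images/logo.png")
page_allowed = rules.can_fetch("*", "https://example.com/about.html")
# Sitemap lines are collected separately from the crawl rules.
sitemaps = rules.site_maps()
print(image_allowed, page_allowed, sitemaps)
```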
You should still have a robots.txt file. It will help the bots to spider your site easily without having to work harder. Some bots also will just leave your site if they do not see a robots.txt file. That's the first thing they look for.
The robots.txt file is the first file that a crawler visits. It gives crawlers an idea of which pages should be indexed and which should be ignored.
The jury is still out. My take, based on some experiments I did: it tells search engines (mostly Google) what YOU want or don't want included in their index, but it does not actually block them from accessing the folders. Use .htaccess if you want to protect folders, and mask your plugins in WP.
It is not outdated for SEO purposes. It is still a useful way to tell search engine robots which files to crawl and which not to.