OK, here is what happened. The IT guys in my company had a bad experience with robots.txt before: someone uploaded a bad file by mistake, and all of a sudden every listing disappeared from the Google SERPs. Since then, they have decided to block robots.txt completely. So if I upload a robots.txt file into the root directory of any site on that server, the server simply ignores it. It's been blocked!

When you type "http://www.verrado.com/robots.txt" into a browser, it automatically redirects to "http://www.verrado.com/sitemap.php". That happens because I implemented a 404 redirect in the .htaccess file: ErrorDocument 404 "http://www.verrado.com/sitemap.php". Instead of creating a custom 404 page, I just forwarded the regular 404 page to the site's sitemap.

Now, one of the IT guys says it's not a good idea to do that, because search engine robots may get confused. When they visit a site, the first thing they do is look for robots.txt. In this case they don't find it and are forwarded to the sitemap. The robots might mistake the sitemap for robots.txt, decide that this "robots.txt" is too heavy, and refuse to crawl the site any further.

When I checked the HTTP headers of the sitemap here: "http://www.delorie.com/web/headers.cgi?url=http%3A%2F%2Fwww.verrado.com%2Fsitemap.php", it showed no 404 error; instead it showed 200 OK. In my opinion the crawlers will still crawl the site with my .htaccess tweak, but the IT guys want me to remove ErrorDocument 404 "http://www.verrado.com/sitemap.php" from .htaccess, and they refuse to enable the robots.txt file on the server. This is nonsense to me, because I know how beneficial that feature is for keeping visitors on the site.

Why do you need a custom 404 page? Simply put, to increase the number of visitors to your website. Everybody wants more people to visit their site. According to web trends of popular websites, on average about 7% of visits to any given site end on a 404 "not found" error page. If you can lure back most of the visitors who hit your 404 page, you're increasing your traffic.

Please advise. Thank you.
Firstly, you should get some new IT guys. They stuffed up and then "fixed" it with something stupid. How hard is it to make a robots.txt file? Next, decide whether you actually need a robots.txt file at all: what are you using it for?
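For the record, writing one is trivial. A minimal robots.txt that blocks nothing (this is just the standard robots.txt syntax, nothing specific to your server) is two lines:

User-agent: *
Disallow:

An empty Disallow line means no URL is off-limits, which behaves the same as having no robots.txt at all, except that crawlers get a clean 200 instead of whatever your server is doing now.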
Even if there is no robots.txt file on the server, search engine robots still request it first when they visit any site. The robot discovers that robots.txt is not there, but instead of getting a 404 error, the request redirects to sitemap.php, which returns 200 OK. My IT guys are concerned about that. Basically, they are telling me the robots may get confused by that 200 OK: a robot will treat sitemap.php as the robots.txt file, and when it begins to parse it, it will decide the file is too big and leave the site without crawling the rest of the pages.
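To make that concrete, here is roughly what I understand a crawler sees when it asks for robots.txt under my current setup (a sketch of the exchange, not an exact capture):

GET /robots.txt HTTP/1.1
Host: www.verrado.com

HTTP/1.1 302 Found
Location: http://www.verrado.com/sitemap.php

GET /sitemap.php HTTP/1.1
Host: www.verrado.com

HTTP/1.1 200 OK
Content-Type: text/html

The header checker follows the redirect, which would explain why it reports 200 OK for the final page: the robots.txt request itself never returns a 404.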
Well, that's not a question, that's a statement. You are explaining what happens when search engines look for your robots.txt, so I don't know what you want answered. If you are asking whether it's OK to redirect your robots.txt to the sitemap, the answer is no.
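If you want to keep sending lost visitors to the sitemap, the usual fix (a sketch, assuming sitemap.php sits at your document root) is to give ErrorDocument a local path instead of a full URL:

# .htaccess sketch: a local path makes Apache serve the page itself
# and keep the 404 status code, whereas a full "http://..." URL makes
# Apache issue a client-side redirect to it instead.
ErrorDocument 404 /sitemap.php

That way human visitors still land on the sitemap content, but a crawler requesting a missing robots.txt gets a genuine 404 and simply assumes there are no crawl restrictions.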
Your IT guys are wrong. As Mad4 said, get some new ones. The server is *not* returning a 404 status code for the robots.txt request; it's a 302 (a temporary redirect, and absolutely the wrong thing to do here). I would be concerned that other things are, or could be, wrong with that server. If you use Firefox, install this extension, which lets you view all of the HTTP headers yourself: http://livehttpheaders.mozdev.org/ Then check http://www.verrado.com/robots.txt and watch what comes back. Good luck. -jay
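P.S. If you have shell access anywhere, curl can show the same thing without a browser (standard curl flags, nothing site-specific assumed):

# -I sends a HEAD request and prints only the response headers;
# curl does not follow the redirect unless you also pass -L.
curl -I http://www.verrado.com/robots.txt

If the first line of the output says HTTP/1.1 302 Found rather than HTTP/1.1 404 Not Found, the redirect is still in place.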