Hi DP folks, is there any way I can check that my robots.txt is working? I recently made changes to my robots.txt and am expecting a particular result, but I'm not sure the current settings will work. Can anyone suggest an approach? Regards, Tuning
Thanks noppid, but it seems that is not what I'm looking for. I want to see how search engines view my pages when they follow the robots.txt instructions. Do you know of any tools?
This is the site: forums.matrixweb.org
The pages got dropped from the Google index. It turned out that my robots.txt was wrong, so I updated it, but I'm unsure whether it will work. The problem is duplicate content: the same pages have 3 URLs.

User-agent: *
Disallow: /post-*.html$
Disallow: /updates-topic.html*$
Disallow: /stop-updates-topic.html*$
Disallow: /ptopic*.html$
Disallow: /ntopic*.html$

Thanks, Tuning
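For anyone wanting to try rules like these offline, here is a rough Python sketch. It assumes the wildcard semantics that some crawlers document as an extension ('*' matches any run of characters, a trailing '$' anchors the end of the URL); it is not any search engine's actual matcher, and the sample paths are made up for illustration.

```python
import re

# Rules copied from the post above.
RULES = [
    "/post-*.html$",
    "/updates-topic.html*$",
    "/stop-updates-topic.html*$",
    "/ptopic*.html$",
    "/ntopic*.html$",
]

def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a Disallow pattern with '*' and a trailing '$' into a regex."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape each literal piece, then let every '*' match any run of characters.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))

def is_blocked(path: str, rules=RULES) -> bool:
    """True if any Disallow pattern matches the URL path (plus query string)."""
    return any(rule_to_regex(rule).match(path) for rule in rules)

# Hypothetical paths, for illustration only.
for path in ["/post-123.html", "/ptopic45.html", "/index.php"]:
    print(path, "blocked" if is_blocked(path) else "allowed")
```

A bot that does not understand wildcards would read the same rules as literal text, so a result from this sketch only says something about wildcard-aware crawlers.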
But noppid, this was the exact code I got from the able2know mod:

#
#-----[ OPEN ]------------------------------------------
#
robots.txt

Disallow: forums/post-*.html$
Disallow: forums/updates-topic.html*$
Disallow: forums/stop-updates-topic.html*$
Disallow: forums/ptopic*.html$
Disallow: forums/ntopic*.html$

As far as I can understand (sorry for my n00bness), they built this mod for www.domain.com/forums/. My forum is on a subdomain, hence I removed the "forums" part. Regards, Tuning
Big discussion at DP: http://forums.digitalpoint.com/showthread.php?t=6894
I have no clue why they made it that way. Wildcards don't work in the path under the original robots.txt standard. There are many, many places to verify that: http://www.aim-pro.com/helpfiles/robots-txt.html I dunno on that one.
Also, depending on how your server handles the redirect for the subdomain, the robots.txt file may not be found in the subdomain folder; bots may be looking for it in the root folder. You can probably tell which one is getting hit from the control panel, which should sort that out.
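Both points above can be sketched in a few lines of Python. This is only an illustration under two assumptions: that a bot following the original convention treats each Disallow value as a literal path prefix, and that bots request robots.txt from the root of whatever host serves the page. The function names and the sample page path are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

def prefix_blocked(path: str, disallow: str) -> bool:
    """Literal prefix matching; an empty Disallow value blocks nothing."""
    return bool(disallow) and path.startswith(disallow)

# The file a bot would ask for when crawling the subdomain.
print(robots_txt_url("http://forums.matrixweb.org/ptopic45.html"))

# With literal matching, '*' and '$' are ordinary characters, so this rule misses.
print(prefix_blocked("/ptopic45.html", "/ptopic*.html$"))

# A plain prefix covers the same pages for such a bot.
print(prefix_blocked("/ptopic45.html", "/ptopic"))
```

Requesting the first printed URL in a browser, and then checking the access log for it, is one way to see which file the subdomain actually serves.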
Thanks noppid. That's great info. I will check cPanel and see what is in there. Thanks for the help.