This is how my robots.txt is set up. Can anyone tell me whether it is okay or whether I need to modify a few things? Here is my robots.txt:

User-agent: *
Disallow: /wp-content/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/
Disallow: /cgi-bin/

User-agent: Googlebot
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.cgi$
Disallow: /*.xhtml$
Disallow: /*.php*
Disallow: */trackback*
Disallow: /*?*
Disallow: /z/
Disallow: /wp-*
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.txt$

If I have made any mistake, please correct me, experts, and I will rectify it. For the past few days I have not been getting as much traffic as I did in the previous weeks.

Regards
You can verify it in Google's Webmaster Tools. As far as I know, though, wildcards are NOT allowed in robots.txt files. Someone correct me if I'm wrong... I'd be elated if it were valid!
Not quite sure I follow what you're asking here, but to clarify, if you go into Google's Webmaster Tools, there is a place where you can submit your robots.txt and it will analyze it and inform you of any problems.
Not quite sure? I never had a robots.txt file; then somewhere in some forums I saw that it can protect some directories, and that's why I created the robots.txt file. What's your advice: should the robots.txt be there or not?
Just realize that: a) the wildcard is "nonstandard" in robots.txt (not all robots will use it as you intend), and b) not all robots even honor robots.txt, so it's no guarantee that robots won't crawl the places excluded by the file.
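To make the "nonstandard" point concrete, here is a minimal illustration (just a sketch, not a recommendation for your file): a crawler such as Googlebot, which supports the * and $ extensions, reads the rule below as "block any URL ending in .php", while a crawler that only follows the original robots.txt standard treats the value as a literal path prefix and the rule effectively matches nothing.

User-agent: Googlebot
# Googlebot: pattern match, blocks /index.php, /dir/page.php, etc.
# Standard-only crawler: literal prefix "/*.php$", which matches no real URL.
Disallow: /*.php$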
No man... see, I want all the robots, crawlers, and spiders of all search engines to go completely through my site. Is it good to have the robots.txt file? If not, please tell me the good and the bad, and I will remove it.
You indicated above that you wanted to "protect" some directories. That's not the same as wanting all search engines to go completely through your site, though. If you don't want anyone in certain areas, a robots.txt is not the way to do it, since not all robots are well-behaved (and listing hidden or not-allowed places in there just tells them where those places are). The best thing to do is to put restrictions (logins etc.) on those areas.
Can you please tell me a general procedure for blocking robots/spiders/crawlers from entering certain folders, etc.? Hope you get it... I want exactly that.
It's pretty much impossible to deny all robots, since you don't know what a robot is (not you specifically; I mean there is no comprehensive list of IPs that defines robots), so you would have to password-protect the folders with .htaccess.
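As a rough sketch of what that looks like on Apache (assuming your host allows .htaccess overrides, and using hypothetical paths you would need to adjust): put something like this in an .htaccess file inside the folder you want to protect, and keep the password file outside the web root if you can.

AuthType Basic
AuthName "Restricted area"
# Hypothetical path to the password file; adjust to your own account.
AuthUserFile /home/youraccount/.htpasswd
Require valid-user

You would then create the password file with Apache's htpasswd tool, e.g. htpasswd -c /home/youraccount/.htpasswd someuser (the -c flag creates the file the first time).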
I've heard of cases where the password file used by that kind of .htaccess protection (the file which holds the passwords) has been hacked or reverse-MD5'd, posing a security risk if that password happens to be important.