I am trying to create a robots.txt file customized for WordPress blogs. Here is what I have found across the net:

User-agent: *
Disallow: /wp-
Disallow: /uploads/
Disallow: /feed/
Disallow: /comments/feed
Disallow: /feed/$

Does anyone familiar with these files know if that code is right, and if I should include something else? Some people told me that Googlebot ignores the "User-agent: *" line. Is that true?
Yes, the code you posted is correct, and it is probably the best policy for saving bandwidth by keeping robots away from files of your blog you don't want fetched. It is not true that Googlebot ignores "User-agent: *". But if you are still unsure, I would suggest adding the same block again below it, replacing * with Googlebot.
Not true, but it is a bit complicated:

1. Googlebot only ignores "User-agent: *" when there is a "User-agent: Googlebot" group. This is compliant with the robots.txt standard.
2. You should not use * or $ within the "Disallow:" directives that follow "User-agent: *". These directives should not include proprietary syntax. If you want to use Google's proprietary syntax, then you need to put it under "User-agent: Googlebot".
3. In any case, "Disallow: /feed/$" is not necessary, as it is already covered by "Disallow: /feed/".

Jean-Luc
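To make point 1 concrete, here is a minimal sketch of what that duplication looks like (the Disallow paths are just placeholders taken from earlier in this thread):

User-agent: *
Disallow: /wp-
Disallow: /feed/

# Googlebot picks the most specific matching group, so it reads
# only the rules below and ignores the "User-agent: *" group above
User-agent: Googlebot
Disallow: /wp-
Disallow: /feed/

In other words, once a Googlebot group exists, anything you want to keep Googlebot out of has to be repeated inside that group.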
Try this:

User-agent: *
Disallow: */feed*
Disallow: */wp-admin
Disallow: */wp-content
Disallow: */wp-includes
Disallow: *wp-login.php

Good luck
I checked the specifications. You are concerned that "Disallow: */wp-admin" might not work because of the * in front of it?
Yep. It does not work with most robots, while "Disallow: /blog_directory/wp-admin/" works with all robots.

Jean-Luc
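For example, assuming the blog is installed in a subdirectory called /blog/ (a placeholder for your actual install directory), the portable equivalents of the wildcard rules above would be plain prefixes that every robots.txt-reading crawler understands:

User-agent: *
Disallow: /blog/wp-admin/
Disallow: /blog/wp-content/
Disallow: /blog/wp-includes/
Disallow: /blog/wp-login.php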
Alright, sticking with my old file then. Here is what it looks like so far:

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-content/
Disallow: /wp-includes/
Disallow: /feed
Disallow: /comments
You could simplify the "wp-..." part, like this:

User-agent: *
Disallow: /wp-

Regarding "/feed/" and "/comments", there are two potential problems:
- not all feeds are in the root directory of your blog
- I do not know what feed readers do when they land on a blog whose robots.txt disallows access to the feeds.

For these reasons, I do not disallow the feeds on my blogs.

Jean-Luc
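For reference, here is a sketch of what that single prefix covers on a standard WordPress install in the root directory, with one caveat about the uploads folder:

User-agent: *
# "/wp-" is a plain prefix, so it matches /wp-admin/, /wp-content/,
# /wp-includes/, /wp-login.php, /wp-comments-post.php, and so on
Disallow: /wp-
# it also blocks /wp-content/uploads/; "Allow:" is not part of the
# original robots.txt standard, but major bots such as Googlebot
# honor it if you want to open the uploads directory back up
Allow: /wp-content/uploads/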
Here is my robots.txt for my WordPress 2.1 blog. Check out the detailed article about using robots.txt on WordPress for SEO.

User-agent: *
# disallow files in /cgi-bin
Disallow: /cgi-bin/
Disallow: /comments/
Disallow: /z/j/
Disallow: /z/c/
# disallow all files ending in .php, .js, .inc, .css, or .txt
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.txt$
# disallow all files in /wp- directories
Disallow: /wp-*/
# disallow all files with ? in url
Disallow: /*?
Disallow: /stats*
Disallow: /dh_
Disallow: /about/legal-notice/
Disallow: /about/copyright-policy/
Disallow: /about/terms-and-conditions/
Disallow: /about/feed/
Disallow: /about/trackback/
Disallow: /contact/
Disallow: /tag
Disallow: /docs*
Disallow: /manual*
Disallow: /category/uncategorized*
Sorry for waking up an old post, but since there is already a thread here about what I want to ask, I think it's better to continue it than open a new one. On this page for WordPress I found this text:

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /feed
Disallow: /comments
Disallow: /category/*/*
Disallow: */trackback
Disallow: */feed
Disallow: */comments
Disallow: /*?*
Disallow: /*?
Allow: /wp-content/uploads

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /*

# Google AdSense
User-agent: Mediapartners-Google*
Disallow:
Allow: /*

# Internet Archiver Wayback Machine
User-agent: ia_archiver
Disallow: /

# digg mirror
User-agent: duggmirror
Disallow: /

Is this too much, or is it a good one?
Please tell me how to stop search engines from crawling duplicate posts on my blog, as duplicate content is not good and Google may penalize the blog. Could someone give me a good example robots.txt? I am using the %postname% permalink structure on my blog.
Check out: Robots.txt NoFollow and NoIndex Secrets in Matt Cutts Interview. Use my robots.txt as an example. If you want proof that it works, try doing a site:www.askapache.com search on Google. Good luck!
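On the duplicate-content question: with %postname% permalinks, the duplicates usually come from feed, trackback, and query-string URLs. Here is a minimal sketch built only from rules already posted in this thread (the * and ? wildcards are Google-specific syntax, so they go under "User-agent: Googlebot"; standard parsers would ignore them):

User-agent: *
Disallow: /feed
Disallow: /trackback
Disallow: /comments

# Googlebot reads only this group, so the plain rules are repeated here
User-agent: Googlebot
Disallow: /feed
Disallow: /trackback
Disallow: /comments
# wildcard forms catch nested feed/trackback copies and query strings
Disallow: */feed
Disallow: */trackback
Disallow: /*?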