Ideal robots.txt file for a WordPress website


akhilesh243
Nov 19th 2007, 9:49 am
User-Agent: *
Disallow: /license.txt
Disallow: /readme.html
Disallow: /wp-admin.php
Disallow: /wp-atom.php
Disallow: /wp-blog-header.php
Disallow: /wp-comments-popup.php
Disallow: /wp-commentsrss2.php
Disallow: /wp-comments-post.php
Disallow: /wp-config-sample.php
Disallow: /wp-config.php
Disallow: /wp-cron.php
Disallow: /wp-feed.php
Disallow: /wp-links-opml.php
Disallow: /wp-login.php
Disallow: /wp-mail.php
Disallow: /wp-pass.php
Disallow: /wp-rdf.php
Disallow: /wp-register.php
Disallow: /wp-rss.php
Disallow: /wp-rss2.php
Disallow: /wp-settings.php
Disallow: /wp-trackback.php
Disallow: /xmlrpc.php
Sitemap: http://www.yoursitename.com/sitemap.xml
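
Plain path rules like the ones above can be checked before uploading the file. The following is a minimal sketch using Python's standard urllib.robotparser on a three-rule subset of the list; the post URL is a made-up example, not a real page.

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules above; each Disallow is a literal path prefix.
rules = """\
User-agent: *
Disallow: /readme.html
Disallow: /wp-login.php
Disallow: /wp-config.php
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

base = "http://www.yoursitename.com"
# The login script is listed, so a compliant robot should skip it.
login_allowed = parser.can_fetch("*", base + "/wp-login.php")
# A hypothetical post URL matches no rule, so it stays crawlable.
post_allowed = parser.can_fetch("*", base + "/2007/11/hello-world/")
print(login_allowed, post_allowed)
```

Only robots that choose to honour robots.txt are affected by this; the check says nothing about what a badly behaved crawler will do.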

Yosser
Nov 21st 2007, 3:23 pm
I read somewhere that the ideal robots.txt for WordPress was:

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /category
Disallow: /tag
Disallow: /author
Disallow: /trackback
Disallow: /*trackback
Disallow: /*trackback*
Disallow: /*/trackback
Disallow: /*?*
Disallow: /*.html/$
Disallow: /*feed*

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /*


# Google AdSense
User-agent: Mediapartners-Google*
Disallow:
Allow: /*

Sitemap: http://www.yoursite.com/sitemap.xml



akhilesh243
Nov 21st 2007, 7:40 pm
Yes, this is better than the one I wrote earlier.

Ladadadada
Nov 23rd 2007, 12:34 am
The robots.txt standard does not include stars other than in the "User-agent: *" line. Many robots will parse and honour stars but many also will not.
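
To make the difference concrete, here is a rough sketch of the two interpretations a robot might apply to the same rule. Both functions are hypothetical helpers written for illustration, not any particular crawler's code: prefix_match treats the rule as a literal path prefix, while wildcard_match assumes the common extension where "*" matches any run of characters and a trailing "$" anchors the end of the URL.

```python
import re


def prefix_match(rule: str, path: str) -> bool:
    """Literal matching: the rule must be a prefix of the path."""
    return path.startswith(rule)


def wildcard_match(rule: str, path: str) -> bool:
    """Extended matching: '*' is a wildcard, a trailing '$' anchors the end."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each star.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        pattern += "$"
    return re.match(pattern, path) is not None


path = "/2007/11/some-post/feed/"  # made-up example URL path
feed_literal = prefix_match("/*feed*", path)
feed_wildcard = wildcard_match("/*feed*", path)
plain_literal = prefix_match("/trackback", "/trackback/")
plain_wildcard = wildcard_match("/trackback", "/trackback/")
print(feed_literal, feed_wildcard, plain_literal, plain_wildcard)
```

A robot that only does literal matching never sees "/*feed*" as blocking the feed URL, whereas a plain rule such as "/trackback" behaves the same under both interpretations.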

Why do you want to block so many robots anyway? Most of the robots I get on my site belong to search engines. Every time they request a page from my site, they include it in their index, so my pages can end up as the result of a search.

I block certain sections that don't make sense for robots to visit or that cause robots to get stuck. (I once had Googlebot download 2GB in a month because I had a dynamically generated link that it could follow forever.) Apart from that, I allow robots unfettered access to my site.

I can control their behaviour at a finer level by using the robots meta tag and specifying noindex or nofollow depending on what I want them to do. There are also some tags that can let robots know which sections of an individual page should not be indexed but these are not part of a standard yet and hence every search engine supports different tags.
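
For illustration, a robots meta tag looks like `<meta name="robots" content="noindex, follow">` in the page's head. The sketch below shows roughly how a robot could read those directives, using Python's standard html.parser; the RobotsMetaParser class and the sample page are invented for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated and case-insensitive.
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


# Hypothetical page that asks robots not to index it but to follow its links.
page = (
    "<html><head>"
    '<meta name="robots" content="noindex, follow">'
    "</head><body>Archive page</body></html>"
)

meta = RobotsMetaParser()
meta.feed(page)
print(sorted(meta.directives))
```

Unlike a robots.txt rule, the page still has to be fetched for the robot to see the tag, so this controls indexing rather than crawling.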

akhilesh243
Nov 23rd 2007, 1:38 am
I don't want my WordPress installation files to get indexed. It actually dilutes the bots' power, so I always try to concentrate on content pages.

knorbulyon
Dec 1st 2007, 2:49 am
Thank you very much for the robots.txt. I don't have a robots.txt yet.

Kuldeep1952
Dec 3rd 2007, 10:36 pm
knorbulyon - it is good to have a robots.txt. Even if you do not disallow any files, having a robots.txt will reduce 404 errors, since all bots will look for it in any case.
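
A robots.txt that blocks nothing needs only two lines: "User-agent: *" followed by an empty "Disallow:". As a small sketch, Python's standard urllib.robotparser can confirm that such a file leaves everything crawlable; the URL below is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# An empty Disallow value means "nothing is disallowed".
allow_all_rules = [
    "User-agent: *",
    "Disallow:",
]

open_parser = RobotFileParser()
open_parser.parse(allow_all_rules)

# Placeholder URL; any path should be fetchable under these rules.
any_page_allowed = open_parser.can_fetch("*", "http://www.yoursite.com/any/page/")
print(any_page_allowed)
```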

rustyb
Dec 10th 2007, 2:25 pm
I've been looking for something like this. Thank you.

arunsubru
May 12th 2008, 4:37 am
The robots.txt format you have provided for a WordPress website is quite informative. I have been looking for a robots.txt file to use on my WordPress blog on Kerala real estate, www.keralarealpro.com. Thanks again.