Ideal robots.txt file for a WordPress website

Discussion in 'robots.txt' started by akhilesh243, Nov 19, 2007.

  1. #1
    User-Agent: *
    Disallow: /license.txt
    Disallow: /readme.html
    Disallow: /wp-admin.php
    Disallow: /wp-atom.php
    Disallow: /wp-blog-header.php
    Disallow: /wp-comments-popup.php
    Disallow: /wp-commentsrss2.php
    Disallow: /wp-comments-post.php
    Disallow: /wp-config-sample.php
    Disallow: /wp-config.php
    Disallow: /wp-cron.php
    Disallow: /wp-feed.php
    Disallow: /wp-links-opml.php
    Disallow: /wp-login.php
    Disallow: /wp-mail.php
    Disallow: /wp-pass.php
    Disallow: /wp-rdf.php
    Disallow: /wp-register.php
    Disallow: /wp-rss.php
    Disallow: /wp-rss2.php
    Disallow: /wp-settings.php
    Disallow: /wp-trackback.php
    Disallow: /xmlrpc.php
    Sitemap: http://www.yoursitename.com/sitemap.xml
     
    akhilesh243, Nov 19, 2007 IP
  2. Yosser
    #2
    I read somewhere that the ideal robots.txt for WordPress was:

    User-agent: *
    Disallow: /cgi-bin
    Disallow: /wp-admin
    Disallow: /wp-includes
    Disallow: /wp-content/plugins
    Disallow: /wp-content/cache
    Disallow: /wp-content/themes
    Disallow: /category
    Disallow: /tag
    Disallow: /author
    Disallow: /trackback
    Disallow: /*trackback
    Disallow: /*trackback*
    Disallow: /*/trackback
    Disallow: /*?*
    Disallow: /*.html/$
    Disallow: /*feed*

    # Google Image
    User-agent: Googlebot-Image
    Disallow:
    Allow: /*


    # Google AdSense
    User-agent: Mediapartners-Google*
    Disallow:
    Allow: /*

    Sitemap: http://www.yoursite.com/sitemap.xml


     
    Yosser, Nov 21, 2007 IP
  3. akhilesh243
    #3
    Yes, this is better than the one I wrote earlier.
     
    akhilesh243, Nov 21, 2007 IP
  4. Ladadadada
    #4
    The original robots.txt standard does not define stars (wildcards) anywhere other than in the "User-agent: *" line. Many robots will parse and honour stars in paths, but many will not.
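To make the difference concrete, here is a minimal Python sketch of the two matching behaviours. The function names `prefix_match` and `wildcard_match` are made up for illustration, and the wildcard rules used here ('*' matches any run of characters, a trailing '$' anchors the end of the path) are an assumption modelled on the extension the larger search engines describe, not the code of any particular crawler.

```python
import re


def prefix_match(rule: str, path: str) -> bool:
    """Original convention: the rule is a literal path prefix, '*' has no special meaning."""
    return path.startswith(rule)


def wildcard_match(rule: str, path: str) -> bool:
    """Extended convention (assumed): '*' matches any characters, trailing '$' anchors the end."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape every literal piece and join the pieces with '.*' where a star stood.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        pattern += "$"
    return re.match(pattern, path) is not None


# A robot that only does prefix matching never sees "/*feed*" as a feed rule.
literal_hit = prefix_match("/*feed*", "/blog/feed/")      # False
wildcard_hit = wildcard_match("/*feed*", "/blog/feed/")   # True

# A plain rule without stars behaves the same under both conventions.
plain_literal = prefix_match("/wp-admin", "/wp-admin/index.php")    # True
plain_wildcard = wildcard_match("/wp-admin", "/wp-admin/index.php")  # True

print(literal_hit, wildcard_hit, plain_literal, plain_wildcard)
```

In other words, rules such as "Disallow: /wp-admin" are safe for every robot, while a line like "Disallow: /*feed*" only does anything for robots that implement the extension.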

    Why do you want to block so many robots anyway? Most of the robots I get on my site belong to search engines. Every time they request a page from my site, they include it in their index, so my pages can end up as the result of a search.

    I block certain sections that don't make sense for robots to visit or that cause robots to get stuck (I once had GoogleBot download 2GB in a month because I had a dynamically generated link that it could follow forever), but apart from that I allow robots unfettered access to my site.

    I can control their behaviour at a finer level by using the robots meta tag and specifying noindex or nofollow depending on what I want them to do. There are also some tags that can let robots know which sections of an individual page should not be indexed but these are not part of a standard yet and hence every search engine supports different tags.
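As a rough illustration of the robots meta tag, the Python sketch below reads the directives from a page the way a well-behaved robot might. The class name `RobotsMetaParser`, the helper `robots_directives`, and the sample page are all hypothetical; real crawlers have their own parsing and may recognise additional values.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the comma-separated values of <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for value in content.split(","):
                value = value.strip().lower()
                if value:
                    self.directives.add(value)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Sample page (made up): keep it out of the index, but let robots follow its links.
page = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body>Archive page</body></html>"
)
directives = robots_directives(page)
print(sorted(directives))  # ['follow', 'noindex']
```

Unlike a Disallow line, the page still has to be fetched for the robot to see the tag, so this controls indexing rather than crawling.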
     
    Ladadadada, Nov 22, 2007 IP
  5. akhilesh243
    #5
    I don't want the installed WordPress files to get indexed. It actually dilutes the bots' power, so I always try to concentrate them on the content pages.
     
    akhilesh243, Nov 23, 2007 IP
  6. knorbulyon
    #6
    Thank you very much for the robots.txt. I don't have a robots.txt yet.
     
    knorbulyon, Dec 1, 2007 IP
  7. Kuldeep1952
    #7
    knorbulyon - it is good to have a robots.txt. Even if you do not disallow any files, having one will reduce 404 errors, since bots will look for it in any case.
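For anyone who just wants such a file to exist, a minimal "allow everything" robots.txt is two lines. The Python sketch below checks that reading with the standard library's `urllib.robotparser`; the URL is a made-up path on the placeholder domain used earlier in this thread, and other robots may of course use their own parsers.

```python
from urllib.robotparser import RobotFileParser

# Minimal file: applies to every robot, and the empty Disallow blocks nothing.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ALLOW_ALL.splitlines())

# Hypothetical URL, used only to ask the parser a question.
allowed = parser.can_fetch("*", "http://www.yoursitename.com/2007/11/some-post/")
print(allowed)  # True
```

Saving those two lines as robots.txt in the site root is enough to answer the robots' request without restricting anything.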
     
    Kuldeep1952, Dec 3, 2007 IP
  8. rustyb
    #8
    I've been looking for something like this. Thank you.
     
    rustyb, Dec 10, 2007 IP
  9. arunsubru
    #9
    The robots.txt format you have provided for a WordPress website is quite informative. I have been looking for a robots.txt file to add to my WordPress blog on Kerala real estate, www.keralarealpro.com. Thanks again.
     
    arunsubru, May 12, 2008 IP