Ideal robots.txt file for a WordPress website

Discussion in 'robots.txt' started by akhilesh243, Nov 19, 2007.

  1. #1
    User-Agent: *
    Disallow: /license.txt
    Disallow: /readme.html
    Disallow: /wp-admin.php
    Disallow: /wp-atom.php
    Disallow: /wp-blog-header.php
    Disallow: /wp-comments-popup.php
    Disallow: /wp-commentsrss2.php
    Disallow: /wp-comments-post.php
    Disallow: /wp-config-sample.php
    Disallow: /wp-config.php
    Disallow: /wp-cron.php
    Disallow: /wp-feed.php
    Disallow: /wp-links-opml.php
    Disallow: /wp-login.php
    Disallow: /wp-mail.php
    Disallow: /wp-pass.php
    Disallow: /wp-rdf.php
    Disallow: /wp-register.php
    Disallow: /wp-rss.php
    Disallow: /wp-rss2.php
    Disallow: /wp-settings.php
    Disallow: /wp-trackback.php
    Disallow: /xmlrpc.php
    Sitemap: http://www.yoursitename.com/sitemap.xml
    akhilesh243, Nov 19, 2007 IP
  2. Yosser

    Yosser Active Member

    #2
    I read somewhere that the ideal robots.txt for WordPress was:

    User-agent: *
    Disallow: /cgi-bin
    Disallow: /wp-admin
    Disallow: /wp-includes
    Disallow: /wp-content/plugins
    Disallow: /wp-content/cache
    Disallow: /wp-content/themes
    Disallow: /category
    Disallow: /tag
    Disallow: /author
    Disallow: /trackback
    Disallow: /*trackback
    Disallow: /*trackback*
    Disallow: /*/trackback
    Disallow: /*?*
    Disallow: /*.html/$
    Disallow: /*feed*

    # Google Image
    User-agent: Googlebot-Image
    Disallow:
    Allow: /*


    # Google AdSense
    User-agent: Mediapartners-Google*
    Disallow:
    Allow: /*

    Sitemap: http://www.yoursite.com/sitemap.xml


    #
     
    Yosser, Nov 21, 2007 IP
  3. akhilesh243

    akhilesh243 Active Member

    #3
    Yes, this is better than the one I wrote earlier.
     
    akhilesh243, Nov 21, 2007 IP
  4. Ladadadada

    Ladadadada Peon

    #4
    The robots.txt standard does not include stars other than in the "User-agent: *" line. Many robots will parse and honour stars but many also will not.
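
A minimal sketch of the difference, assuming two hypothetical helper functions (`prefix_match` and `wildcard_match`, neither of which belongs to any real crawler): the first treats a Disallow value as a plain path prefix, as the original convention describes, and the second treats `*` as "any run of characters" and a trailing `$` as an end anchor, which is how some crawlers extend it.

```python
import re


def prefix_match(rule: str, path: str) -> bool:
    """Plain prefix rule: the path is blocked if it starts with the rule text."""
    return path.startswith(rule)


def wildcard_match(rule: str, path: str) -> bool:
    """Extended rule: '*' matches any characters, a trailing '$' anchors the end."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' where the stars were.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        pattern += "$"
    return re.match(pattern, path) is not None


if __name__ == "__main__":
    for rule, path in [
        ("/wp-admin", "/wp-admin/index.php"),
        ("/*feed*", "/2007/11/post/feed/"),
        ("/*?*", "/index.php?p=1"),
    ]:
        print(rule, path, prefix_match(rule, path), wildcard_match(rule, path))
```

Under these assumptions a rule such as `Disallow: /wp-admin` behaves the same for both kinds of robot, while `Disallow: /*feed*` only blocks anything for a robot that understands stars.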

    Why do you want to block so many robots anyway? Most of the robots I get on my site belong to search engines. Every time they request a page from my site they include it in their index so my pages can end up as the result of a search.

    I block certain sections that don't make sense for robots to visit or that cause robots to get stuck (I once had Googlebot download 2GB in a month because I had a dynamically generated link that it could follow forever), but apart from that I allow robots unfettered access to my site.

    I can control their behaviour at a finer level by using the robots meta tag and specifying noindex or nofollow depending on what I want them to do. There are also some tags that tell robots which sections of an individual page should not be indexed, but these are not part of a standard yet, so every search engine supports different tags.
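
As a rough illustration of the robots meta tag, here is a sketch that reads the directives a page declares, using Python's standard `html.parser`. The class name `RobotsMetaParser`, the function `robots_directives` and the sample page are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def robots_directives(html: str) -> set:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page that asks not to be indexed but lets its links be followed.
page = '<html><head><meta name="robots" content="noindex, follow"></head><body></body></html>'

if __name__ == "__main__":
    print(sorted(robots_directives(page)))
```

Unlike robots.txt, this works per page, and a robot has to fetch the page before it can see the tag.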
     
    Ladadadada, Nov 22, 2007 IP
  5. akhilesh243

    akhilesh243 Active Member

    #5
    I don't want my installed WordPress files to get indexed. It actually dilutes the bots' power, so I always try to concentrate on content pages.
     
    akhilesh243, Nov 23, 2007 IP
  6. knorbulyon

    knorbulyon Peon

    #6
    Thank you very much for the robots.txt. I don't have a robots.txt yet.
     
    knorbulyon, Dec 1, 2007 IP
  7. Kuldeep1952

    Kuldeep1952 Active Member

    #7
    knorbulyon - it is good to have a robots.txt. Even if you do not disallow any files, having a robots.txt will reduce 404 errors, since bots will request it in any case.
     
    Kuldeep1952, Dec 3, 2007 IP
  8. rustyb

    rustyb Peon

    #8
    I've been looking for something like this. Thank you.
     
    rustyb, Dec 10, 2007 IP
  9. arunsubru

    arunsubru Peon

    #9
    The robots.txt format you have provided for a WordPress website is quite informative. I have been looking for a robots.txt file to add to my WordPress blog on Kerala real estate, www.keralarealpro.com. Thanks again.
     
    arunsubru, May 12, 2008 IP