Please review my robots.txt code

Discussion in 'robots.txt' started by chadhaajay, Oct 10, 2013.

  1. #1
    I have the following code in my robots.txt file:

    User-agent: *
    Disallow: /index.php/*.html$
    Allow: /

    I added the wildcard character to the "Disallow" rule to block certain links on our site. Will it block links of the following type?

    http://www.example.com/index.php/images/article-737.html
    http://www.example.com/index.php/category-9.html
     
    chadhaajay, Oct 10, 2013 IP
  2. NadaBolt

    #2
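    The wildcard syntax you used in "Disallow" is an extension to the original robots.txt standard. Googlebot supports it, as do some other major crawlers such as Bingbot, but not every search engine robot does. So if you are going to use "*.html$" style patterns, put them in a group for a crawler known to support them, for example User-agent: Googlebot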
     
    NadaBolt, Nov 15, 2013 IP
  3. makeit easy

    #3
    Yes, it should work. Your syntax is correct.
    The "Allow" line is not needed and is better removed.
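
    To check the pattern against the two example links, here is a minimal Python sketch of wildcard matching as Google documents it: "*" matches any run of characters and a trailing "$" anchors the end of the URL. The helper names pattern_to_regex and is_blocked are made up for this illustration and are not part of any library. Crawlers that lack wildcard support will behave differently.

```python
import re
from urllib.parse import urlsplit


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a wildcard robots.txt path pattern into a compiled regex.

    Assumption: "*" matches any sequence of characters and a trailing "$"
    anchors the pattern to the end of the URL (Google-style matching).
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_blocked(url: str, disallow_pattern: str) -> bool:
    """Return True if the URL's path (plus query string) matches the pattern."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match anchors at the start, as robots.txt rules match from the
    # beginning of the path.
    return pattern_to_regex(disallow_pattern).match(target) is not None


rule = "/index.php/*.html$"
for url in (
    "http://www.example.com/index.php/images/article-737.html",
    "http://www.example.com/index.php/category-9.html",
    "http://www.example.com/index.php/category-9.html?page=2",
    "http://www.example.com/about.html",
):
    print(url, "blocked" if is_blocked(url, rule) else "allowed")
```

    Under these assumptions both example links are blocked. The same page requested with a query string (the hypothetical ?page=2 above) is not, because "$" requires the URL to end in ".html".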

    However, if possible, I would prefer adding "noindex" meta tags to those pages or redirecting them.
    I think blocking robots should be the last resort.
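
    If you go the "noindex" route, a rough way to confirm a page really carries the tag is sketched below using Python's standard html.parser module. The names RobotsMetaParser and has_noindex are invented for this example. Keep in mind that a crawler has to be able to fetch the page to see the tag, so the page should not also be disallowed in robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, follow".
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```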
     
    makeit easy, Nov 17, 2013 IP