Checking robots.txt?

Discussion in 'robots.txt' started by Tuning, Jun 7, 2005.

  1. #1
    Hi Dp Folks,

    Is there any way I can check whether my robots.txt is working?

    I recently made changes to robots.txt and expect a particular result, but I'm not sure the current settings will work. :confused:

    Can anyone suggest an approach?

    Regards,
    Tuning
     
    Tuning, Jun 7, 2005 IP
  2. noppid

    noppid gunnin' for the quota

    Messages:
    4,246
    Likes Received:
    232
    Best Answers:
    0
    Trophy Points:
    135
    #2
    noppid, Jun 7, 2005 IP
  3. Tuning

    Tuning Well-Known Member

    Messages:
    1,005
    Likes Received:
    51
    Best Answers:
    0
    Trophy Points:
    138
    #3
    Thanks noppid,

    But that doesn't seem to be what I'm looking for. I want to see how search engines view my pages when they follow the robots.txt instructions.

    Do you know of any tools?
     
    Tuning, Jun 7, 2005 IP
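
One way to approximate this kind of check locally is Python's standard-library `urllib.robotparser`. The sketch below is only an illustration: the rules in `ROBOTS_TXT`, the `example.com` URLs, and the helper name `is_allowed` are hypothetical and not taken from this thread, and the parser may not interpret every rule the way a particular search engine does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules using plain path prefixes only (no wildcards).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules in ROBOTS_TXT let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/private/page.html",
                "http://example.com/index.html"):
        print(url, "allowed" if is_allowed(url) else "blocked")
```

To test a live file instead of an inline string, `RobotFileParser` also offers `set_url()` and `read()`, which fetch the file before `can_fetch()` is called.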
  4. noppid

    noppid gunnin' for the quota

    #4
    I don't understand exactly what you mean. What site is the file at? The source will tell.
     
    noppid, Jun 7, 2005 IP
  5. Tuning

    Tuning Well-Known Member

    #5
    This is the site :

    forums.matrixweb.org

    The pages got dropped from the Google index. It turned out that my robots.txt was wrong, so I updated it, but I'm unsure whether it will work. :confused:

    The problem is duplicate content: the same pages have 3 URLs.
    User-agent: *
    Disallow: /post-*.html$ 
    Disallow: /updates-topic.html*$ 
    Disallow: /stop-updates-topic.html*$ 
    Disallow: /ptopic*.html$ 
    Disallow: /ntopic*.html$
    Code (markup):
    Thanks,
    Tuning :)
     
    Tuning, Jun 8, 2005 IP
  6. noppid

    noppid gunnin' for the quota

    #6
    IIRC, you can't use wildcards in the paths. :)
     
    noppid, Jun 8, 2005 IP
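
The wildcard question can be made concrete with a small sketch. Under the original robots.txt convention a `Disallow` value is a plain path prefix, so `*` and `$` are just literal characters. Some crawlers document an extension in which `*` matches any run of characters and a trailing `$` anchors the rule to the end of the path. The code below compares the two readings for the rules posted in #5. The function names and the sample paths are hypothetical, and this is an assumption-laden illustration, not a description of how any specific search engine behaves.

```python
import re

# The rules from post #5.
RULES = [
    "/post-*.html$",
    "/updates-topic.html*$",
    "/stop-updates-topic.html*$",
    "/ptopic*.html$",
    "/ntopic*.html$",
]


def prefix_match(rule: str, path: str) -> bool:
    """Original-style matching: the rule is a literal path prefix."""
    return path.startswith(rule)


def wildcard_match(rule: str, path: str) -> bool:
    """Extension-style matching: '*' is a wildcard, trailing '$' anchors the end."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        pattern += r"\Z"
    # re.match anchors at the start of the path, like a prefix rule does.
    return re.match(pattern, path, flags=re.DOTALL) is not None


def blocked_by(path: str, matcher) -> list:
    """Return the rules that would block path under the given matcher."""
    return [rule for rule in RULES if matcher(rule, path)]


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rules.
    for path in ("/post-123.html", "/ptopic45.html", "/viewtopic.html"):
        print(path,
              "| prefix:", blocked_by(path, prefix_match),
              "| wildcard:", blocked_by(path, wildcard_match))
```

Read this way, a crawler that only does prefix matching would block none of the sample paths, while one that implements the wildcard extension would block the `post-` and `ptopic` pages. Which reading applies depends on the crawler.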
  7. Tuning

    Tuning Well-Known Member

    #7
    But noppid, this is the exact code I got from the able2know mod.
    # 
    #-----[ OPEN ]------------------------------------------ 
    #  
    
    robots.txt 
    
    Disallow: forums/post-*.html$ 
    Disallow: forums/updates-topic.html*$ 
    Disallow: forums/stop-updates-topic.html*$ 
    Disallow: forums/ptopic*.html$ 
    Disallow: forums/ntopic*.html$ 
    Code (markup):
    As far as I can tell (sorry for my n00bness :eek:), they built this mod for www.domain.com/forums/

    My forum is on a subdomain, so I removed the "forums" part.

    Regards,
    Tuning :)
     
    Tuning, Jun 8, 2005 IP
  8. noppid

    noppid gunnin' for the quota

    #8
    Big discussion at DP: http://forums.digitalpoint.com/showthread.php?t=6894

    I have no clue why they made it that way. Wildcards don't work in the path. There are many, many places to verify that, for example: http://www.aim-pro.com/helpfiles/robots-txt.html

    I dunno on that one.

    Also, depending on how your server does the redirect for the subdomain, the robots.txt file may not be found in the subdomain folder. Bots may be looking for it in the root folder. You can probably tell which is getting hit in the control panel to sort that out.
     
    noppid, Jun 8, 2005 IP
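
On the location point: a crawler asks for robots.txt at the root of whatever host it is crawling, so a subdomain is served its own file, and which folder on disk answers that request depends on the server setup. A minimal sketch of the URL a crawler would request, using only `urllib.parse`, is shown below. The helper name `robots_url` and the sample page paths are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL a crawler would request for page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; robots.txt always sits at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Hypothetical page URLs on the two layouts discussed in this thread.
    print(robots_url("http://forums.matrixweb.org/ptopic12.html"))
    print(robots_url("http://www.domain.com/forums/post-1.html"))
```

Requesting the first URL in a browser, or looking for hits on it in the server logs, shows which file the subdomain actually serves. It also shows why rules for a forum under www.domain.com/forums/ need the folder in their paths while rules for a forum on its own subdomain do not.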
  9. Tuning

    Tuning Well-Known Member

    #9
    Thanks noppid. That's great info. I will check the cPanel and see what is in there.

    Thanks for the help. :)
     
    Tuning, Jun 8, 2005 IP
  10. noppid

    noppid gunnin' for the quota

    #10
    Glad to help. I learned a little too. It's not like I knew all that without some research. :D
     
    noppid, Jun 8, 2005 IP