Robots.txt Question

Discussion in 'Search Engine Optimization' started by jooles, Aug 7, 2007.

  1. #1
    Hello

    On my site, I have a printable version of each article at:

    mysite.com/articlename/print

    To keep search engines from crawling these print views (and avoid a duplicate content penalty), my robots.txt file looks like this:

    User-agent: *
    Disallow:/*/print/

    Also, I'm thinking of adding my admin directory to my robots.txt file, and I have a question. Couldn't people who want to hack your site find all the important directories just by looking at the robots.txt file? Should I add my admin directory to the robots.txt file or not?
     
    jooles, Aug 7, 2007
  2. Silver89

    #2
    If you add

    IndexIgnore *

    to your .htaccess, then people won't see the files in the directory listings.
     
    Silver89, Aug 7, 2007
  3. jooles

    #3
    I just add that one line?

    Another question: how can I see the files myself if I add that line?

    Thanks!
     
    jooles, Aug 7, 2007
  4. Silver89

    #4
    You will still see the files over FTP because you're logged in and have privileges. However, someone without them who tries to browse the directory won't (or at least shouldn't) see anything.
     
    Silver89, Aug 7, 2007
  5. trichnosis

    #5
    I don't think people can use your print URLs to hack your site.

    Google and the other big search engines suggest using robots.txt.
     
    trichnosis, Aug 8, 2007
  6. jooles

    #6
    Can you confirm that my robots.txt actually blocks my print pages?
     
    jooles, Aug 8, 2007
  7. web_mehul

    #7
    User-agent: *
    Disallow:/*/print/

    I think this will not work.

    To disallow the print directory and its subdirectories, you need to put:

    User-agent: *
    Disallow: /print/

    That's it.
     
    web_mehul, Aug 9, 2007
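
Whether either rule covers mysite.com/articlename/print depends on how a crawler matches Disallow patterns. The following is a minimal sketch, assuming the wildcard rules that Google documents (and that RFC 9309 later standardized): `*` matches any run of characters, a trailing `$` anchors the end of the path, and otherwise the pattern is a prefix. The helper name `rule_matches` is hypothetical, and crawlers that do not support wildcards may behave differently.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Assumption: '*' matches any characters, a trailing '$' anchors the
    end of the path, and without '$' the pattern is treated as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, giving prefix semantics.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    patterns = ("/*/print/", "/print/", "/*/print")
    paths = ("/articlename/print", "/articlename/print/", "/print/page")
    for pattern in patterns:
        for path in paths:
            print(f"{pattern:12} {path:22} {rule_matches(pattern, path)}")
```

Under these assumptions, `/*/print/` matches `/articlename/print/` but not `/articlename/print` (no trailing slash), `/print/` matches only paths that begin with `/print/`, and `/*/print` matches both forms of the article URL.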
  8. acwebguru

    #8
    I agree with web_mehul :)
     
    acwebguru, Aug 9, 2007
  9. Danltn

    #9
    web_mehul said:

        User-agent: *
        Disallow:/*/print/

        I think this will not work.

        To disallow the print directory and its subdirectories, you need to put:

        User-agent: *
        Disallow: /print/

    Ah, but does this work for /private/print too? I think that's what he needs (I might be wrong though :p)
     
    Danltn, Aug 9, 2007
  10. web_mehul

    #10
    If print is in the root directory, then he only has to give /print/; he need not give /private/print/ :)
     
    web_mehul, Aug 9, 2007
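
The /print/ versus /private/print/ question can be checked with Python's standard-library robots.txt parser. This is a rough sketch, not a statement about how any particular search engine behaves: it assumes plain prefix matching from the site root (which is what `urllib.robotparser` does for a wildcard-free rule), and the host `mysite.com` and the helper `is_allowed` are illustrative only.

```python
from urllib.robotparser import RobotFileParser

# The rule suggested in post #7, with no wildcards.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_allowed(path: str) -> bool:
    """Return True if a generic crawler may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    # The host is a placeholder; only the path is compared against the rule.
    return parser.can_fetch("*", "http://mysite.com" + path)


if __name__ == "__main__":
    for path in ("/print/page", "/private/print/", "/articlename/print"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

With this rule only `/print/page` is blocked. `/private/print/` and `/articlename/print` stay crawlable, because neither path begins with `/print/`, so a print view nested under each article needs its own rule or a wildcard pattern.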