All except One - Robots.txt

Discussion in 'robots.txt' started by .TIEU, Aug 28, 2010.

  1. #1
    I was wondering if it's possible to create a robots.txt that disallows bots from accessing all directories except one. It should also allow access to everything inside that one directory.

    Would my robots.txt work?

    
    User-agent: *
    Disallow: /
    Allow: /images
    
    Or would it be Disallow: instead of Disallow: /?

    Also, does anyone have any idea how to do this with .htaccess? I've been having problems with that as well.
     
    Last edited: Aug 28, 2010
    .TIEU, Aug 28, 2010 IP
  2. Victoria B

    Victoria B Peon

    Messages:
    530
    Likes Received:
    10
    Best Answers:
    0
    Trophy Points:
    0
    #2
    It's possible. Here is an example that excludes three different directories.

    To exclude all robots from part of the server:

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /tmp/
    Disallow: /~tieu/
    Hope it helps.

    Cheers!
     
    Victoria B, Aug 29, 2010 IP
  3. .TIEU

    .TIEU Peon

    Messages:
    68
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Thanks for the reply, but it isn't quite what I need. I have over 2000 directories in my index, and I was wondering how I'd block 1999 of them and leave just one folder allowed.

    Would it be Disallow: / or Disallow: /* or Disallow:?
     
    .TIEU, Aug 29, 2010 IP
  4. Victoria B

    #4
    In this case you can use the following to exclude all files except one:

    User-agent: *
    Disallow: /~tieu/online/
    Cheers!
     
    Victoria B, Aug 29, 2010 IP
  5. .TIEU

    #5
    Okay, this will work. But how do you know to use ~tieu and not ~index, etc.?
     
    Last edited: Aug 29, 2010
    .TIEU, Aug 29, 2010 IP
  6. Victoria B

    #6
    That was just an example. You put:

    Disallow: /~whateveryouwant

    Cheers!
     
    Victoria B, Aug 29, 2010 IP
  7. manish.chauhan

    manish.chauhan Well-Known Member

    Messages:
    1,682
    Likes Received:
    35
    Best Answers:
    0
    Trophy Points:
    110
    #7
    Since you want only one folder to be accessible, try the following code:

    User-agent: *
    Allow: /Folder Name/
    Disallow: /

    This code will allow crawlers to access only the 'Folder Name' folder; the rest of the website will not be accessible. Hope this helps.
     
    manish.chauhan, Aug 30, 2010 IP
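
    The Allow-before-Disallow ordering in post #7 matters for parsers that follow the older first-match convention, where the first rule whose path prefix matches the URL decides the outcome. Below is a minimal Python sketch of that behaviour, assuming plain prefix matching with no wildcards. The function name first_match_allowed and the sample paths are hypothetical, chosen only for illustration; this is not any particular crawler's implementation.

```python
from typing import List, Tuple

# A rule is (directive, path prefix), e.g. ("allow", "/images").
Rule = Tuple[str, str]


def first_match_allowed(rules: List[Rule], path: str) -> bool:
    """Return True if `path` may be crawled under first-match semantics.

    Assumption: rules are plain path prefixes (no wildcards), and the
    first rule whose prefix matches the path decides the result.
    """
    for directive, prefix in rules:
        if prefix and path.startswith(prefix):
            return directive == "allow"
    return True  # no rule matched: crawling is allowed by default


# Same two rules, written in the two orders seen in this thread.
allow_first = [("allow", "/images"), ("disallow", "/")]
disallow_first = [("disallow", "/"), ("allow", "/images")]

print(first_match_allowed(allow_first, "/images/logo.png"))     # True
print(first_match_allowed(disallow_first, "/images/logo.png"))  # False
print(first_match_allowed(allow_first, "/private/data.html"))   # False
```

    Under this convention, putting Disallow: / first shadows the Allow line, so listing Allow first is the safer order when you don't know which kind of parser a bot uses.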
  8. DoDo Me

    DoDo Me Peon

    Messages:
    2,257
    Likes Received:
    27
    Best Answers:
    0
    Trophy Points:
    0
    #8
    User-agent: *
    Disallow: /
    Allow: /images

    will work. The only problem is that you need a backlink to /images from another website; otherwise well-behaved bots can never get there.
     
    DoDo Me, Aug 30, 2010 IP
  9. .TIEU

    #9
    What if I do this? Would it keep the folders from being browsable while leaving the index page viewable?

    User-agent: *
    Disallow: /
    Allow: /images
    Allow: /index.php
     
    .TIEU, Sep 2, 2010 IP
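
    For the rule set in post #9, crawlers that follow RFC 9309 (Google's among them) apply the most specific matching rule, meaning the longest matching path, with Allow winning a tie. Line order then stops mattering. The following is a rough Python sketch of that evaluation, assuming plain prefix rules without wildcards; longest_match_allowed and the sample paths are hypothetical names used only for illustration.

```python
from typing import List, Tuple

# A rule is (directive, path prefix), e.g. ("disallow", "/").
Rule = Tuple[str, str]


def longest_match_allowed(rules: List[Rule], path: str) -> bool:
    """Return True if `path` may be crawled under longest-match semantics.

    Assumption: rules are plain path prefixes (no wildcards). The longest
    matching prefix wins; if an allow and a disallow rule have the same
    length, allow wins.
    """
    best_len = -1
    allowed = True  # default when nothing matches
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# The rules from post #9.
rules = [("disallow", "/"), ("allow", "/images"), ("allow", "/index.php")]

for path in ["/index.php", "/images/photo.jpg", "/forum/thread.html", "/"]:
    print(path, longest_match_allowed(rules, path))
```

    Under these assumptions /index.php and everything below /images are crawlable, while other folders are not. Note that the bare path / matches only Disallow: /, so a home page reached as / rather than /index.php would still be blocked. Also keep in mind that robots.txt only advises crawlers; it does not stop visitors from browsing the folders.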