can robots txt be used to ........

Discussion in 'Search Engine Optimization' started by fuzzbuzz, Mar 10, 2008.

  1. #1
    Can robots.txt be used to prevent link juice from going to some pages, while still allowing those pages to be indexed?

    If so, how do I do this?

    thanks
     
    fuzzbuzz, Mar 10, 2008 IP
  2. tribulus

    tribulus Peon

    #2
    Great question, Fuzz. I asked this on another thread but never saw a response. Hopefully someone can answer this.
     
    tribulus, Mar 10, 2008 IP
  3. marshalseo

    marshalseo Peon

    #3
    If you want a page to be indexed but its links not followed, you can use the tag below in that page:

    <meta name="robots" content="index,nofollow">

    There is no need to use robots.txt.
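
    To check which directives a page actually declares, the tag can be read back programmatically. The following is a minimal sketch using only Python's standard library; the class and function names (`RobotsMetaParser`, `robots_directives`) and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # The name attribute is matched case-insensitively.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical sample page carrying the tag from the post above.
page = '<html><head><meta name="robots" content="index,nofollow"></head></html>'
print(sorted(robots_directives(page)))  # ['index', 'nofollow']
```

    This only reports what the page asks for; how a given search engine honors the directives is up to that engine.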
     
    marshalseo, Mar 10, 2008 IP
  4. astup1didiot

    astup1didiot Notable Member

    #4
    astup1didiot, Mar 10, 2008 IP
  5. Jaysonnhs

    Jaysonnhs Active Member

    #5
    Jaysonnhs, Mar 11, 2008 IP
  6. dpking

    dpking Peon

    #6
    The robots.txt file will block it!
    I suggest you use nofollow links. This will stop the link juice, but unfortunately, as far as I know, the URL will not be indexed either.
     
    dpking, Mar 11, 2008 IP
  7. LawnchairLarry

    LawnchairLarry Well-Known Member

    #7
    Lads, I think you are missing the point of the original question. The OP is asking about the robots.txt file, not about the robots meta tag or the rel (relationship) attribute.
    Returning to the original question: in my opinion, the robots.txt file is used to exclude certain directories from being crawled by search-engine spiders. If these directories are not crawled, they will not be indexed either. And if these directories are not indexed, the links pointing to files in these directories won't be followed, hence link juice won't flow to these files, even though the robots meta tag and the rel attribute may allow hyperlinks to be followed. Can anyone confirm or deny this?
     
    LawnchairLarry, Mar 11, 2008 IP
  8. fuzzbuzz

    fuzzbuzz Active Member

    #8
    Lawn, that's exactly what I mean.

    See, my issue is that on some pages I have no access to the meta area, so that option is out of the question.

    I want to be able to tell SEs not to view a particular page, and therefore not to index it and subsequently not pass link juice that way. At the moment I have unnecessary pages getting juice which they basically don't need, and I want to stop that! Gonna put my super robots.txt Superman-like outfit on and stop it happening... (imagination getting carried away :) )
     
    fuzzbuzz, Mar 11, 2008 IP
  9. LawnchairLarry

    LawnchairLarry Well-Known Member

    #9
    Those are two different situations, Fuzzbuzz:

    Use the following robots.txt file for this situation:

    User-agent: *
    Disallow: /your-inaccessible-directory/

    Here's a useful tutorial on creating a robots.txt file.
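
    The effect of those two lines can be checked locally before uploading the file. This is a rough sketch using Python's standard `urllib.robotparser`; the user-agent name `ExampleBot` and the helper `build_parser` are made up for illustration, and the check only models crawl permission as Python's parser interprets it, not how any search engine indexes pages or passes link juice.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /your-inaccessible-directory/
"""


def build_parser(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
base = "http://www.yourwebsite.com"

# "ExampleBot" is a hypothetical crawler name; the "*" rule applies to it.
print(parser.can_fetch("ExampleBot", base + "/your-inaccessible-directory/page.htm"))  # False
print(parser.can_fetch("ExampleBot", base + "/index.htm"))  # True
```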

    If you want a certain webpage to pass link juice to all linked-to files except a few, then use the following code in the referring file:

    <meta name="robots" content="...,follow"> (put this in the <head> section)
    <a href="http://www.yourwebsite.com/your-unnecessary-page.htm">Your anchor text goes here</a> (passes link juice; links are followed by default, and rel="follow" is not a recognized value)
    <a href="http://www.yourwebsite.com/your-unnecessary-page.htm" rel="nofollow">Your anchor text goes here</a> (does not pass link juice)

    Note that in this situation, it does not matter whether you let the file be indexed or not.
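
    To audit which links on a referring page carry the nofollow value, something like the following could be used. It is a minimal sketch with Python's standard `html.parser`; the names `LinkRelParser` and `classify_links` are made up, and `your-important-page.htm` is a hypothetical URL added for contrast.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Sorts <a href> links into followed and nofollowed lists."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel holds space-separated tokens; compare them case-insensitively.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def classify_links(html):
    """Return (followed, nofollowed) lists of href values found in html."""
    parser = LinkRelParser()
    parser.feed(html)
    return parser.followed, parser.nofollowed


sample = (
    '<a href="http://www.yourwebsite.com/your-important-page.htm">Anchor</a>'
    '<a href="http://www.yourwebsite.com/your-unnecessary-page.htm" rel="nofollow">Anchor</a>'
)
followed, nofollowed = classify_links(sample)
print(followed)    # ['http://www.yourwebsite.com/your-important-page.htm']
print(nofollowed)  # ['http://www.yourwebsite.com/your-unnecessary-page.htm']
```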
     
    LawnchairLarry, Mar 11, 2008 IP
  10. fuzzbuzz

    fuzzbuzz Active Member

    #10
    Hi Lawn,

    Thanks for the reply.

    If I use robots.txt with your example, wouldn't that just block every page within that directory? If I wanted it to be page-specific, can I give the actual page name and file extension?

    I have some files within directories that should be indexed and some that shouldn't.

    Thanks
     
    fuzzbuzz, Mar 11, 2008 IP
  11. astup1didiot

    astup1didiot Notable Member

    #11
    This is why the robots.txt file ISN'T your solution, unless you block each specific page directly. Use the nofollow attribute.
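
    For reference, blocking a single page directly means giving its full path in a Disallow line. Here is a rough sketch of how that behaves, checked with Python's standard `urllib.robotparser`; the directory name `some-directory`, the sibling page, and the user-agent `ExampleBot` are all made up for illustration. Disallow values are matched as path prefixes, which the last check shows.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one page rather than a whole directory.
PAGE_RULES = """\
User-agent: *
Disallow: /some-directory/your-unnecessary-page.htm
"""

page_parser = RobotFileParser()
page_parser.parse(PAGE_RULES.splitlines())

site = "http://www.yourwebsite.com"

# The named page is blocked.
print(page_parser.can_fetch("ExampleBot", site + "/some-directory/your-unnecessary-page.htm"))  # False
# Another page in the same directory stays crawlable.
print(page_parser.can_fetch("ExampleBot", site + "/some-directory/keep-this-page.htm"))  # True
# Prefix matching: a longer path starting with the same string is blocked too.
print(page_parser.can_fetch("ExampleBot", site + "/some-directory/your-unnecessary-page.html"))  # False
```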
     
    astup1didiot, Mar 11, 2008 IP
  12. LawnchairLarry

    LawnchairLarry Well-Known Member

    #12
    LawnchairLarry, Mar 16, 2008 IP
  13. enous

    enous Well-Known Member

    #13
    Good. thx for sharing
     
    enous, Mar 16, 2008 IP
  14. fuzzbuzz

    fuzzbuzz Active Member

    #14
    Thanks for the replies
     
    fuzzbuzz, Mar 17, 2008 IP
  15. kingofsanda

    kingofsanda Peon

    #15
    Thanks for posting the link for the article.
     
    kingofsanda, Mar 17, 2008 IP