Help understanding robots.txt

Discussion in 'Search Engine Optimization' started by crazeinc, Apr 8, 2007.

  1. #1
    My blog has pagination links that allow users and bots alike to access older posts page by page. I don't actually want those pages indexed, just the posts, so I added:

    Disallow: /?page

    to my robots.txt. Unfortunately, it has occurred to me that this may be preventing the bots from accessing the older pages altogether (which they need to do to keep the older posts indexed), instead of just keeping those pages out of the index.

    Is that correct?
     
    crazeinc, Apr 8, 2007
  2. trichnosis

    #2
    Is there actually a page like sitename.com/?page on your site?
     
    trichnosis, Apr 9, 2007
  3. 3POWER

    #3
    Model your robots.txt on the example below:

    User-agent: *
    Disallow: /ExampleFolder/
    Disallow: /examplepage.ext
     
    3POWER, Apr 9, 2007
  4. crazeinc

    #4
    The entire robots.txt looks like this:

    User-agent: *
    Disallow: /admin/
    Disallow: /?page

    The site is http://www.pjhyett.com
     
    crazeinc, Apr 9, 2007
  5. Jean-Luc

    #5
    Your robots.txt disallows:
    - http://www.pjhyett.com/?page=2
    - http://www.pjhyett.com/?page=14

    It does not disallow:
    - http://www.pjhyett.com/posts/184-what-s-your-anti-code
    - http://www.pjhyett.com/posts/15-friday-night-s-alright

    Note that Disallow stops bots from fetching those URLs at all, so it could be harder for them to find the pages with your posts if they cannot visit the archive pages (/?page=...).
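    The matching described above can be checked locally. What follows is only a rough sketch using Python's standard urllib.robotparser module, assuming the robots.txt content posted in #4; the helper name is_allowed is made up for illustration, and real crawlers may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# robots.txt content as posted in #4 (assumed to be the complete file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /?page
"""


def is_allowed(url: str, user_agent: str = "*") -> bool:
    """Hypothetical helper: True if ROBOTS_TXT lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# The four URLs listed in the post above.
URLS = [
    "http://www.pjhyett.com/?page=2",
    "http://www.pjhyett.com/?page=14",
    "http://www.pjhyett.com/posts/184-what-s-your-anti-code",
    "http://www.pjhyett.com/posts/15-friday-night-s-alright",
]

if __name__ == "__main__":
    for url in URLS:
        status = "allowed" if is_allowed(url) else "blocked"
        print(f"{status:8}{url}")
```

    Rules are matched as path prefixes, which is why "Disallow: /?page" covers every /?page=N URL while the /posts/ URLs stay fetchable.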

    Jean-Luc
     
    Jean-Luc, Apr 11, 2007