robots.txt wildcard query

Discussion in 'Link Development' started by ukgamblingforum, May 1, 2007.

  1. #1
    Google is caching pages from my forum that have session IDs in the URL.

    The offending URLs all look like this:
    http://mysite_.com/forums/viewthread;jsessionid=F2EE237A3DF572F724B0F305C9C5801D?thread=831
    http://mysite_.com/forums/printpost;jsessionid=1CC96D862D21028C067BCA541B52CCAB?post=1910



    Question:
    How do I wildcard the line in my robots.txt so that Google doesn't cache those pages, but doesn't ignore everything else under /forums/?

    =====================
    User-agent: Googlebot
    Disallow: /forums/viewthread;jsessionid/
    =====================
    will this work correctly?

    cheers
    alex
     
    ukgamblingforum, May 1, 2007
  2. trichnosis

    #2
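    That will not work. The trailing slash after jsessionid never matches, because your URLs continue with "=" at that point. Use a wildcard instead:

    Disallow: /forums/*;jsessionid=

    Googlebot treats * as "any run of characters", so this blocks both the viewthread and printpost URLs that carry a session ID and leaves the rest of /forums/ crawlable. Do not end the rule with $, which anchors the end of the URL. For example, if you have URLs like /showthread.php?t=316652, add Disallow: /showthread.php? (without $). Rules match by prefix, so this blocks showthread.php?t=316652, showthread.php?t=1652, etc., that is, every URL that begins with /showthread.php?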
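
    If you want to sanity-check a rule before uploading it, you can emulate the matching locally. This is only a rough sketch, assuming Googlebot-style semantics (prefix matching, * as a wildcard, a trailing $ as an end anchor). The helper names rule_to_regex and is_blocked are made up for illustration and are not part of any library.

```python
import re


def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate a robots.txt path rule into a regex (assumed Googlebot-style):
    '*' matches any run of characters, a trailing '$' anchors the end of the
    URL, and everything else is matched literally as a prefix."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(pattern + (r"\Z" if anchored else ""))


def is_blocked(rule: str, path: str) -> bool:
    # re.match anchors at the start of the path, which gives prefix semantics.
    return rule_to_regex(rule).match(path) is not None


paths = [
    "/forums/viewthread;jsessionid=F2EE237A3DF572F724B0F305C9C5801D?thread=831",
    "/forums/printpost;jsessionid=1CC96D862D21028C067BCA541B52CCAB?post=1910",
    "/forums/viewthread?thread=831",  # same thread without a session ID
]

for rule in ("/forums/viewthread;jsessionid/", "/forums/*;jsessionid="):
    print(rule, [is_blocked(rule, p) for p in paths])
# /forums/viewthread;jsessionid/ [False, False, False]
# /forums/*;jsessionid= [True, True, False]
```

    The first rule blocks nothing, while the wildcard rule catches both session-ID URLs and leaves the clean thread URL alone. Real crawlers may differ in details, so treat this as a quick check, not a replacement for testing in Google's own tools.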
     
    trichnosis, May 1, 2007
  3. ukgamblingforum

    #3
    Yep, sounds like just what the doctor ordered. I have made the necessary changes.

    I'll check Google's webmaster tools soon to make sure nothing has run amok, of course ;) Cheers for the reply.
    alex
     
    ukgamblingforum, May 2, 2007