Need advice on how to block certain pages using robots.txt

Discussion in 'robots.txt' started by mariush, Aug 12, 2006.

  1. #1
    On my game cheats site, www.tgdb.net, I have a "print" function, for each game platform.

    For example, I have /pc/print.php , /psx/print.php

    I'd like to use robots.txt to prevent bots from accessing these pages.

    I think this would work, but I'm afraid to add it before asking for advice:

    Disallow: /*/print.php

    I got this from http://www.imdb.com/robots.txt where they use it to block certain folders, not files.

    So my question is: how would I write a line in robots.txt, if possible, to prevent bots from downloading print.php in any subfolder of the site root, even if the links have query parameters (for example, /pc/print.php?cheat=1000)?

    Thanks..
     
    mariush, Aug 12, 2006 IP
  2. explorer

    #2
    I've just looked at your robots.txt and I see you've not done anything about this yet. I would list the pages individually:

    User-agent: *
    Disallow: /pc/print.php
    Disallow: /psx/print.php

    etc
     
    explorer, Sep 15, 2006 IP
  3. mariush

    #3
    Thanks. That would surely work.

    I've decided to do nothing about it for the moment, because Google seems to like those pages and has indexed all of them. I was initially afraid that it might think I have duplicate content or something like that.

    Bandwidth is not a problem at the moment, so I'll let it "flow" for now.

    Your answer is much appreciated
     
    mariush, Sep 16, 2006 IP
  4. silky

    #4
    As far as I am aware, the following

    Disallow: /pc/print.php

    would not block a robot from accessing the page with query string parameters, such as

    /pc/print.php?cheat=1000

    Let me know if anyone knows differently.
     
    silky, Apr 17, 2007 IP
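
For what it's worth, the original robots.txt convention matches rules as path prefixes, so a rule for /pc/print.php should also cover /pc/print.php?cheat=1000. One way to check this locally is Python's standard-library urllib.robotparser, which does plain prefix matching and no wildcards. This is only a minimal sketch, assuming the two rules from post #2. The helper name is_allowed and the /pc/index.php URL are made up for illustration, and individual crawlers may behave differently.

```python
from urllib.robotparser import RobotFileParser

# Rules as suggested in post #2 of this thread.
RULES = """\
User-agent: *
Disallow: /pc/print.php
Disallow: /psx/print.php
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(url: str) -> bool:
    """Return True if a generic crawler ("*") may fetch the URL under RULES."""
    return parser.can_fetch("*", url)


if __name__ == "__main__":
    for url in (
        "http://www.tgdb.net/pc/print.php",
        "http://www.tgdb.net/pc/print.php?cheat=1000",
        "http://www.tgdb.net/pc/index.php",  # hypothetical page, for contrast
    ):
        print(url, "allowed" if is_allowed(url) else "blocked")
```

Under prefix matching, both print.php URLs should come back as blocked and the unrelated page as allowed.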
  5. kirby009

    #5
    Good info. I tried it and it works. I didn't know this one.
     
    kirby009, Jun 12, 2007 IP
  6. trichnosis

    #6
    I think

    Disallow: /*print.php*

    will solve your problem. Google and Yahoo support the * wildcard.
     
    trichnosis, Jun 14, 2007 IP
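
Wildcards are an extension to the original robots.txt convention, so it may help to see how such a pattern is commonly interpreted: `*` stands for any run of characters, a trailing `$` anchors the end of the URL, and everything else is matched as a prefix. The sketch below is a rough approximation of that behaviour, not any search engine's actual implementation. The function names rule_to_regex and rule_matches are hypothetical.

```python
import re


def rule_to_regex(rule_path: str) -> "re.Pattern[str]":
    """Translate a wildcard rule path into a regular expression."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rule_matches(rule_path: str, url_path: str) -> bool:
    """Return True if url_path (path plus query string) falls under the rule."""
    # re.match anchors at the start, which gives the usual prefix semantics.
    return rule_to_regex(rule_path).match(url_path) is not None


if __name__ == "__main__":
    for path in ("/pc/print.php", "/pc/print.php?cheat=1000", "/pc/index.php"):
        print(path, rule_matches("/*/print.php", path))
```

Read this way, /*/print.php only covers print.php inside a subfolder, while /*print.php would also cover a print.php in the site root. The trailing * in /*print.php* adds nothing, since rules already match as prefixes.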
  7. learn2success

    #7
    I have 7 errors in Google Webmaster Tools, and I have asked Google to remove those pages because they no longer exist.
    But Google says I need to block these pages in my robots.txt file as well.
    Would adding the following work for me?

    Disallow: /category/jobs/marketing-jobs/
    Disallow: /category/jobs/human-recourse-jobs/
    Disallow: /seo-ranking-factors/
    Disallow: /category/jobs/sales-jobs/
    Disallow: /category/jobs/admin-jobs/
    Disallow: /category/jobs/finance-jobs/
    Disallow: /category/motivation/

    Please advise.
     
    learn2success, May 17, 2012 IP
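
Before deploying rules like these, it may be worth confirming that a parser actually recognizes them. The field name has to be spelled exactly `Disallow` (a variant such as `Dissallow` is an unknown field and is skipped), and the rules need to sit under a `User-agent` line, which the post does not show. The sketch below uses Python's standard-library urllib.robotparser. The `User-agent: *` line, the example.com host, and the helper name blocked_paths are assumptions added for illustration, and only two of the seven paths are used to keep it short.

```python
from typing import List
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt: str, paths: List[str]) -> List[str]:
    """Return the paths that the given robots.txt text disallows for any crawler."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # example.com is a placeholder host; only the path matters for matching.
    return [p for p in paths if not parser.can_fetch("*", "http://example.com" + p)]


PATHS = ["/category/motivation/", "/seo-ranking-factors/"]

# Same rules twice: once with a misspelled field name, once spelled correctly.
MISSPELLED = (
    "User-agent: *\n"
    "Dissallow : /category/motivation/\n"
    "Dissallow : /seo-ranking-factors/\n"
)
CORRECT = (
    "User-agent: *\n"
    "Disallow: /category/motivation/\n"
    "Disallow: /seo-ranking-factors/\n"
)

if __name__ == "__main__":
    print("misspelled:", blocked_paths(MISSPELLED, PATHS))
    print("correct:   ", blocked_paths(CORRECT, PATHS))
```

With this parser, the misspelled version blocks nothing, while the correctly spelled version blocks both paths.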
  8. learn2success

    #8
    learn2success, Jun 6, 2012 IP