
robots don't obey the whole robots.txt file

Discussion in 'Search Engine Optimization' started by serban, Mar 26, 2007.

  1. #1
    Hello everyone,

    I'm new here.

    Here's my problem:

    http://www.itpromo.net/robots.txt

    -----snip------
    User-agent: *
    [...]
    Disallow: /*pdf$
    Disallow: /*xls$
    Disallow: /*html$
    Disallow: /*zip$
    Disallow: /*RON
    Disallow: /*EUR
    Disallow: /*USD
    Disallow: /*NONE
    Disallow: /*ASC
    Disallow: /*DESC
    -----snip------

    This should block every URL that contains one of the words after the *, and, for the rules that end in $, every URL that ends with that word.

    Googlebot and Slurp recognize this, but Teoma and MSNbot don't:

    -----log snip-----
    "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" www.itpromo.net GET /memory/a_data/1/NONE/DESC/NONE HTTP/1.0 41345 200 0 [26/Mar/2007:14:03:14 +0300]
    "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" www.itpromo.net GET /memory/a_data/1/xls HTTP/1.0 13207 200 0 [26/Mar/2007:14:03:35 +0300]
    -----log snip-----
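
To make the difference concrete, here is a minimal sketch of the two ways a crawler might read the rules above. It assumes the commonly documented wildcard semantics (`*` matches any run of characters, a trailing `$` anchors the end of the URL) and contrasts them with plain prefix matching as in the original robots.txt convention. The function names are hypothetical, and this is not the actual code of any crawler; it only illustrates why a prefix-only bot would presumably ignore every one of these rules.

```python
import re

# Rules copied from the robots.txt snippet in this post.
RULES = ["/*pdf$", "/*xls$", "/*html$", "/*zip$", "/*RON",
         "/*EUR", "/*USD", "/*NONE", "/*ASC", "/*DESC"]

def pattern_to_regex(pattern):
    """Translate a wildcard rule into a regex (hypothetical helper)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces, then join them with '.*' where '*' stood.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def wildcard_disallowed(path, rules=RULES):
    """Wildcard-aware reading: '*' and '$' have special meaning."""
    return any(pattern_to_regex(rule).match(path) for rule in rules)

def prefix_disallowed(path, rules=RULES):
    """Prefix-only reading: every character in the rule is literal."""
    return any(path.startswith(rule) for rule in rules)

# The two paths msnbot requested in the log above.
for path in ["/memory/a_data/1/NONE/DESC/NONE", "/memory/a_data/1/xls"]:
    print(path, wildcard_disallowed(path), prefix_disallowed(path))
```

Under the wildcard reading both logged paths are disallowed. Under the prefix-only reading neither is, because no real path starts with a literal `/*`.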

    What are my options for blocking all bots from reaching these pages? They generate a lot of traffic and I want these sections to be ignored. I have also added rel="nofollow" to all the internal links pointing to this kind of URL.
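
One option that does not depend on how a bot parses robots.txt is to refuse the request on the server. The sketch below is only an illustration of that idea: the function name, the list of user-agent tokens, and the choice of a 403 status are all assumptions, and the path pattern simply restates the Disallow rules from this post as a single regular expression.

```python
import re

# Same intent as the Disallow rules: paths ending in pdf/xls/html/zip,
# or containing one of the sort/currency keywords anywhere.
BLOCKED_PATH = re.compile(r"(?:pdf|xls|html|zip)$|RON|EUR|USD|NONE|ASC|DESC")

# Assumption: substrings that identify the crawlers to refuse.
# "msnbot" appears in the log above; "teoma" is a guess at Teoma's token.
BOT_TOKENS = ("msnbot", "teoma")

def status_for_request(user_agent, path):
    """Return 403 for a listed bot on a blocked path, otherwise 200."""
    ua = user_agent.lower()
    is_listed_bot = any(token in ua for token in BOT_TOKENS)
    if is_listed_bot and BLOCKED_PATH.search(path):
        return 403
    return 200

print(status_for_request("msnbot/1.0 (+http://search.msn.com/msnbot.htm)",
                         "/memory/a_data/1/NONE/DESC/NONE"))
```

The same check could be expressed as a rewrite rule in the web server configuration. The point is only that the decision is made by the site rather than left to the crawler.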

    I've also described the problem in detail on my blog: http://www.ghita.ro/article/23/web_robots_and_dynamic_content_issues.html (scroll down to Problems).


    Thanks!
     