Can I use robots.txt to block *.html?

Discussion in 'robots.txt' started by briandunning, Nov 2, 2005.

  1. #1
    Can I use robots.txt to block *.html? I know I can use it to block certain folders, but I also want to block certain file types.
     
    briandunning, Nov 2, 2005
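
For reference on the question above: the original robots.txt convention treats each Disallow value as a plain path prefix, so it has no notion of file types. Some major crawlers (Googlebot, for one) document an extension in which `*` matches any run of characters and a trailing `$` anchors the end of the URL, so a rule such as `Disallow: /*.html$` can work for those crawlers only. The Python sketch below is a hypothetical illustration of how such a rule would be evaluated; `rule_to_regex` and `is_blocked` are made-up helper names, not part of any library, and a crawler that ignores the extension will not honor the rule.

```python
import re

# Assumed robots.txt content; only crawlers that support the
# wildcard extension will interpret the '*' and '$' characters.
ROBOTS_TXT = """User-agent: *
Disallow: /*.html$
"""

def rule_to_regex(rule):
    """Translate a wildcard Disallow rule into a compiled regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with '.*'
    # so that every '*' in the rule matches any run of characters.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(path, rule):
    """Return True if the URL path matches the Disallow rule."""
    # re.match anchors at the start, mirroring prefix semantics.
    return rule_to_regex(rule).match(path) is not None
```

With the trailing `$`, only URLs that end in `.html` match; without it, the rule also catches URLs that merely contain `.html`, such as ones with a query string.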
  2. WhatiFind

    #2
    WhatiFind, Nov 2, 2005
  3. minstrel

    #3
    Why do you want to do this, Brian? I'm curious...
     
    minstrel, Nov 5, 2005
  4. briandunning

    #4
    Curious, as in psychoanalytically? :)

    I had a bunch of spam content that's gone, but I'm trying to get the robots to know it's gone. It was all *.html, and nothing legitimate on the site uses *.html. I'm just letting it all 404 for now; I was looking for an additional way to shout at the robots to stop indexing it. It's been gone for months, but I still get thousands of daily requests for it.
     
    briandunning, Nov 5, 2005
  5. minstrel

    #5
    No. Curious as in the best way to solve your problem. If you have deleted all the HTML pages and replaced them with, say, PHP pages, you could set up a series of redirects to re-route both spider requests and human visitors. At the very least, redirect all requests for .html files to your new home page.
     
    minstrel, Nov 5, 2005
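
The catch-all redirect minstrel describes could be sketched roughly as follows. This is a minimal illustration using Python's standard `http.server`, not how the site in this thread is actually set up; `HOME`, `redirect_target`, `RedirectHandler`, and `serve` are hypothetical names, and a real site would more likely configure the rule in its web server.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

HOME = "/"  # assumed location of the new home page

def redirect_target(request_path):
    """Return the redirect destination for a request, or None."""
    # Drop any query string before checking the extension.
    path = urlsplit(request_path).path
    if path.lower().endswith(".html"):
        return HOME
    return None

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = redirect_target(self.path)
        if target is not None:
            # 301 tells spiders and browsers the move is permanent.
            self.send_response(301)
            self.send_header("Location", target)
            self.end_headers()
        else:
            # This sketch serves nothing else, so answer 404.
            self.send_response(404)
            self.end_headers()

def serve(port=8000):
    """Run the sketch locally; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling `serve()` starts the handler on localhost; any request for a path ending in `.html` then receives a permanent redirect to `HOME`.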