I have a big problem. The MSN spiderbot and the Yahoo spiderbot have been spidering a few pages of my site rather excessively. The pages are pretty useless:

http://www.internetmlm.net
/Members_List-index-letter-D-sortby-uname-authid-584ee0d621f48c9ab0b4a7d8241daaf5.html
/Members_List-index-letter-O-sortby-uname-authid-329dd717c93377ceb91190d411e82a0c.html

The spidering is so bad that my webhost even shut down my entire site for consuming too much CPU. I have since relocated my site to another webhost.

Top Process %CPU 17.0 [www.internetmlm.net] [/Members_List-index-letter-L-sortby-url-authid-557f90d9d3aa]
Top Process %CPU 14.0 [www.internetmlm.net] [/Members_List-index-letter-All-sortby-url-authid-c14de1784c]
Top Process %CPU 12.8 [www.internetmlm.net] [/Members_List-index-letter-X-sortby-url-authid-eda30773091f]

How do I stop them from accessing these pages? I tried disallow members* but it didn't do the trick.
Try putting this inside the <head> section of your webpage: <meta name="robots" content="noindex"> That should tell search engine robots not to index that particular page.
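One caveat with the meta tag: a bot still has to fetch the page to read it, so on its own it may not reduce the CPU load. Since the original post mentions a disallow rule that didn't do the trick, here is a minimal sketch for checking robots.txt rules offline with Python's standard urllib.robotparser. Both robots.txt bodies are assumptions (the actual file was never posted), and this parser does plain case-sensitive prefix matching as in the original robots.txt convention, so real crawlers that support wildcards may behave differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the rule is a path prefix starting with "/",
# with the same capitalisation as the URLs in the first post.
PREFIX_RULE = """\
User-agent: *
Disallow: /Members_List
"""

# A guess at what the rule that "didn't do the trick" looked like:
# no leading slash and lowercase "m", so it never matches the real paths.
LOOSE_RULE = """\
User-agent: *
Disallow: members*
"""

# One of the URLs from the first post.
MEMBER_URL = (
    "http://www.internetmlm.net/Members_List-index-letter-D-sortby-uname"
    "-authid-584ee0d621f48c9ab0b4a7d8241daaf5.html"
)


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print("prefix rule allows member page:", allowed(PREFIX_RULE, "msnbot", MEMBER_URL))
    print("loose rule allows member page:", allowed(LOOSE_RULE, "msnbot", MEMBER_URL))
```

If the first line prints False, the prefix rule covers every Members_List page regardless of the letter, sort order, or authid that follows.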
Are those session ID strings there? If so, that could be your problem. Session IDs can sometimes make spiders get stuck, especially if the ID changes on every hit. Find a way to turn off the session IDs when the bots hit and this problem might go away just as fast.
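To illustrate the idea of turning off session IDs for bots, here is a rough sketch in Python. It assumes the "-authid-..." part of the URLs is the session string, that the bots can be recognised by substrings in their User-Agent header, and that the site can still serve the page without the authid segment. None of that is confirmed in this thread, and the names is_known_bot and link_for are made up for the example.

```python
import re

# Assumed User-Agent substrings for the crawlers mentioned in this thread.
BOT_TOKENS = ("msnbot", "slurp", "googlebot")

# Matches the "-authid-<hex>" segment seen in the URLs from the first post.
AUTHID_RE = re.compile(r"-authid-[0-9a-f]+", re.IGNORECASE)


def is_known_bot(user_agent: str) -> bool:
    """Return True if the User-Agent contains one of the assumed bot tokens."""
    ua = user_agent.lower()
    return any(token in ua for token in BOT_TOKENS)


def link_for(path: str, user_agent: str) -> str:
    """Drop the authid segment from a link when the visitor looks like a bot."""
    if is_known_bot(user_agent):
        return AUTHID_RE.sub("", path)
    return path


if __name__ == "__main__":
    path = (
        "/Members_List-index-letter-D-sortby-uname"
        "-authid-584ee0d621f48c9ab0b4a7d8241daaf5.html"
    )
    print(link_for(path, "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"))
    print(link_for(path, "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"))
```

With the ID stripped, a bot sees one stable URL per letter and sort order instead of a fresh one on every visit, which is what usually stops the endless re-crawling.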
Sorry if this is going a little off topic, but is the MSN bot the same as the Yahoo bot? In my site's admin stats the Yahoo bot is constantly there under multiple IP addresses, gobbling up heaps of bandwidth, and I never see any reference to the MSN bot. However, in my cPanel stats Yahoo doesn't show up at all but MSN does. The Google bot has been showing up twice a day for the past two months.