I recently found that one of my vBulletin installations was generating an infinite number of pages at cron.php, each differing only by a slightly changed numerical query string... and that Yahoo had actually indexed a couple thousand of these identical blank pages. Once I noticed, I disallowed cron.php in robots.txt. It's now been about 10 days and the duplicate/blank/useless pages are still indexed by Yahoo. I know 10 days isn't a long time, but I was wondering if anyone else has been through this and, if so, how long it took for a disallow in robots.txt to remove a page from the Yahoo index?
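For anyone wanting to double-check that a disallow like this really covers the query-string variants, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URLs, the `rand` parameter name, and the `is_blocked` helper are all assumptions for illustration, as is the `Slurp` user-agent token for Yahoo's crawler; the thread does not show the actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; assumes the forum is installed at the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cron.php
"""


def is_blocked(url, user_agent="Slurp"):
    """Return True if the rules above forbid user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Disallow rules are prefix matches, so any query string is covered too.
    print(is_blocked("http://example.com/cron.php?rand=12345"))
    print(is_blocked("http://example.com/index.php"))
```

This only confirms that the rule matches the URLs; it says nothing about how quickly a search engine will drop pages already in its index.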
A long time. Yahoo still has 14,000 pages indexed from a cgi-bin that I no longer have, even though I added that path to my robots.txt months ago. I guess I just have to ignore them... Cryo.
Hey, excuse me if this is unrelated to search listings (I can't tell whether it applies to them or not), but this appears to be a manual URL removal form like the one Google offers. Here is the page linking to it... I never use Yahoo, so I can't be sure, but I have a feeling it may not be for regular search results, even though its parent page doesn't seem to suggest that.
Well, magically it reindexed us correctly last night, reflecting the changes I made in robots.txt, so that's great. It took about 11 days (FYI). It will be interesting to see how traffic from Yahoo changes now that the site is indexed