I have the following in my robots.txt:

User-agent: *
Crawl-delay: 3
Disallow: /verifypic.php
Disallow: verifypic.php

Yet when I do a site command on Yahoo, it lists thousands of URLs (URLs only, no cache or title) for this script. Could anyone explain what's going on? Maybe this is how they've increased their index size.

By the way, this script serves an image and is only used like this: <img width="50" height="16" src="/verifypic.php?v=1124209720">
Have you had the robots.txt in place all the time, or did you just recently implement it? If the latter, perhaps it takes time for Y! to delete the disallowed URLs?
I have this very same problem, and I've noticed it with Google too. From what I've read, robots.txt only tells the search engines which pages not to crawl. It doesn't mean they won't index the URL; they can still list it, just without a title or description. They also 'claim' that this has no effect on your rankings.
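
For anyone who wants to check what the rules quoted in the first post actually block, here is a minimal sketch using Python's standard urllib.robotparser. It only answers "may a crawler fetch this URL?" and says nothing about whether a search engine will still list the bare URL, which is the distinction described above. The /index.html path is a made-up example, not a page from the original site, and this parser's matching may differ in detail from what Yahoo or Google do.

```python
from urllib.robotparser import RobotFileParser

# The rules as quoted in the first post, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 3
Disallow: /verifypic.php
Disallow: verifypic.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The image URL from the first post. Rules match by path prefix,
# so the query string does not get it past the Disallow line.
blocked = parser.can_fetch("*", "/verifypic.php?v=1124209720")

# Hypothetical page that no rule mentions (assumption, for contrast).
allowed = parser.can_fetch("*", "/index.html")

# Crawl-delay for the wildcard user agent.
delay = parser.crawl_delay("*")

print("may fetch verifypic.php:", blocked)
print("may fetch index.html:", allowed)
print("crawl delay:", delay)
```

If the script URL comes back as not fetchable, the robots.txt is doing its job as far as crawling goes, and the URL-only listings are an indexing matter rather than a sign that the file is being ignored.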