This is a bit random, but I know someone who is writing his own custom search engine to run over an accumulated list of about 10 million sites in a link directory. This strikes me as reinventing the wheel, especially when he's talking about terabytes for his local cache of those websites. Is there some way to use Google search to search specifically within a very large (OK, 10M isn't THAT large, but you know what I mean) subset of the internet? I know you can list sites for a custom search through AdSense, for example, but I can't see that being practical for such a large set.