You can try this robots.txt, but most bad spiders don't honor robots.txt anyway, so it may not help much. The list isn't complete either; some spiders may be missing:

User-agent: Teoma
User-agent: Ask Jeeves
User-agent: Jeeves
User-agent: Seekbot/1.0
User-agent: seekbot
User-agent: EchO!/2.0
User-agent: echo!
User-agent: convera
User-agent: Convera Internet Spider V6.x
User-agent: ConveraCrawler/0.2
User-agent: ConveraCrawler/0.9d
User-agent: ConveraMultiMediaCrawler/0.1
User-agent: Mozilla/2.0 (compatible; Ask Jeeves)
User-agent: aipbot
User-agent: Aqua_Products
User-agent: asterias
User-agent: b2w/0.1
User-agent: BackDoorBot/1.0
User-agent: becomebot
User-agent: BlowFish/1.0
User-agent: Bookmark search tool
User-agent: BotALot
User-agent: BotRightHere
User-agent: BuiltBotTough
User-agent: Bullseye/1.0
User-agent: BunnySlippers
User-agent: CheeseBot
User-agent: CherryPicker
User-agent: CherryPickerElite/1.0
User-agent: CherryPickerSE/1.0
User-agent: Copernic
User-agent: CopyRightCheck
User-agent: cosmos
User-agent: Crescent
User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
User-agent: Curl
User-agent: DittoSpyder
User-agent: EmailCollector
User-agent: EmailSiphon
User-agent: EmailWolf
User-agent: EroCrawler
User-agent: ExtractorPro
User-agent: FairAd Client
User-agent: Fasterfox
User-agent: Flaming AttackBot
User-agent: Foobot
User-agent: Gaisbot
User-agent: GetRight/4.2
User-agent: Harvest/1.5
User-agent: hloader
User-agent: httplib
User-agent: HTTrack 3.0
User-agent: humanlinks
User-agent: IconSurf
User-agent: InfoNaviRobot
User-agent: Iron33/1.0.2
User-agent: JennyBot
User-agent: Kenjin Spider
User-agent: Keyword Density/0.9
User-agent: larbin
User-agent: LexiBot
User-agent: libWeb/clsHTTP
User-agent: LinkextractorPro
User-agent: LinkScan/8.1a Unix
User-agent: LinkWalker
User-agent: LNSpiderguy
User-agent: lwp-trivial
User-agent: lwp-trivial/1.34
User-agent: Mata Hari
User-agent: Microsoft URL Control
User-agent: Microsoft URL Control - 5.01.4511
User-agent: Microsoft URL Control - 6.00.8169
User-agent: MIIxpc
User-agent: MIIxpc/4.2
User-agent: Mister PiX
User-agent: moget
User-agent: moget/2.1
User-agent: Mozilla/4.0 (compatible; BullsEye; Windows 95)
User-agent: MSIECrawler
User-agent: NetAnts
User-agent: NetMechanic
User-agent: NICErsPRO
User-agent: Offline Explorer
User-agent: Openbot
User-agent: Openfind
User-agent: Openfind data gatherer
User-agent: Oracle Ultra Search
User-agent: PerMan
User-agent: ProPowerBot/2.14
User-agent: ProWebWalker
User-agent: psbot
User-agent: Python-urllib
User-agent: QueryN Metasearch
User-agent: Radiation Retriever 1.1
User-agent: RepoMonkey
User-agent: RepoMonkey Bait & Tackle/v1.01
User-agent: RMA
User-agent: searchpreview
User-agent: SiteSnagger
User-agent: Seekbot
User-agent: SpankBot
User-agent: spanner
User-agent: SurveyBot
User-agent: suzuran
User-agent: Szukacz/1.4
User-agent: Teleport
User-agent: TeleportPro
User-agent: Telesoft
User-agent: The Intraformant
User-agent: TheNomad
User-agent: TightTwatBot
User-agent: toCrawl/UrlDispatcher
User-agent: True_Robot
User-agent: True_Robot/1.0
User-agent: turingos
User-agent: TurnitinBot
User-agent: TurnitinBot/1.5
User-agent: URL Control
User-agent: URL_Spider_Pro
User-agent: URLy Warning
User-agent: VCI
User-agent: VCI WebViewer VCI WebViewer Win32
User-agent: Web Image Collector
User-agent: WebAuto
User-agent: WebBandit
User-agent: WebBandit/3.50
User-agent: WebCapture 2.0
User-agent: WebCopier
User-agent: WebCopier v.2.2
User-agent: WebCopier v3.2a
User-agent: WebEnhancer
User-agent: Web Reaper
User-agent: WebSauger
User-agent: Website Quester
User-agent: Webster Pro
User-agent: WebStripper
User-agent: WebZip
User-agent: WebZip/4.0
User-agent: WebZIP/4.21
User-agent: WebZIP/5.0
User-agent: WebVulnCrawl
User-agent: WebVulnScan
User-agent: Wget
User-agent: wget
User-agent: Wget/1.5.3
User-agent: Wget/1.6
User-agent: WWW-Collector-E
User-agent: Xenu's
User-agent: Xenu's Link Sleuth 1.1c
User-agent: Zeus
User-agent: Zeus 32297 Webster Pro V2.9 Win32
User-agent: Zeus Link Scout
Disallow: /
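If you want to sanity-check a file like this before deploying it, Python's standard urllib.robotparser module can parse the rules and report what a given user agent is allowed to fetch. A minimal sketch, assuming a trimmed-down rule set and a made-up example.com URL:

from urllib.robotparser import RobotFileParser

# Trimmed-down version of the file above; in practice you would load
# your real robots.txt instead of an inline string.
rules = """\
User-agent: Wget
User-agent: HTTrack 3.0
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Agents named in the group are blocked from the whole site...
print(rp.can_fetch("Wget", "http://example.com/page.html"))       # False
# ...while agents not listed anywhere stay allowed by default.
print(rp.can_fetch("Googlebot", "http://example.com/page.html"))  # True

Of course this only tells you what a well-behaved parser would do; it says nothing about bots that ignore the file entirely.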
The reason is that the original Robots Exclusion Standard never defined an Allow directive: robots.txt is there to exclude content, and anything you don't explicitly disallow is crawlable by default. Many major crawlers, such as Googlebot, do support Allow as a later extension, though.
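In other words, everything not matched by a Disallow line is implicitly allowed, so the original standard had no work for an Allow directive to do. A quick sketch of that default using Python's urllib.robotparser (the /private/ path and bot name are just examples):

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse("""\
User-agent: *
Disallow: /private/
""".splitlines())

# The excluded path is blocked...
print(rp.can_fetch("AnyBot", "http://example.com/private/report.html"))  # False
# ...and every path not mentioned is implicitly allowed.
print(rp.can_fetch("AnyBot", "http://example.com/public/index.html"))    # True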
You can block other spiders using a robots.txt file. If you want to restrict certain pages instead, you can add <meta name="robots" content="noindex,nofollow"> to those pages so that search engines won't index them or follow their links, if that's what you're looking for.
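For content where you can't edit the HTML (PDFs, images, and so on), the same directives can also be sent as an X-Robots-Tag response header, which major crawlers such as Googlebot honor. A minimal sketch using Python's standard http.server; the handler name and port are made up for illustration:

from http.server import BaseHTTPRequestHandler, HTTPServer

class NoIndexHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        # Header equivalent of <meta name="robots" content="noindex,nofollow">
        self.send_header("X-Robots-Tag", "noindex, nofollow")
        self.end_headers()
        self.wfile.write(b"<p>Kept out of search indexes.</p>")

HTTPServer(("", 8080), NoIndexHandler).serve_forever()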