Just create a file named robots.txt, put it in the root directory of your website (where index.htm/index.php is), open it with a text editor and add the following.
1st line: User-agent: *
The * means that the rules apply to all crawlers. If you want to address a specific crawler, just replace * with its name (like "googlebot").
2nd and every following line: Disallow: *something*
For example, Disallow: / blocks crawlers from all your pages.
To block a specific page like "contact.htm", put: Disallow: /contact.htm
To block a complete subdirectory like "/admin/", put: Disallow: /admin/
To allow everything, type Disallow: with nothing after it.
(Note: you can have only one rule on each line.)
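If you want to try the rules above before uploading anything, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name can_crawl, the rule strings and the crawler name "otherbot" are made up for illustration, and Python's parser is only an approximation of how a real search engine reads the file.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(rules_text, path, agent="*"):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(agent, path)


# The rule sets described in the post above (one rule per line).
BLOCK_ALL = "User-agent: *\nDisallow: /"
ALLOW_ALL = "User-agent: *\nDisallow:"
BLOCK_SOME = "User-agent: *\nDisallow: /contact.htm\nDisallow: /admin/"
ONLY_GOOGLEBOT = "User-agent: googlebot\nDisallow: /"

print(can_crawl(BLOCK_ALL, "/index.htm"))         # False: everything is blocked
print(can_crawl(ALLOW_ALL, "/index.htm"))         # True: empty Disallow allows all
print(can_crawl(BLOCK_SOME, "/contact.htm"))      # False: single page blocked
print(can_crawl(BLOCK_SOME, "/admin/login.php"))  # False: whole subdirectory blocked
print(can_crawl(BLOCK_SOME, "/index.htm"))        # True: not mentioned, so allowed
print(can_crawl(ONLY_GOOGLEBOT, "/index.htm", agent="googlebot"))  # False
print(can_crawl(ONLY_GOOGLEBOT, "/index.htm", agent="otherbot"))   # True
```

The last two lines show what replacing * with a crawler name does: the rule then applies only to that crawler, and every other crawler is unaffected.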
It may be interesting to read how Google treats robots.txt. I cannot post links yet (I just learned that), but the page is easy to find by searching for these keywords: "google robots.txt specifications"
Hi, most of the time it depends on which pages you want crawlers to access and which pages you don't.
Hi, thanks for your useful post. Here is a free online tool that lets you generate a robots.txt file: http://wgtools.com/seo-tools/robots/
This is my robots.txt; you can follow it as an example:

sitemap: http://freebestarticles.net/sitemap.xml
sitemap: http://freebestarticles.net/sitemap.xml.gz

User-agent: *
Disallow: /cgi-bin/
Disallow: /go/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /author/
Disallow: /page/
Disallow: /wp-images/
Disallow: /images/
Disallow: /backup/
Disallow: /banners/
Disallow: /archives/
Disallow: /trackback/

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: Mediapartners-Google
Allow: /

User-agent: duggmirror
Disallow: /
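To see how a file with several User-agent groups like this one behaves, here is a rough sketch that feeds a shortened copy of it to Python's urllib.robotparser. The crawler name "SomeBot" and the test paths are invented for the example, and real crawlers may interpret some details differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the robots.txt posted above (a few Disallow lines left out).
ROBOTS_TXT = """\
sitemap: http://freebestarticles.net/sitemap.xml
sitemap: http://freebestarticles.net/sitemap.xml.gz

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /images/

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: Mediapartners-Google
Allow: /

User-agent: duggmirror
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A crawler without its own group falls back to the "*" group.
print(parser.can_fetch("SomeBot", "/wp-admin/options.php"))   # False
print(parser.can_fetch("SomeBot", "/2012/some-post/"))        # True

# A crawler with its own group uses only that group, not the "*" rules,
# so Googlebot-Image is not held back by "Disallow: /images/".
print(parser.can_fetch("Googlebot-Image", "/images/photo.png"))            # True
print(parser.can_fetch("Mediapartners-Google", "/wp-admin/options.php"))   # True
print(parser.can_fetch("duggmirror", "/2012/some-post/"))                  # False

# Sitemap lines are collected separately (site_maps() needs Python 3.8+).
print(parser.site_maps())
```

The point worth noticing is the Googlebot-Image result: once a crawler has a group of its own, the Disallow lines under User-agent: * no longer apply to it.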
John, you have nicely explained how to create a robots.txt file. I will add something to it. After creating the file, upload it to the root directory and then check that you can see it. For example, type www.site.com/robots.txt into your browser and hit Enter, and you should see your robots.txt file.
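The same check can be scripted. This is only a sketch with Python's standard library: the function names robots_url and is_published are invented here, www.site.com is the placeholder from the post, and treating HTTP status 200 as "visible" is an assumption on my part.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site):
    """Build the robots.txt URL at the root of the host named in `site`."""
    if "://" not in site:
        site = "http://" + site  # assume plain http when no scheme is given
    parts = urlsplit(site)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def is_published(site, timeout=10):
    """Return True if the site's robots.txt answers with HTTP 200."""
    try:
        with urlopen(robots_url(site), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP errors such as 404, DNS failures and timeouts.
        return False
```

Calling is_published("www.site.com") requests http://www.site.com/robots.txt and returns False if the file is missing or the host cannot be reached.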
Hi, a robots.txt file is for keeping website content or internal pages from being crawled by search engines. That is the reason we use a robots.txt file. When search engines crawl a website, they first look for the robots.txt file, and after reading its instructions they crawl the website accordingly.
You can use/edit robots.txt in the GA tool, but the changes are not applied to your website. You have to edit the robots.txt file on your host, e.g. seta-international.com/robots.txt.