I was wondering: what does a 'normal' robots.txt look like? Like this?

User-agent: *
Allow: /
Sitemap: http://****.net/sitemap.xml.gz

Or like this (which is mine right now)?

User-agent: *
Disallow:
Sitemap: http://****.net/sitemap.xml.gz

Could that be the reason my site isn't showing any backlinks (although I have them) in a Google link: search?
Sorry, please accept my apologies, but I can't stop myself from asking a question: what is robots.txt used for, and what will happen if I save a robots.txt file on my website containing the code mentioned above?
Friendly spiders like Googlebot, Slurp (from Yahoo!), etc. look at your robots.txt file before crawling your site to determine which files/folders you do NOT want indexed. Unfortunately, bad bots will ignore your robots.txt file and crawl anything they feel like. By default, bots consider your entire site indexable unless you tell them otherwise in your robots.txt. All Disallow rules are paths relative to the root of your web. You cannot disallow sub-domains or particular protocols via robots.txt; only files or folders under the root of that host.
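As a sketch of what that looks like in practice (the folder names and domain here are just placeholder examples, not anyone's real site), a robots.txt that blocks a couple of private folders while leaving the rest of the site open would be:

```
User-agent: *
Disallow: /cgi-bin/
Disallow: /admin/

Sitemap: http://example.com/sitemap.xml.gz
```

An empty `Disallow:` (as in the second example above) means "nothing is disallowed", which is why it behaves the same as `Allow: /`.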
There's a good explanation of robots.txt at: http://www.google.com/support/webmasters/bin/answer.py?answer=40360&hl=en Basically, robots.txt is a request from the webmaster to a robot (such as Googlebot) not to take certain files or folders into account when crawling the site.
No, the two are different. A sitemap contains a list of all the pages on your site, so bots can easily crawl every page you have listed in it. If you want to restrict bots from some sensitive pages, that's what robots.txt is for.
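If you want to double-check how a given set of rules will be interpreted, Python's standard library ships a robots.txt parser. This is just an illustration; the rules and URLs below are made-up examples, not taken from anyone's actual site:

```python
import urllib.robotparser

# Parse a small example robots.txt directly from a list of lines
# (normally you'd call rp.set_url(...) and rp.read() against a live site).
rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# Anything outside the disallowed folder is crawlable...
print(rp.can_fetch("Googlebot", "http://example.com/index.html"))    # True

# ...while paths under /private/ are blocked for all user agents.
print(rp.can_fetch("Googlebot", "http://example.com/private/a.html"))  # False
```

Remember this only tells you what a *polite* bot will do; as noted above, badly behaved bots simply ignore the file.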