I was wondering: what does a 'normal' robots.txt look like? Like this?

User-agent: *
Allow: /
Sitemap: http://****.net/sitemap.xml.gz

Or like this (which is mine right now)?

User-agent: *
Disallow:
Sitemap: http://****.net/sitemap.xml.gz

Could that be the reason my site isn't showing any backlinks (although I have them) in a Google link: search?
Sorry, please accept my apologies, but I can't stop myself from asking a question: what is robots.txt used for, and what will happen if I save a robots.txt file on my website containing the code mentioned above?
Friendly spiders like Googlebot, Slurp (from Yahoo!), etc. look at your robots.txt file before crawling your site to determine which files/folders you do NOT want indexed. Unfortunately, bad bots will ignore your robots.txt file and crawl anything they feel like. By default, bots consider your entire site indexable unless you tell them otherwise in your robots.txt. All Disallow rules are paths relative to the root of your web. You cannot disallow sub-domains or particular protocols via robots.txt; only files or folders under the root of that host.
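As a sketch of what that looks like in practice (the folder names and domain here are just placeholder examples, not anyone's real site), a robots.txt that blocks a couple of private folders while leaving the rest of the site open would be:

```
User-agent: *
Disallow: /cgi-bin/
Disallow: /admin/

Sitemap: http://example.com/sitemap.xml.gz
```

An empty `Disallow:` (as in the second example above) means "nothing is disallowed", which is why it behaves the same as `Allow: /`.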
There's a good explanation of robots.txt at: http://www.google.com/support/webmasters/bin/answer.py?answer=40360&hl=en Basically, robots.txt is a request from the webmaster to a robot (such as Googlebot) not to take certain files or folders into account when crawling the site.
No, the two are different. A sitemap contains a list of all the pages on your site, so bots can easily crawl every page you have listed in it. If you want to restrict bots from some sensitive pages, that's what robots.txt is for.
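If you want to double-check how a given set of rules will be interpreted, Python's standard library ships a robots.txt parser. This is just an illustration; the rules and URLs below are made-up examples, not taken from anyone's actual site:

```python
import urllib.robotparser

# Parse a small example robots.txt directly from a list of lines
# (normally you'd call rp.set_url(...) and rp.read() against a live site).
rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# Anything outside the disallowed folder is crawlable...
print(rp.can_fetch("Googlebot", "http://example.com/index.html"))    # True

# ...while paths under /private/ are blocked for all user agents.
print(rp.can_fetch("Googlebot", "http://example.com/private/a.html"))  # False
```

Remember this only tells you what a *polite* bot will do; as noted above, badly behaved bots simply ignore the file.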