Is it possible?

shailendra Peon

Messages:: 1,225

Likes Received:: 18

Best Answers:: 0

Trophy Points:: 0

#1

Hello friends,

Suppose I create a robots.txt file with the following entries:

User-Agent: *
Allow: /
Disallow: /index.html

Will this stop the spider from crawling the home page, or will it still be crawled?
Someone told me that http://www.xyz.com/ and http://www.xyz.com/index are two different URLs.

Thanks & Regards
Shailendra

shailendra, Jan 29, 2009

manish.chauhan Well-Known Member

Messages:: 1,682

Likes Received:: 35

Best Answers:: 0

Trophy Points:: 110

#2

Yes, it will stop the spider from crawling the home page when it is requested as /index.html.

xyz.com and xyz.com/index.html are physically the same page; however, Google treats them as two different pages.
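
As a rough illustration of the point above, here is a minimal Python sketch of longest-match rule precedence, the behaviour Google documents for its crawler and that RFC 9309 specifies. The `RULES` list and the `is_allowed` helper are hypothetical names made up for this example, wildcards are not handled, and other crawlers may apply rules in file order instead, so treat it as a sketch rather than how any particular spider is implemented.

```python
from urllib.parse import urlsplit

# The rules from the robots.txt quoted in this thread (User-Agent: *).
RULES = [("allow", "/"), ("disallow", "/index.html")]


def is_allowed(url, rules=RULES):
    """Return True if `url` may be crawled under longest-match precedence.

    The matching rule with the longest path prefix wins; on a tie in
    length, "allow" wins. If no rule matches, the URL is allowed.
    """
    path = urlsplit(url).path or "/"
    best_len = -1
    allowed = True
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    for url in ("http://www.xyz.com/",
                "http://www.xyz.com/index",
                "http://www.xyz.com/index.html"):
        print(url, "->", "allowed" if is_allowed(url) else "blocked")
```

Under these assumptions only the /index.html URL is blocked, while / and /index stay crawlable.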

manish.chauhan, Jan 29, 2009

shailendra Peon

#3

Yes, they are physically the same page. But if it stops the home page from being crawled, why do we get two entries in the sitemap for the home page, i.e. one without index.html and one with index.html? How does PR get distributed between the two?

shailendra, Jan 29, 2009

manish.chauhan Well-Known Member

#4

Google considers these to be two separate pages, so your PR is also split between them.

To avoid this, I suggest you canonicalize your website's URLs.

manish.chauhan, Jan 29, 2009

shailendra Peon

#5

I have done that... but will what I wrote in the robots.txt file do any good?

shailendra, Jan 29, 2009

manish.chauhan Well-Known Member

#6

Sorry??

manish.chauhan, Jan 29, 2009

ggmittal Guest

Messages:: 27

Likes Received:: 1

Best Answers:: 0

Trophy Points:: 0

#7

shailendra said: ↑

Hello friends,

Suppose I create a robots.txt file with the following entries:

User-Agent: *
Allow: /
Disallow: /index.html

Hello friend... I did the same with my website, and the result was that Google showed an error saying it was unable to crawl the home page. The best solution to this problem is to edit .htaccess and redirect yoursite.com/index.html to yoursite.com. It will definitely work.
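
To make the redirect idea concrete, here is a minimal Python sketch of the mapping such a rule would perform: any URL whose last path segment is index.html is sent to the same URL without it. The helper name `redirect_target` and the `INDEX_NAMES` tuple are assumptions for illustration only; on a real Apache site the redirect itself would live in .htaccess as suggested above, and this code just models which URL maps to which.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: index.html is the only directory index file name in use.
INDEX_NAMES = ("index.html",)


def redirect_target(url):
    """Return the canonical URL to redirect to, or None if no redirect is needed.

    A URL needs a redirect when its last path segment is a directory
    index file name; the target is the same URL with that segment dropped.
    """
    parts = urlsplit(url)
    head, _, last = parts.path.rpartition("/")
    if last not in INDEX_NAMES:
        return None
    return urlunsplit(
        (parts.scheme, parts.netloc, head + "/", parts.query, parts.fragment)
    )


if __name__ == "__main__":
    for url in ("http://yoursite.com/index.html", "http://yoursite.com/"):
        print(url, "->", redirect_target(url))
```

With a single canonical URL for the home page, there is no need to block /index.html in robots.txt at all.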

ggmittal, Feb 13, 2009
