I have been trying to find an answer to this question and I am still left wondering. Does http://site.com count as a webpage, and would http://site.com/index.html be counted as duplicate content? When I run a site: query on a few of my sites, Google does not list http://site.com/index.html or http://site.com/index.php. It shows up as http://site.com/ in the results. So, when I am creating a sitemap for my site, would it be good to list just http://site.com and not http://site.com/index.html in it? It has me confused. This one:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
  <url>
    <loc>http://www.site.com/</loc>
    <priority>0.1</priority>
    <lastmod>2007-05-15T16:22:21+00:00</lastmod>
    <changefreq>yearly</changefreq>
  </url>
</urlset>

or this one?

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
  <url>
    <loc>http://www.site.com/</loc>
    <priority>0.1</priority>
    <lastmod>2007-05-15T16:22:21+00:00</lastmod>
    <changefreq>yearly</changefreq>
  </url>
  <url>
    <loc>http://www.site.com/index.html</loc>
    <priority>0.1</priority>
    <lastmod>2007-05-15T16:22:21+00:00</lastmod>
    <changefreq>weekly</changefreq>
  </url>
</urlset>
Well, it does not really matter, but it is better not to reference more than one URL that has the same content. That means don't use the index.html version in your sitemap. It won't kill you if you do. It's best to just use an automated sitemap generator like www.xml-sitemaps.com to do all the work for you. In short, YES, they are the exact same page. Your web host is set to serve the index.html file as the default for the root of the folder, in this case site.com. When you look at the stats in Google Sitemaps, you will often see some inbound links that use the full URL of the specific page, so Google does treat such a link differently from a direct link to the www or non-www version of your site.
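If you would rather script it than rely on a generator, the idea above (list each page once, under its folder URL) could be sketched roughly like this in Python. This is only an illustration: the names `INDEX_NAMES`, `canonical_url` and `build_sitemap` are made up for this example, and the list of default file names is an assumption that depends on how your host is configured. It also does nothing about the www versus non-www question.

```python
from urllib.parse import urlsplit, urlunsplit
from xml.sax.saxutils import escape

# Assumed default-document names; adjust to match your host's configuration.
INDEX_NAMES = ("index.html", "index.htm", "index.php")


def canonical_url(url):
    """Collapse a trailing default document (e.g. /index.html) to its folder URL."""
    parts = urlsplit(url)
    path = parts.path or "/"  # treat a bare host as the root folder
    head, _, tail = path.rpartition("/")
    if tail.lower() in INDEX_NAMES:
        path = head + "/"
    # Fragments are dropped; they are not meaningful in a sitemap.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def build_sitemap(urls):
    """Return sitemap XML listing each canonical URL once, in first-seen order."""
    seen = []
    for url in urls:
        canon = canonical_url(url)
        if canon not in seen:
            seen.append(canon)
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for canon in seen:
        lines.append("  <url><loc>%s</loc></url>" % escape(canon))
    lines.append("</urlset>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_sitemap([
        "http://www.site.com/",
        "http://www.site.com/index.html",
    ]))
```

Fed both http://www.site.com/ and http://www.site.com/index.html, this emits a single url entry for the root. The optional priority, lastmod and changefreq elements are left out to keep the sketch short.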
I do know that www.site.com is not site.com and is counted as a separate page with duplicate content. My question was more about whether site.com/index.html is the same as site.com in Google's eyes. Just to restate what you said to me: site.com/index.html will look like site.com in Google's eyes and will count as duplicate content, so it is best not to put index.html in the sitemap. The site has only two pages, so I did not need a sitemap creator for it. That would be a bit overkill for a two-page site.
It is best to focus on building up site.com rather than site.com/index.html. When linking back to your homepage, and when having other sites link back to you, use absolute links: write <a href="http://site.com"> and not <a href="http://site.com/index.html">. This will help site.com appear in search engine rankings. Don't worry about site.com/index.html being duplicate content (is it duplicate content??), just focus on site.com.
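To check your own pages for links that still point at the index file, something along these lines might help. It is a rough sketch using Python's standard html.parser module; `IndexLinkFinder`, `find_index_links` and the `INDEX_NAMES` list are hypothetical names and assumptions made for this example, not part of any tool mentioned in this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumed default-document names; adjust to match your host's configuration.
INDEX_NAMES = ("index.html", "index.htm", "index.php")


class IndexLinkFinder(HTMLParser):
    """Collect href values of <a> tags whose path ends in a default document."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Last path segment, e.g. "index.html" for "/index.html".
                last = urlsplit(value).path.rpartition("/")[2]
                if last.lower() in INDEX_NAMES:
                    self.found.append(value)


def find_index_links(html):
    """Return the hrefs in the given HTML that point at an index file."""
    finder = IndexLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.found


if __name__ == "__main__":
    page = (
        '<a href="http://site.com/index.html">Home</a> '
        '<a href="http://site.com">Home</a>'
    )
    print(find_index_links(page))
```

Any href it reports is a candidate for rewriting to the plain folder URL, such as http://site.com.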
No, it is not duplicate content, except in the sense that it covers the same topic as other websites and might bear some resemblance to them. Backlink building is a must, I understand. Also, there is no way I would link to index.html over site.com. I went ahead and left index.html out of the sitemap, so now it just lists site.com. Google was fine with that: the site is in 39th place in Google for the keywords. After I get some backlinks, I think it will do just fine. The good thing is that I bought the domain, created the sitemap and submitted it to Google on the first day, and two days later it was in the search results. So, no index.html for me. Still waiting on Yahoo and MSN to pick it up. I know that is going to take longer.