I heard someone say that if you type www.yourdomain.com/index.html you should not see yourdomain.com. He said that it could be counted as duplicate content. Is this correct, and if so, how can I fix it? He mentioned something about htaccess and re-running a mod, whatever that means. Can someone please explain what he was talking about? Thanks!
What this person is talking about is canonicalization. Basically, the idea is to show visitors and search engines only one version of any page on your site. Search engines see the following as all being "different" URLs:

domain.com/
domain.com/index.html
www.domain.com/
www.domain.com/index.html

So what you want to do is pick one and stick with it. I recommend www.domain.com/, as this is the way most people are going to link to your site.

The bit about htaccess is setting up 301 redirects for the other options so that the URL is always rewritten to your chosen format. This ensures that when someone links to your site using one of the other options, the person clicking the link is sent to the proper version. It also helps ensure that all the link juice from that link is passed to the proper place. What you want to do is set up a mod rewrite rule in the htaccess file.

Additionally, you may want to include the canonical link tag (<link rel="canonical">) in your document head. This will cue Google and other search engines as to which version you prefer them to index, and can help with dupe content problems.

For more information on mod rewrite and canonicalization, check out some of these articles.

Canonicalization:
http://en.wikipedia.org/wiki/Canonicalization
http://www.mattcutts.com/blog/seo-advice-url-canonicalization/
http://www.seobook.com/canonicalization-missing-manual

Mod rewrite:
http://www.modrewrite.co.uk/mod-rewrite/canonical-urls-with-mod-rewrite.html
http://httpd.apache.org/docs/2.0/misc/rewriteguide.html

Hope that helps answer your question.
How does one accomplish the redirects? Are we talking about creating redirects in the domain manager of your host? For example, I am using cPanel to manage my host. Is this where I would manage this?
If you are hosted on an Apache web server (Linux or some other flavor of Unix), then you likely have access to Mod Rewrite and several other tools that could be used to accomplish this. I prefer Mod Rewrite. It requires learning about regular expressions, but it's not that difficult once you get the hang of it. There are lots of books that you can pick up at Borders or Barnes & Noble to learn this... Or you can fumble through the Apache docs online. I would look at the 1.3 docs for an explanation of the order in which RewriteCond and RewriteRule directives are executed... I tried to explain this in a recent post here.

For example, if I chose http://www.example.com/ as my canonical URL in the post above, I could create a .htaccess file for Mod Rewrite and place it in the root of the site that looked something like this:

DISCLAIMER: I haven't tested this... Just winged it off the top of my head... but it should be close. It should take care of the 3 redirects I mentioned above for the home page AND any folder or subfolder that has an index.html in it as the default document.
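For anyone following along, here is a minimal, untested sketch of what a .htaccess file along these lines might look like. It is an illustration only: it assumes Apache with mod_rewrite enabled, www.example.com as the chosen canonical host, and index.html as the default document in each folder. Swap in your own domain before trying it.

```apache
# Sketch only: assumes mod_rewrite is available and .htaccess overrides are allowed.
RewriteEngine On

# 1) Redirect explicit requests for index.html (in the root or any folder)
#    to the folder URL on the canonical host.
#    THE_REQUEST holds the original request line (e.g. "GET /index.html HTTP/1.1"),
#    so the internal DirectoryIndex lookup for "/" does not trigger a redirect loop.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^?\s]*/)?index\.html[?\s]
RewriteRule ^(.*/)?index\.html$ http://www.example.com/$1 [R=301,L]

# 2) Redirect the non-www host to the www host, keeping the requested path.
#    (Assumes the Host header carries no port number.)
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

The index.html rule comes first and points straight at the www host, so a request for example.com/index.html should reach http://www.example.com/ in a single redirect rather than two.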
Simply add: <link rel="canonical" href="http://anhblog.net/"> Replace http://anhblog.net/ with your own URL. It helps Googlebot avoid marking the page as a duplicate version.
<link rel="canonical" href="http://www.example.com/"> in the <head> of your home page HTML will fix the canonical issues... Ummm... at Google, Yahoo!, and MSN/Live/Bing. But what about all of those other less sophisticated engines that don't support it? You're screwed.

<link rel="canonical"> was designed for sites that don't have access to something like Mod Rewrite or server-side scripting to implement 301 redirects, like those running pure HTML sites on IIS, or for very large ecommerce sites where it would be very difficult to implement canonical URLs w/ 301 redirects because they have LOTS of query string parameters (and the same query string name/value pairs in a different order are seen as different URLs even though they are rendering the same page). I heard Matt Cutts say this in person at Pubcon in November 2008, and that the new <link rel="canonical"> element should be used basically as a last resort.

Besides only being supported by basically 3 engines, another major problem w/ <link rel="canonical"> is that it still shows the non-canonical URL in the browser. How do you think most webmasters who want to link to a page on your site get the URL for the link? They either: 1) go to your site, navigate to the URL they want to link to, copy the address out of the browser, and paste it into an <a href="paste it here"> element on their site, OR 2) follow a link from another site to the URL on your site they want to link to, copy the address out of the browser, and paste it into an <a href="paste it here"> element on their site. Depending on how they navigate to the page or which link they follow to the site, they may copy a canonical or non-canonical URL to make their link. By NOT implementing 301 redirects and using <link rel="canonical"> instead, you are perpetuating the use of non-canonical URLs because you continue to show them in the browser address bar.

301 redirects solve this problem. No matter which link they follow on your site or another, the canonical URL is ALWAYS displayed in their browser, so going forward almost every link to your site will likely be created with the canonical URL. If a link does use a non-canonical URL, the 301 redirect changes the address in the browser so the visitor still sees the canonical one. 301 redirects are STILL the preferred way to implement canonical URLs. You should learn to do it the right way instead of taking the easy way out by using something meant for those who cannot implement 301s.

What are you going to do when you move a page from URLA to URLB? Putting <link rel="canonical" href="URLB"> in the head of URLA might give URLB credit for the inbound links, but it is NOT going to redirect the user from URLA to URLB. They will still see the old page... So credit for inbound links to URLA is transferred to URLB... Great! But the browser is still showing the old URLA AND the content displayed will be that of URLA, NOT the new page at URLB. If you implement a 301 redirect when a page moves from URLA to URLB, not only does URLB get credit for all of URLA's inbound links, but URLB is ALWAYS shown in the browser address bar regardless of whether URLA or URLB is requested.

Learn to do 301 redirects... It's the proper way to fix canonical issues. Use <link rel="canonical"> ONLY as a last resort.
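To make the URLA-to-URLB case concrete, a moved page could be handled with a rule along these lines. This is a hedged sketch, not a tested configuration: old-page.html and new-page.html are hypothetical file names standing in for URLA and URLB, and it assumes mod_rewrite in the .htaccess at the site root.

```apache
# Sketch only: old-page.html (URLA) and new-page.html (URLB) are hypothetical names.
RewriteEngine On

# Permanently redirect the old URL to the new one, so both visitors and
# search engines end up on the new address.
RewriteRule ^old-page\.html$ http://www.example.com/new-page.html [R=301,L]
```

With a rule like this, anyone requesting the old address should be sent to the new page and see the new URL in the address bar, which is exactly what <link rel="canonical"> on its own cannot do.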
Canonical, repped you +++. Excellent info; this is the best thread for understanding redirect and canonical issues. I have bookmarked it too.
Thank you very much for explaining it to me. If I am not mistaken, I need to add about 2 lines to my htaccess file. Could someone please tell me exactly what I should write for http://www.footballreporter.co.uk? Also, I need to know that the index pages I have in other folders on that site will not be redirected to the home page of the site. Thanks!
It looks like you are already redirecting requests for:

http://footballreporter.co.uk/index.html
http://www.footballreporter.co.uk/index.html

to:

http://www.footballreporter.co.uk/

However, when someone requests a non-www URL (for example, http://footballreporter.co.uk) you are still rendering the page under the non-www URL. You should be 301 redirecting all such requests to the www version of the URL. So I would recommend adding the following to the bottom of the .htaccess in the root of your web site:

Now if someone requests (I used example.co.uk so you wouldn't end up w/ a bunch of 404s in your Google WMT):

http://example.co.uk
http://example.co.uk/folder/
http://example.co.uk/folder/subfolder
http://example.co.uk/page.html
http://example.co.uk/folder/page.html
http://example.co.uk/folder/subfolder/page.html

they will be 301 redirected to:

http://www.example.co.uk
http://www.example.co.uk/folder/
http://www.example.co.uk/folder/subfolder
http://www.example.co.uk/page.html
http://www.example.co.uk/folder/page.html
http://www.example.co.uk/folder/subfolder/page.html

respectively.
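A non-www to www redirect of the kind described above is commonly written along these lines. Treat it as an untested sketch: it assumes mod_rewrite is enabled, uses example.co.uk as a stand-in for the real domain, and is an illustration, not the exact lines from the post.

```apache
# Sketch only: replace example.co.uk with your own domain.
RewriteEngine On

# If the request came in on the bare (non-www) host...
RewriteCond %{HTTP_HOST} ^example\.co\.uk$ [NC]
# ...301 redirect to the same path on the www host.
RewriteRule ^(.*)$ http://www.example.co.uk/$1 [R=301,L]
```

Because the requested path is captured and carried over in $1, index pages in other folders should be redirected to the same folder on the www host, not to the home page.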
Thank you for all the information, especially a big thanks to Canonical for the detailed explanations. I spoke with some nice people at an SEO company called High Position at a fair and they helped me solve the index problem, but then I decided that I wanted all the options to go to http://www and as I could not figure it out myself, the help I got here was exactly what I needed and much appreciated. So once again thank you very much. I hope other readers will find this useful too.