How to stop Google from indexing https URLs as duplicates in WordPress since adding SSL

Discussion in 'robots.txt' started by AlwaysThinking, May 9, 2011.

  1. #1
    I added SSL to my WordPress blog because I wanted a dedicated IP address for my site. Now I notice Google has started indexing posts under both http and https. Can someone please help me force Google not to index the https versions? I'm fairly sure it amounts to duplicate content. All help is appreciated.
     
    AlwaysThinking, May 9, 2011
  2. manish.chauhan

    #2
    manish.chauhan, May 18, 2011
  3. norprowebstore

    #3
    Disallow: /https, or something like that?
     
    norprowebstore, May 19, 2011
  4. mxicoders

    #4
    You can add that page to robots.txt with Disallow: /https followed by the page name. That should stop Google from crawling or showing it.
     
    mxicoders, May 23, 2011
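
For reference, robots.txt rules match URL paths rather than protocols, so a rule like Disallow: /https blocks paths that begin with /https, not the https copies of pages. A minimal Python sketch using the standard library's urllib.robotparser illustrates this. The domain and page names are placeholders, not taken from the thread, and a real crawler may interpret rules somewhat differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The rule suggested above, applied to every crawler.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /https",
])


def is_allowed(url: str) -> bool:
    """Return True if the parsed rules let a crawler fetch the URL."""
    return rules.can_fetch("Googlebot", url)


# The https copy of a page has the path /mypage.html, which does not
# start with /https, so the rule leaves it crawlable.
https_copy_allowed = is_allowed("https://www.mydomain.com/mypage.html")

# A hypothetical page whose *path* starts with /https is what the rule hits.
https_path_allowed = is_allowed("http://www.mydomain.com/https-guide.html")
```

In this sketch the rule leaves the https duplicate untouched and only blocks a page that happens to be named /https-guide.html.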
  5. thejimy

    #5
    As Google has stated in the past, there is no penalty for duplicate content on your own domain. Their engine will pick whichever URL among the duplicates seems most appropriate, but your page ranking is definitely not lowered, as many believe or claim.

    The solution presented on the domain linked above, using two robots.txt files with mod_rewrite, can do more damage to your site than doing nothing and letting the Google crawler follow both the http and https versions and work out the appropriate one on its own. Don't use Google's removal tool either: as stated in Google Webmaster Central, it will remove ALL your URLs from their index, https and http alike! The same can happen with any other noindex solution, such as the robots meta tag with noindex in your HTML code. Putting up a block is NOT the way to help the search engine understand a site in this special case.

    A better approach is the following.
    Using the canonical tag like this has no side effects:
    <link rel="canonical" href="ht_tp://www.mydomain.com/mypage.html" />
    Have all your https documents point to the http URL instead. Keep in mind that you can't control where people link, whether they choose your http or your https URL. The canonical tag, which is comparable to a 301 redirect (which can't be used here), lets Google credit any links pointing at your https pages to your http URL instead. No other solution has this benefit.
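
As an illustration of the idea only, here is a minimal Python sketch that builds such a tag for any requested URL by forcing the scheme to http. The function name canonical_link_tag and the example domain are hypothetical. A WordPress site would do this in its theme or through a plugin rather than in Python.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url: str) -> str:
    """Build a canonical <link> tag that points at the http version of url."""
    parts = urlsplit(url)
    # Keep host, path and query; force the scheme to http and drop any fragment.
    canonical = urlunsplit(("http", parts.netloc, parts.path or "/", parts.query, ""))
    # Escape the URL so characters such as & are valid inside the attribute.
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}" />'


print(canonical_link_tag("https://www.mydomain.com/mypage.html"))
```

The same tag comes out whether the page was requested over http or https, which is the point: both copies name a single preferred URL.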

    This also matches Google's suggestions: googlewebmastercentral.blogspot.com/2009/10/reunifying-duplicate-content-on-your.html
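
One way to check that a page really carries a canonical tag is to parse its HTML and read the href back. The sketch below uses Python's standard html.parser on a hard-coded sample page. The names CanonicalFinder and find_canonical are made up for illustration, and fetching the live page is left out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel may hold several space-separated tokens.
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html_text: str):
    """Return the canonical URL declared in html_text, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


sample_page = (
    "<html><head>"
    '<link rel="canonical" href="http://www.mydomain.com/mypage.html" />'
    "</head><body>Post text</body></html>"
)
declared = find_canonical(sample_page)
# The https copy should declare the http URL as its canonical version.
points_to_http = declared is not None and declared.startswith("http://")
```

Running the same check against the https copy of each post would show whether every duplicate names the http URL.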

    (Sorry, I had to write ht_tp because I'm not yet allowed to post links, but the post didn't seem useful without pointing to a public source that confirms this. I hope that is OK.)
     
    thejimy, May 29, 2011