How to Block HTML HTTPS pages?

Discussion in 'Google' started by jameswatt, Jun 13, 2008.

  1. #1
    Hi folks,

    My site has a few pages served over HTTPS and the rest over HTTP.
    The problem is that the footer of my HTTPS pages links to certain pages, such as the home page, the sitemap page and the services page. As a result, Google has indexed my domain as https://www.mydomain.com, and Yahoo and MSN have indexed certain static pages that are not linked in the footer of my HTTPS pages.

    For example:
    Pages linked in the footer of the HTTPS pages:
    https://www.mydomain.com/index.php
    https://www.mydomain.com/services.php
    https://www.mydomain.com/sitemap.html

    Pages that are not linked in the footer of the HTTPS pages but were still indexed by Yahoo and MSN:
    https://www.mydomain.com/ihome.php
    https://www.mydomain.com/toolresource.html
    https://www.mydomain.com/insertm.html

    If you click on any of the pages above, it redirects to the corresponding HTTP page with a 302 redirect.

    Now the big question: how did the search engines (Yahoo and MSN) index static HTML pages under HTTPS when none of my pages link to them?

    How can I remove pages such as https://www.mydomain.com/insertm.html from the index using the robots.txt file or the .htaccess file?

    Questions:
    How can I prevent the HTTPS versions of my pages from being indexed, excluding my landing page?
    What should I do to stop my main domain from being crawled over HTTPS (i.e. https://www.mydomain.com)?



    It would be great if someone could help me with this!
     
    jameswatt, Jun 13, 2008
  2. webcosmo

    #2
    The best and easiest way to do this: if a URL uses https, rewrite it to the http version and issue a 301 permanent redirect.

    There are different ways of doing this, depending on the server and programming language you are using. Search for "301 permanent redirect" and you will find plenty of resources.
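
    As a rough illustration of the idea, here is a minimal Python sketch of the decision behind such a redirect. It is not tied to any particular server: the function name `redirect_target` and the `HTTPS_ONLY_PATHS` set (with its example path) are made up for this example, and the sketch assumes the site uses default ports.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical allow-list: paths that should stay on HTTPS (e.g. a landing page).
HTTPS_ONLY_PATHS = {"/secure-landing.php"}


def redirect_target(url):
    """Return (status, location) for a request URL, or None if no redirect is needed."""
    parts = urlsplit(url)
    # Leave plain HTTP requests and the allow-listed HTTPS pages alone.
    if parts.scheme != "https" or parts.path in HTTPS_ONLY_PATHS:
        return None
    # Swap only the scheme; host, path and query string are kept.
    # Using hostname drops any explicit port (assumption: default ports).
    host = parts.hostname or ""
    http_url = urlunsplit(("http", host, parts.path, parts.query, ""))
    # 301 tells crawlers the move is permanent.
    return 301, http_url
```

    In a real setup the same rule would more likely live in the web server configuration or in the site's own script; the sketch only shows which URLs get a 301 and where they point.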
     
    webcosmo, Jun 13, 2008
  3. Freewebspace

    #3
    Block the Google bot using robots.txt.
     
    Freewebspace, Jun 13, 2008
  4. vagrant

    #4
    302 redirects should be used only for temporary redirections; you should use a 301.

    Also, as already mentioned, block the HTTPS pages using robots.txt or meta tags.
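
    Crawlers request robots.txt separately for each protocol and host, so the HTTPS side of a site can be given its own rules. Below is a minimal Python sketch of that idea, assuming the site's script knows which scheme a request came in on. The function names `robots_txt_for` and `robots_meta_tag` are hypothetical, and the sketch blocks all HTTPS pages; exempting a landing page would need an extra rule that is not shown here.

```python
# Assumed policy: HTTP is fully crawlable, HTTPS is fully blocked.
HTTP_ROBOTS = "User-agent: *\nDisallow:\n"      # empty Disallow: nothing is blocked
HTTPS_ROBOTS = "User-agent: *\nDisallow: /\n"   # "/" blocks every path


def robots_txt_for(scheme):
    """Return the robots.txt body to serve for the given request scheme."""
    return HTTPS_ROBOTS if scheme.lower() == "https" else HTTP_ROBOTS


def robots_meta_tag(scheme):
    """Return a noindex meta tag for HTTPS pages, or an empty string for HTTP."""
    if scheme.lower() == "https":
        return '<meta name="robots" content="noindex">'
    return ""
```

    The robots.txt rule stops crawling, while the meta tag asks engines not to index a page they have already fetched.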
     
    vagrant, Jun 13, 2008
  5. jameswatt

    #5
    I have used a script so that only a few of my pages are shown over HTTPS, but the question is how the search engines crawled other pages that are not linked.
     
    jameswatt, Jun 15, 2008