Hi folks,

My site has a few pages served over HTTPS and the rest over HTTP. The problem is that the footer of the HTTPS pages links to certain pages, such as the home page, the sitemap page and the services page. As a result, Google has indexed my domain as https://www.mydomain.com, and Yahoo and MSN have indexed certain static pages that are not linked in the footer of my HTTPS pages.

For example, pages linked in the footer of the HTTPS pages:

https://www.mydomain.com/index.php
https://www.mydomain.com/services.php
https://www.mydomain.com/sitemap.html

Pages that are not linked in the footer of the HTTPS pages but still got indexed by Yahoo and MSN:

https://www.mydomain.com/ihome.php
https://www.mydomain.com/toolresource.html
https://www.mydomain.com/insertm.html

If you click on any of the above pages, it redirects to its HTTP version with a 302 redirect.

Now the big question: how have the search engines (Yahoo and MSN) indexed static HTML pages under HTTPS when none of my pages link to them? And how can I remove pages like https://www.mydomain.com/insertm.html from the index using the robots.txt file or the .htaccess file?

Questions:

How can I prevent indexing of the HTTPS versions of my pages, excluding my landing page?
What should I do to stop crawling of my main domain over HTTPS (i.e. https://www.mydomain.com)?

It would be great if someone could help me with this!
The best and easiest way of doing this: if a URL contains https, rewrite it to the http version and do a 301 permanent redirect. There are different ways of doing this depending on what kind of server and programming language you are using. Search for "301 permanent redirect" and you will find a lot of resources.
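To make the rule above concrete, here is a minimal sketch of the redirect decision in Python. It is only an illustration of the logic, not what the poster's server actually runs. The function name `https_to_http_redirect` and the `KEEP_HTTPS` set (with its example path) are hypothetical, and the same rule would normally be written in whatever the server uses, such as .htaccess or PHP.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical: paths that should stay on HTTPS (e.g. a secure landing page).
KEEP_HTTPS = {"/secure-landing.php"}


def https_to_http_redirect(url):
    """Return (301, target_url) if an HTTPS URL should be permanently
    redirected to its HTTP version, or None if no redirect is needed."""
    parts = urlsplit(url)
    # Leave plain HTTP requests and the whitelisted HTTPS pages alone.
    if parts.scheme != "https" or parts.path in KEEP_HTTPS:
        return None
    # Same host, path and query string; only the scheme changes.
    target = urlunsplit(("http", parts.netloc, parts.path,
                         parts.query, parts.fragment))
    return 301, target


print(https_to_http_redirect("https://www.mydomain.com/insertm.html"))
# (301, 'http://www.mydomain.com/insertm.html')
```

The point of returning 301 rather than 302 is that a permanent redirect tells the search engine to replace the HTTPS URL with the HTTP one in its index.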
302 redirects should only be used for temporary redirections; you should use 301 here. Also, as already said, block the HTTPS versions using robots.txt or meta tags.
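As a rough sketch of the robots.txt suggestion, the Python snippet below checks a candidate rule set against the URLs from the question using the standard library's `urllib.robotparser`. The rules themselves are an assumption: they treat /index.php as the landing page to keep crawlable, and they only make sense if this file is served for HTTPS requests only, while the HTTP site keeps its normal robots.txt. The helper name `https_crawl_allowed` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, assumed to be served ONLY for HTTPS requests.
# Assumption: /index.php is the landing page that should stay crawlable.
HTTPS_ROBOTS_TXT = """\
User-agent: *
Allow: /index.php
Disallow: /
"""


def https_crawl_allowed(url):
    """Return True if the rules above let a generic crawler fetch the URL."""
    parser = RobotFileParser()
    parser.parse(HTTPS_ROBOTS_TXT.splitlines())
    return parser.can_fetch("*", url)


print(https_crawl_allowed("https://www.mydomain.com/index.php"))     # True
print(https_crawl_allowed("https://www.mydomain.com/insertm.html"))  # False
```

Keep in mind that a crawler blocked by robots.txt does not fetch the page at all, so it would not see a 301 redirect on that URL either. Which of the two approaches fits better depends on the setup.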
I have used a script so that only a few of my pages are served over HTTPS, but my question is how the search engines have crawled the other pages, which are not linked.