Hi All, We are facing a strange problem while removing URLs from the Google index. We are working on a site with a search feature that generates duplicate URLs of static product pages. To avoid this duplication we have blocked search engine robots from accessing any URL generated by the search feature. Now we have to remove the URLs that were generated before we added the robots.txt rule, since they are already in the Google index and creating duplication. We are using the inurl: Google operator to find those URLs and removing them with Google Webmaster Tools. It's fine so far. The problem is that there are thousands of such URLs in the Google index for that search query, but Google does not show them all; it only shows 200-300 per day. We want to get rid of them ASAP so that we can improve the ranking for our target keywords. Is there any way (any tool) to find all the URLs for that search query so that we can remove them at once?
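For reference, a minimal robots.txt sketch of the kind of blocking described above. The /search path and the q parameter are assumptions here; adjust them to match how your site actually builds search URLs.

```text
# robots.txt -- blocks crawlers from internal search result pages.
# "/search" and the "q" parameter are assumed examples.
User-agent: *
Disallow: /search
Disallow: /*?q=
```

Googlebot supports the * wildcard in Disallow rules, so the second line catches search query strings on any path.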
Maybe you should simply tell Google to ignore the URL parameter used to generate the search pages? You can set this up in Webmaster Tools under URL Parameters.
Hi Ingame, thanks for the quick reply. We have now blocked the new URLs generated by search queries. But we still have to remove the URLs generated before that, which are already in the Google index. Will this solution work for those URLs as well?
In theory Google should ignore everything that comes after the parameter, so duplicate content shouldn't be an issue. I'm not sure whether these URLs will get removed from the index, though. Alternatively, you can add a noindex tag in the head of those pages - this should remove the URLs from the Google index, but it will take some time for Google to recrawl the pages before the change takes effect.
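The noindex tag mentioned above is a single line in the page's head:

```html
<!-- Placed in the <head> of each search-result page. Tells Google to
     drop the page from its index the next time it is recrawled;
     "follow" still lets link equity pass through. -->
<meta name="robots" content="noindex, follow">
```

One caveat worth noting: Googlebot can only see this tag if it is allowed to crawl the page, so the noindex approach will not work on URLs that are simultaneously blocked in robots.txt.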
Could you not do a 301 redirect on the duplicate URLs? This would only be easy if the duplicate URLs had a consistent form, i.e. ../socks/. This should give Google additional information.
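If the site runs on Apache, the 301 approach could look something like this .htaccess sketch. It assumes the duplicates are just the canonical product URL plus a q query string; both the parameter name and the rewrite conditions are assumptions to adapt.

```apacheconf
# .htaccess sketch -- 301-redirect search-generated duplicates to the
# clean URL. Assumes the duplicate is the same path plus a "q" query.
RewriteEngine On
RewriteCond %{QUERY_STRING} (^|&)q= [NC]
# The trailing "?" strips the query string from the target URL.
RewriteRule ^(.*)$ /$1? [R=301,L]
```

A 301 also consolidates any ranking signals the duplicate URLs have accumulated onto the canonical pages, which neither robots.txt blocking nor URL removal does.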
Hi Guys, my problem is removing the old URLs from the index immediately. We have already blocked Google from accessing the new URLs. Google is also removing the old URLs automatically, but as you said, that takes time, and we need to do this ASAP.
Having a similar issue here. I've been stuck for 3 months with an old blog that is still indexed and diluting all my efforts with the new one. I have blocked it via robots.txt and also submitted removal requests through Google Webmaster Tools. Still, I have 24 pages that show Not found, and my site is ranking on the last page of Google.