Hello. I was just looking through Google Webmaster Tools and found 21 crawl errors. All of them were "not found" errors, but surprisingly they were for URLs that don't even exist on my blog. I then checked the pages listed as linking to the non-existent URLs. I couldn't find any such links on them, but when I opened this page:

http://www.mixupdate.com/entertainment/aisha-movie-review/

and viewed the page source, I saw this:

var disqus_url = 'http://www.mixupdate.com/entertainment/aisha-movie-review/';
var disqus_identifier = '6 http://www.mixupdate.com/wordpress/?p=6';
var disqus_container_id = 'disqus_thread';
var disqus_domain = 'disqus.com';
var disqus_shortname = 'mixupdate';
var disqus_title = "Aisha Movie Review";

The URL inside disqus_identifier (http://www.mixupdate.com/wordpress/?p=6) is the one that was giving the not-found error. I don't know why the page has these links, but I did have mixupdate.com/wordpress as the address of my blog for one day when I had just started blogging. How can I fix this and remove this URL from the page source? It looks like a problem with Disqus to me, though.
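To check other pages for the same leftover address, a small script can scan page source for URLs under the old path. The following is only a rough sketch: the helper name `find_stale_urls` and the choice of `http://www.mixupdate.com/wordpress/` as the stale prefix are assumptions based on the post above, and the sample string is an excerpt of the quoted page source rather than a live fetch.

```python
import re

# Matches http/https URLs up to the next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>]+")


def find_stale_urls(page_source, stale_prefix="http://www.mixupdate.com/wordpress/"):
    """Return every URL in page_source that starts with stale_prefix.

    Hypothetical helper: the default prefix is an assumption taken from the
    old blog address mentioned in the post.
    """
    return [
        url
        for url in URL_PATTERN.findall(page_source)
        if url.startswith(stale_prefix)
    ]


# Excerpt of the page source quoted in the post.
sample = """
var disqus_url = 'http://www.mixupdate.com/entertainment/aisha-movie-review/';
var disqus_identifier = '6 http://www.mixupdate.com/wordpress/?p=6';
"""

stale = find_stale_urls(sample)
print(stale)
```

Running this over the saved source of each page listed in the crawl-error report would show which pages still carry the old address.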
There is an option in Google Webmaster Tools for URL removal. You can submit a removal request; I have done it and got a response within 24 hours. In Webmaster Tools, go to Site configuration >> Crawler access >> Remove URL >> New removal request and submit your URL.
Well, that's nice, but the thing is that the link source remains. Google will one day crawl the page linking to the 404 page again and add it back to the Not Found list. Why don't we remove the address from the source page, as I mentioned?
Well, if you have access to the source page, then you can remove the URL from there. But if you have no access to that page and are afraid that Google will find the link again in the future, what can the solution be? Can't you update your robots.txt file and prevent Google from indexing this URL?
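As a rough illustration of the robots.txt idea, the sketch below uses Python's standard `urllib.robotparser` to check what a rule would do before it is published. The rule itself (`Disallow: /wordpress/`) is an assumption, not something taken from the site's actual robots.txt, and the `build_parser` helper is hypothetical. Note that robots.txt governs crawling, so this only shows whether a crawler would be allowed to request each URL.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: block only the old /wordpress/ path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wordpress/
"""


def build_parser(robots_txt):
    """Parse robots.txt text supplied as a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

old_url = "http://www.mixupdate.com/wordpress/?p=6"
live_url = "http://www.mixupdate.com/entertainment/aisha-movie-review/"

# can_fetch() reports whether the given user agent may request the URL.
old_allowed = parser.can_fetch("Googlebot", old_url)
live_allowed = parser.can_fetch("Googlebot", live_url)
print(old_allowed, live_allowed)
```

Under this assumed rule the old /wordpress/ address is blocked while the review page itself stays crawlable, since the rule matches only the stale path.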
I can, but that page itself should be indexed. And as I said, the URL is generated automatically by Disqus, so how can I remove it manually?