Google and Bing are indexing fake links. Please help

Discussion in 'Search Engine Optimization' started by FutureKing, Aug 24, 2013.

  1. #1
    Hi, Google and Bing are both indexing fake links from my client's website. These links don't even exist, and the search engines are not indexing the real links.

    Please help.

    See the screenshot below:

    [screenshot 1]
    [screenshot 2]
     
    FutureKing, Aug 24, 2013 IP
  2. The SEO Man

    The SEO Man Well-Known Member

    Messages:
    448
    Likes Received:
    11
    Best Answers:
    1
    Trophy Points:
    140
    #2
    It is most likely a problem with the .htaccess file and improper URL rewrite settings, combined with a website that was not properly developed, has hidden code errors, was improperly restored from a backup during a redesign/redevelopment process, and/or has extra unneeded data in the database.

    There are too many possibilities to account for everything, but from what I see in the screenshots, if I were a gambling man, I would put my money on something I listed above.
     
    The SEO Man, Aug 24, 2013 IP
  3. patco

    patco Well-Known Member

    Messages:
    2,035
    Likes Received:
    47
    Best Answers:
    17
    Trophy Points:
    100
    #3
    Could a robots.txt file be doing this? You can STOP those links from being indexed with such a file, or with the Rewrite settings, as The SEO Man said! :)
     
    patco, Aug 24, 2013 IP
  4. Arick unirow

    Arick unirow Acclaimed Member

    Messages:
    719
    Likes Received:
    298
    Best Answers:
    30
    Trophy Points:
    500
    #4
    There is nothing wrong with that. Bing, Google, and the other search engines are working as expected.
    Here is why it happened:
    1. The domain was filled with good content in the past.
    2. That content is no longer available. I checked the links manually, and it seems the content has been removed or the links have changed.
    3. Because the content has changed, anyone, including bots (Bing/Google), gets a 404 page.
      I see your site has a policy of not indexing 404 pages, using 'noindex' and 'nofollow', which is a very good way to handle nonexistent links.
    4. The error (not all nonexistent links being removed) happens because your robots.txt stops the site from working properly. If those bad rules were not in robots.txt, all of your nonexistent links would disappear within a few days.
    Don't block the URLs using robots.txt. If you really want to block them with robots.txt, make sure to remove them from WMT first.
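    The 'noindex'/'nofollow' behaviour described above can be sketched in a few lines. This is a hypothetical illustration only (the client's actual theme and stack are unknown, and `KNOWN_PAGES` is an invented stand-in for real routing): a 404 response that also carries an explicit noindex signal.

    ```python
    # Hypothetical sketch: a WSGI app that serves a 404 with an explicit
    # noindex/nofollow signal, as the post describes the site doing.
    # KNOWN_PAGES stands in for the real site's routing; names are invented.
    KNOWN_PAGES = {"/": b"<html><body>Home</body></html>"}

    def app(environ, start_response):
        path = environ.get("PATH_INFO", "/")
        if path in KNOWN_PAGES:
            start_response("200 OK", [("Content-Type", "text/html")])
            return [KNOWN_PAGES[path]]
        # Removed/nonexistent URL: 404 plus noindex, so a crawler that is
        # ALLOWED to fetch the page learns it should drop the URL.
        start_response("404 Not Found", [
            ("Content-Type", "text/html"),
            ("X-Robots-Tag", "noindex, nofollow"),
        ])
        return [b"<html><head>"
                b'<meta name="robots" content="noindex, nofollow">'
                b"</head><body>Page not found</body></html>"]
    ```

    The key point is that a crawler only ever sees this 404/noindex response if robots.txt lets it request the URL in the first place.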

    Why is it not recommended to use only robots.txt? Because when a URL is blocked by robots.txt, bots cannot crawl it, and the search result only shows 'description not available'.

    Why did I say your site handles nonexistent links pretty well?
    Because your site automatically tells bots that a nonexistent page is no longer available and that they should not come back to index it.
    How does the site do that? By implementing 'noindex' and 'nofollow'.
    This means the URL will be removed from the search engine in the next update (three days to a few weeks). However, your site cannot tell bots that when the bots are being stopped by robots.txt.
    In your case, the site is already pretty good at handling dead or nonexistent links; your robots.txt, however, defeats that purpose.
    This is why you should use robots.txt carefully.
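    The interplay described above, where robots.txt blocking prevents bots from ever seeing the 404/noindex, can be sketched as a toy crawl decision. Everything here is illustrative (the function names and site state are invented), not any real crawler's logic:

    ```python
    # Toy model of the crawl/index decision described above.
    # All names and rules are illustrative, not real crawler internals.

    def index_decision(path, is_blocked, fetch):
        """Return what a search engine would keep for this URL."""
        if is_blocked(path):
            # The bot never fetches the page, so it never sees the 404/noindex;
            # the stale URL lingers as "description not available".
            return "kept: description not available"
        status, noindex = fetch(path)
        if status == 404 or noindex:
            return "dropped from index"
        return "indexed normally"

    # Hypothetical site state: these pages were removed and now 404 with noindex.
    removed = {"/search/vector/old", "/search/psd/old"}

    def fetch(path):
        return (404, True) if path in removed else (200, False)

    def blocked_by_robots(path):
        return "vector" in path  # stands in for "Disallow: *vector"

    print(index_decision("/search/vector/old", blocked_by_robots, fetch))
    # kept: description not available
    print(index_decision("/search/psd/old", blocked_by_robots, fetch))
    # dropped from index
    ```

    The blocked URL is the one that stays in the index, exactly the symptom shown in the screenshots.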

    Now, you may ask: why were some nonexistent pages removed from the search engines while others were not (showing 'description not available')?
    The answer is easy:
    Your site (theme) is working very well. When a bot checks such a link, your site says, "This page is no longer available; please don't show it in search results again." This is why almost all nonexistent URLs were removed from the SERPs.
    However, your robots.txt broke that mechanism for some URLs.
    Your robots.txt blocks search engines from checking the few URLs that should be removed from the index.
    Here is the content of robots.txt:
    User-agent: *
    Disallow: /search/image/newest/q/
    Disallow: /search/psd/newest/q/
    Disallow: /search/vector/newest/q/
    Disallow: /search/psd/popular_30days
    Disallow: /search/vector/popular_30days
    Disallow: /search/all/newest/q
    Disallow: *image
    Disallow: *psd
    Disallow: *vector
    Disallow: *all
    Code (markup):
    Only links containing the words 'image', 'psd', 'vector', and 'all' are affected.
    So look carefully: there are 'vector' links in your screenshot. This means Bing/Google is working as expected.
    The problem is with your site. Just remove the rules that disallow search engines from opening those links, and the problem will be gone. How? Simple: once bots are allowed to crawl a removed page, the site tells them "this page has been removed, please don't index it". A simple solution, highly recommended for dealing with dead/nonexistent links.
    I should warn you that your robots.txt may prevent some, if not all, of your posts from being indexed. I recommend removing those rules. Your site (theme) is already smart enough to tell search engines not to index dead/nonexistent links and to index only the good ones.
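    To see how broadly those wildcard rules match, here is a minimal sketch of Google-style robots.txt pattern matching, where '*' matches any sequence of characters and a rule applies from the start of the path. This is an illustration, not a full robots.txt parser:

    ```python
    import re

    # Disallow patterns copied from the robots.txt quoted above.
    DISALLOW = ["*image", "*psd", "*vector", "*all"]

    def rule_to_regex(rule):
        # '*' matches any sequence of characters; everything else is literal.
        # The rule matches as a prefix pattern anchored at the path start.
        parts = [re.escape(p) for p in rule.split("*")]
        return re.compile("^" + ".*".join(parts))

    RULES = [rule_to_regex(r) for r in DISALLOW]

    def is_blocked(path):
        return any(r.match(path) for r in RULES)

    print(is_blocked("/search/vector/newest/q/flower"))  # True: matches *vector
    print(is_blocked("/blog/my-first-post"))             # False
    # The danger of broad wildcards: unrelated URLs get caught too,
    # because "install" contains "all".
    print(is_blocked("/install-guide"))                  # True: matches *all
    ```

    This is why rules like `Disallow: *all` can keep legitimate posts out of the index, not just the dead search URLs.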

    In conclusion:
    1. Removing links by using 'noindex' on the site is recommended (just remove the rules that stop the bots from checking the URLs).
    2. Remove the links using the link-removal tool in WMT. This is the easiest solution; however, other bots (from the many other search engines) will face the same problem as long as those robots.txt rules are still there.
    The best solution would be:
    1. Remove the bad rules from robots.txt.
    2. Remove the links in WMT.
    By doing that, your 'fake links' will disappear quickly, and other bots (there are many other search engines out there) will remove the links from their indexes too.
     
    Last edited: Aug 24, 2013
    Arick unirow, Aug 24, 2013 IP