Too Many Indexed Pages

Discussion in 'Google' started by premierfabio, May 25, 2011.

  1. #1
    Google has indexed over 90,000 URLs from my site, and many of them are useless. Does anyone know a way to get them excluded from Google's index? A month ago I disallowed them in robots.txt and added a noindex tag to the pages, but most of them are still indexed. Is there any way to get them removed quickly?

    Someone suggested removing them in Google Webmaster Tools (GWT), but GWT says only pages returning 404 or 401 can be removed with that tool...

    Appreciate your time!
     
    premierfabio, May 25, 2011 IP
  2. newlogo

    newlogo Peon

    #2
    Hi, remove those pages from your server and redirect those links to your home page.
     
    newlogo, May 25, 2011 IP
  3. webcosmo

    webcosmo Notable Member

    #3
    You have to return a 404 from the server for those pages. Google will then take some time to deindex them.
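
For anyone wondering what "doing a 404" can look like in practice, here is a minimal sketch as a Python WSGI app. The path prefixes in `REMOVED_PREFIXES` and the app itself are made up for illustration; on a real site this would more likely be a rule in the web server or CMS configuration.

```python
# Hypothetical URL prefixes for the pages that should drop out of the index.
REMOVED_PREFIXES = ("/tag/", "/old/")


def app(environ, start_response):
    """Tiny WSGI app: answer 404 for removed sections, 200 for everything else."""
    path = environ.get("PATH_INFO", "/")
    if path.startswith(REMOVED_PREFIXES):
        # The status code is what matters to a crawler, not the page body.
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"OK\n"]


if __name__ == "__main__":
    # Local try-out only: serve on port 8000 with the stdlib reference server.
    from wsgiref.simple_server import make_server
    make_server("", 8000, app).serve_forever()
```

The important part is that the removed URLs return a real 404 status rather than a "not found" page served with a 200.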
     
    webcosmo, May 25, 2011 IP
  4. Jeff Collision

    Jeff Collision Peon

    #4
    You can either redirect the URLs or submit a URL removal request in Google Webmaster Tools. Those are the options for removing them.
     
    Jeff Collision, May 25, 2011 IP
  5. newbie191

    newbie191 Notable Member

    #5
    I am having a similar problem. I recently migrated from WordPress to Blogger, and while my blog was on WordPress a lot of duplicate pages were indexed in Google, e.g. mysite.com/tag/.
    Google Webmaster Tools is now detecting those pages slowly but steadily and deleting them. The pages show up on my crawl errors page in Webmaster Tools.
    @OP I suggest you use Google Webmaster Tools. It is very useful. But I think there is no way to manually remove these pages from the index. It will take time.
    I hope my SERP rankings will improve after all these pages are deleted from the index. Correct me if I am wrong :p
     
    newbie191, May 25, 2011 IP
  6. bogs

    bogs Active Member

    #6
    robots.txt is enough. You might just have missed something in it. Try running it through a validator.
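
If no online validator is handy, a rough local check is possible with Python's standard `urllib.robotparser`. This is only a sketch: the rules in `ROBOTS_TXT` and the example URLs are invented, and the stdlib parser does plain prefix matching, so it may not agree with Googlebot on wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; paste your own file here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
Disallow: /old/
"""


def blocked_urls(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    candidates = [
        "http://example.com/tag/seo",
        "http://example.com/about.html",
    ]
    for url in blocked_urls(ROBOTS_TXT, candidates):
        print("blocked:", url)
```

Keep in mind that a URL reported as blocked is one the crawler will not fetch at all, so it also cannot read a noindex tag on that page.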
     
    bogs, May 26, 2011 IP
  7. xcdear213

    xcdear213 Peon

    #7
    A robots.txt file works well as a crawl guide for search engines. If it does not work, stop linking to those useless pages from your site, so that robots have no links to crawl.
     
    xcdear213, May 26, 2011 IP
  8. Jbone

    Jbone Greenhorn

    #8
    A robots.txt file will not stop the pages from appearing in Google's index, because they have already been indexed.

    If you no longer want the pages, remove them from your server and 301 redirect each URL to the homepage or to a page with similar content.
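
A 301 redirect along those lines could be sketched like this as a small Python WSGI app. The `REDIRECTS` table and its paths are hypothetical; most sites would set this up in the web server configuration instead.

```python
# Hypothetical mapping from removed URLs to the closest surviving page.
REDIRECTS = {
    "/old/widgets.html": "/products/widgets.html",
    "/old/contact.html": "/",
}


def redirect_app(environ, start_response):
    """Send a permanent (301) redirect for retired URLs, 200 for the rest."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 tells crawlers the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"OK\n"]


if __name__ == "__main__":
    # Local try-out only: serve on port 8000 with the stdlib reference server.
    from wsgiref.simple_server import make_server
    make_server("", 8000, redirect_app).serve_forever()
```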

    It's not an issue having that many pages in the index anyway, because Google will probably consider most of them useless and put them in the supplemental index.
     
    Jbone, May 26, 2011 IP
  9. aileenwuornos

    aileenwuornos Peon

    #9
    You can remove those files from the server; otherwise robots.txt is sufficient to exclude URLs from search engines.
     
    aileenwuornos, May 26, 2011 IP
  10. gameutopia

    gameutopia Peon

    #10
    If your site is static (plain HTML pages), you can add a noindex tag to the header of each page. If it is dynamic and database driven, you can remove the posts and wait for Google to stop finding them; adding them to robots.txt is borderline not worth it. Once they are removed from the site or noindexed, Google will eventually drop them. It takes some time, especially if you have a lot of them. You can request removal in Webmaster Tools, but it is best if you also remove them completely from your hosting.
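
To double-check that a page really carries the noindex tag, something like the following sketch can help. It uses only Python's standard `html.parser`; the class and function names are made up for this example, and it only looks at `<meta name="robots">` / `<meta name="googlebot">` tags, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in robots/googlebot meta tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in ("robots", "googlebot"):
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, follow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(has_noindex(page))
```

The tag only has an effect once the crawler fetches the page again, so the page must not be blocked in robots.txt at the same time.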
     
    gameutopia, May 26, 2011 IP
  11. C.Rebecca

    C.Rebecca Active Member

    #11
    1. Remove those URLs from your website
    2. Return a 404 error for those URLs
    3. Use the Google removal tool
     
    C.Rebecca, May 26, 2011 IP
  12. turket2

    turket2 Greenhorn

    #12
    That will work. A simple dofollow after they have been indexed won't really help.
     
    turket2, Jun 13, 2011 IP