
Site: search gives my site 17,000 pages but I know the site has 6,000 - what's happening?

Discussion in 'Search Engine Optimization' started by Mungo2007, May 6, 2011.

  1. #1
    Hi All

    We have an ongoing SEO problem and I am trying to narrow it down. In January, based on forum advice, we added a canonical tag to our site so that Google knew which pages to index. Before this, Google was indexing URL variants with parameters such as page numbers, sort order and session IDs, which caused a site: search to report 17,000 pages when we only have 6,000. We also told Google to ignore all of these parameters in Webmaster Tools. After four months we have seen a lot of these pages drop out of the index, but a site: search still shows 17,000 pages.
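For reference, the canonical tag described above is a single line in the page head that points every parameterised variant of a URL back at one preferred version. The URL and parameter names below are hypothetical examples, not taken from the actual site:

```
<!-- Served on /product.php?id=42&sort=price&page=3 and every other
     parameter variant, so all of them consolidate to one indexed URL: -->
<link rel="canonical" href="http://www.example.com/product.php?id=42" />
```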

    I have therefore been delving deeper. Our osCommerce setup is structured as follows: we have one admin login and one main site with all products, plus three satellite sites, each on its own domain, showing specified product sections. The satellite sites have been coded by our developer in different languages so that we can add a new product on the one admin page and have it appear on each site. We make sure each description is unique to avoid any nasty duplicate content issues. I am not sure whether this structure is affecting our SEO, even though the product descriptions and tags are different. Has anybody had similar experiences with this type of setup, and did it work for you?

    Does anybody know of a way to crawl these 17,000 pages to see what is happening? At the moment Google will only show you 1,000 results of a site: search.


    Any help much appreciated

    Duncan
     
    Mungo2007, May 6, 2011 IP
  2. cynia

    cynia Well-Known Member

    #2
    1st, find spider software and crawl your pages yourself.
    2nd, if you are using a CMS, bear in mind that most CMSs generate extra URLs - search pages, tag pages and so on - creating multiple pages for your site.
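A minimal spider along those lines can be sketched in Python using only the standard library. The tiny in-memory "site" at the bottom is an assumption for illustration (swap the fake fetcher for real HTTP requests to crawl a live site); note how a parameterised variant of the same page counts as a separate URL, which is exactly how the index count inflates:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, urldefrag

class LinkParser(HTMLParser):
    """Collects href values from <a> tags in an HTML page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(fetch, start_url):
    """Breadth-first crawl of one host; fetch(url) returns the HTML string.
    Returns the set of unique same-host URLs discovered."""
    host = urlparse(start_url).netloc
    seen, queue = {start_url}, [start_url]
    while queue:
        url = queue.pop(0)
        parser = LinkParser()
        parser.feed(fetch(url))
        for href in parser.links:
            absolute, _ = urldefrag(urljoin(url, href))  # drop #fragments
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return seen

# Hypothetical three-page site standing in for real HTTP fetches:
pages = {
    "http://example.com/": '<a href="/a.php">A</a> <a href="/a.php?sort=price">A sorted</a>',
    "http://example.com/a.php": '<a href="/">home</a>',
    "http://example.com/a.php?sort=price": '<a href="/">home</a>',
}
found = crawl(lambda u: pages.get(u, ""), "http://example.com/")
print(len(found))  # 3: the sort-order variant counts as its own page
```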


    That's all.
     
    cynia, May 6, 2011 IP
  3. TimHillSEO

    TimHillSEO Well-Known Member

    #3
    I had a similar issue - thousands more pages than there should have been - and here was my answer.

    I had a product page listing hundreds of offers, so the page was just one file: offers.php

    But offers.php listed 10 products per page, and there were dozens of ways to filter the search, for example:

    offers.php?pricelessthan=300&pricemorethan=200&screen=lcd

    Google was cycling through all the options it found in the search forms, and before you knew it, offers.php was over a million pages in the eyes of Google.

    Then Google told me (via Webmaster Tools) that I had duplicate meta tags on 500,000 pages! I fixed this by including some of the search terms in the meta tags. Google still thinks I have 1.5 million pages, but it is happy that each one is unique.
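The fix described above can be sketched as follows - building the page title (and likewise the meta description) from the active filter parameters, so every indexed variant carries distinct meta tags. This is Python rather than the site's actual PHP, and the parameter names are simply borrowed from the example URL above:

```python
from urllib.parse import parse_qs, urlparse

def title_for(url):
    """Derive a unique page title from the filter parameters in the URL,
    so each filtered variant of offers.php gets distinct meta tags."""
    params = parse_qs(urlparse(url).query)
    parts = []
    if "screen" in params:
        parts.append(f"{params['screen'][0].upper()} screen")
    if "pricemorethan" in params:
        parts.append(f"over ${params['pricemorethan'][0]}")
    if "pricelessthan" in params:
        parts.append(f"under ${params['pricelessthan'][0]}")
    suffix = " - " + ", ".join(parts) if parts else ""
    return f"Offers{suffix}"

print(title_for("offers.php?pricelessthan=300&pricemorethan=200&screen=lcd"))
# Offers - LCD screen, over $200, under $300
```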

    I don't know if this is your problem, but it's worth a look.
     
    TimHillSEO, May 6, 2011 IP
  4. SEOTranslator

    SEOTranslator Member

    #4
    The problem is well known and has been discussed in many forums. As others have mentioned, the same page gets indexed several times with different parameters.

    In addition, looking at how you've set up the main site and satellite sites, you may have inadvertently made an additional mistake: if the satellite sites live in subdirectories of the main site (a classical method) but your robots.txt file does not forbid access to those subdirectories, the pages get indexed twice - once for the main site and once for the satellite site. Change the main site's robots.txt file to forbid indexing of those directories; this will not prevent the indexing of the subsites, as the root of each satellite site is in the subdirectory itself.
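Assuming the satellite sites sit in subdirectories of the main site's document root (the directory names here are hypothetical), the main site's robots.txt would look something like this - the satellites' own robots.txt files, served from their own domains, are unaffected:

```
User-agent: *
Disallow: /satellite1/
Disallow: /satellite2/
Disallow: /satellite3/
```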

    The main problem with the duplicate indexing (both from the repeated parameters and from the subdirectories being indexed as if they were part of the main site) is that you have a duplicate content issue, and you are likely to be penalized for it.
     
    SEOTranslator, May 6, 2011 IP