Googlebot weirdness... general SEO question

Discussion in 'Search Engine Optimization' started by 3magine, Jan 19, 2008.

  1. #1
Let me start off by saying that I am somewhat new to SEO and don't clearly understand its specifics, nor do I really understand how to handle search engine bots when it comes to spidering dynamic pages.

    My business partner and I launched http://www.polacynaswiecie.com two weeks ago as a private beta. We use AJAX extensively on the site, and if JavaScript is disabled, we still allow a potential user to browse the contents of the website.

    Here is an example: on the home page you will notice a row of thumbnails representing the newest members. To the left and right of these thumbnails are arrows that let you browse back through earlier newest members. This is enabled through AJAX, but if JavaScript is disabled on a user's machine, the user can still browse the newest members; the page reloads and appends "p1=1" to the URL, making it http://www.polacynaswiecie.com/?p1=1, then http://www.polacynaswiecie.com/?p1=2 for the second page, and so on.

    Now, the issue arises when Googlebot spiders the website. I noticed in Google's Webmaster Tools, as well as in Google itself, that it spiders the homepage as many times as there are unique pages of newest members, and again for unique pages of newest photos. So given that there are 10 pages of newest members, the homepage is spidered 10 different times, although most of the content is identical aside from the info on those newest members. Is this good? If it is not good, how can I prevent Google from following certain links on a specific page only? Will a sitemap prevent that?

    The second question is as follows: should I allow Google access to all dynamic data, i.e. all user profiles, job postings, classified postings, business directory postings, and forum postings? And should I use the titles of such postings as the meta titles for those pages (they can be dynamically assigned) and the descriptions of these postings as the meta descriptions?

    If that is the correct thing to do, should I also change the website's displayed URLs from, let's say, index.php?pid=3 to friendly URLs such as index.php/this-is-a-title-of-a-posting?
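    For what it's worth, a friendly URL like that is usually built from a "slug" of the posting title. Here is a minimal sketch in Python (the function name is made up, and your site runs PHP, so this only illustrates the idea):

```python
import re

def slugify(title):
    """Turn a posting title into a URL-friendly slug."""
    slug = title.lower()
    # Collapse any run of non-alphanumeric characters into a single hyphen
    slug = re.sub(r'[^a-z0-9]+', '-', slug)
    # Trim hyphens left over from leading/trailing punctuation
    return slug.strip('-')

print(slugify("This is a Title of a Posting!"))  # this-is-a-title-of-a-posting
```

    The slug is then mapped back to the numeric ID on the server side, so index.php?pid=3 and index.php/this-is-a-title-of-a-posting serve the same page.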

    Thank you for any help and suggestions.
     
    3magine, Jan 19, 2008 IP
  2. 3magine

    3magine Peon

    Messages:
    7
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #2
    anyone? :(
     
    3magine, Jan 19, 2008 IP
  3. 3magine

    #3
    morning bump
     
    3magine, Jan 21, 2008 IP
  4. BILZ

    BILZ Peon

    Messages:
    1,515
    Likes Received:
    62
    Best Answers:
    0
    Trophy Points:
    0
    #4
    That sounds like a tricky question. I don't have a specific answer for you, but try some of this out...

    You should try using your robots.txt file to disallow access to pages that are mostly duplicate content and not useful in search results. Make sure you are using Google Webmaster Tools, and submit an accurate sitemap file.

    Monitor the pages that get indexed in Google and adjust your robots.txt file and sitemap.xml accordingly.
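    As a concrete example, a minimal sitemap.xml that lists only the one canonical homepage (not the paginated ?p1=/?p2= variants) might look like this — the entries shown are illustrative, not your actual sitemap:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Only the canonical homepage, not its paginated variants -->
  <url>
    <loc>http://www.polacynaswiecie.com/</loc>
    <changefreq>daily</changefreq>
  </url>
  <!-- ...plus an entry for each real content page (profiles, postings, etc.) -->
</urlset>
```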

    That should help you maintain a good index in Google and prevent duplicate content penalties.
     
    BILZ, Jan 21, 2008 IP
  5. 3magine

    #5
    Hmmm, I'm not really sure how to do this. The problem is not Google indexing pages I don't want it to index; the problem is Google indexing the homepage as many times as there are pages of newest members and newest photos, because these are dynamic and can be browsed by appending ?p1=1 ... p1=x to the index.php (home page) URL. So the content isn't entirely the same, since the names and locations of the people differ, but everything else on the page is identical.
     
    3magine, Jan 21, 2008 IP
  6. BILZ

    #6
    Has Google already indexed these 'additional' pages? Or is this something you are worried might happen in the future? Because if it isn't a problem yet, maybe Google is smart enough to figure it out on its own.

    Update:
    I see it IS already a problem for you (indexed pages). You should be able to correct this with your robots.txt file.
     
    BILZ, Jan 21, 2008 IP
  7. 3magine

    #7
    This is going to be very difficult to do, and probably impossible as the site gains popularity. Currently there are 1,070 photos uploaded. With 9 photos per row displayed on the homepage, that means there are actually 119 browsable instances of the homepage, with URLs from index.php?p2=0 through index.php?p2=118, and as users upload more pictures, more of these browsable pages get added. This is with only about 100 users. Imagine how quickly the number of browsable instances of the homepage will grow with 100,000 members...
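    As a quick sanity check on those numbers (assuming 9 photos per paginated page):

```python
import math

photos = 1070
per_page = 9

# Number of paginated homepage instances a crawler can reach
pages = math.ceil(photos / per_page)
print(pages)  # 119, i.e. p2=0 through p2=118
```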
     
    3magine, Jan 21, 2008 IP
  8. BILZ

    #8
    Just disallow the paginated URLs (the ?p1= and ?p2= parameters) in the robots.txt file.
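    One caveat: robots.txt patterns aren't shell-style globs, and wildcard support isn't part of the original robots.txt standard, but Googlebot does honor the * wildcard. Based on the URLs quoted earlier in this thread, the rules would look something like:

```
User-agent: *
# Block the paginated homepage variants (?p1=x for members, ?p2=x for photos)
Disallow: /*?p1=
Disallow: /*?p2=
```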
     
    BILZ, Jan 21, 2008 IP
  9. 3magine

    #9
    HA, such a simple solution! But I have another question: do you think it would be a good idea to let the bot spider each instance, since each instance will contain different first names, last names, cities, countries, and sexes, and each photo will also have an album name and poster's name associated with it?
     
    3magine, Jan 21, 2008 IP
  10. BILZ

    #10
    There's a chance that it could be beneficial. If you have enough unique info for each page, then go for it. Make sure the title tags are unique, and put as much unique text on each page as you can afford.

    Otherwise the pages will be seen as duplicate content and hence ignored.
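    For example, a unique title tag for each profile page could be assembled from the unique fields already in the database. A sketch in Python with made-up field names (the same idea applies in PHP):

```python
def profile_title(first, last, city, country, site="PolacyNaSwiecie.com"):
    """Build a unique <title> string from profile fields."""
    return f"{first} {last} - {city}, {country} | {site}"

print(profile_title("Jan", "Kowalski", "Chicago", "USA"))
# Jan Kowalski - Chicago, USA | PolacyNaSwiecie.com
```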
     
    BILZ, Jan 21, 2008 IP
  11. 3magine

    #11
    So let's say there is some unique content but also a majority of identical content; will these pages just be ignored, or will we be penalized?
     
    3magine, Jan 21, 2008 IP