Let me start off by saying that I am somewhat new to SEO. I don't clearly understand its specifics, nor do I really understand how to handle search engine bots when it comes to spidering dynamic pages.

My business partner and I launched http://www.polacynaswiecie.com two weeks ago as a private beta. We use Ajax extensively on the site, but if JavaScript is disabled we still let a potential user browse the contents of the website. Here is an example: on the home page you will notice a row of thumbnails representing the newest members. To the left and right of these thumbnails are arrows that let you browse the previous newest members. This is handled through Ajax, but if JavaScript is disabled on a user's machine, the user can still browse the newest members; the page reloads and appends "p1=1" to the URL, making the actual URL http://www.polacynaswiecie.com/?p1=1, then http://www.polacynaswiecie.com/?p1=2 for the second page, and so on.

Now the issue arises when Googlebot spiders the website. I noticed in Google's Webmaster Tools, as well as in Google itself, that it spiders the home page as many times as there are unique pages of newest members, and likewise for unique pages of newest photos. So given that there are 10 rows of newest members, the home page is spidered 10 different times, even though most of the content is identical apart from the info on those newest members. Is this good? If it is not good, how can I prevent Google from following certain links from a page only? Will a sitemap prevent that?

The second question is as follows. Should I allow Google access to all dynamic data, i.e. all user profiles, job postings, classified postings, business directory postings, and forum postings? And should I use the titles of such postings as the meta titles for those pages (they can be dynamically assigned), and the descriptions of these postings as the description meta data? If that is the correct thing to do, should I also change the site's display URLs from, say, index.php?pid=3 to friendly URLs such as index.php/this-is-a-title-of-a-posting?

Thank you for any help and suggestions.
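P.S. For the dynamic titles question, here is roughly what I am picturing. The get_posting() helper and the column names are made up, just to illustrate the idea:

    <?php
    // Sketch: reuse a posting's own title and description as the page's
    // meta title and meta description (helper and fields are hypothetical).
    $posting = get_posting((int) $_GET['pid']);
    $title = htmlspecialchars($posting['title']);
    $description = htmlspecialchars(substr($posting['description'], 0, 155));
    ?>
    <head>
      <title><?php echo $title; ?> - Polacy na Swiecie</title>
      <meta name="description" content="<?php echo $description; ?>" />
    </head>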
That sounds like a tricky question. I don't have a specific answer for you, but try some of this out: use your robots.txt file to disallow access to pages that are mostly duplicate content and not useful for searches. Make sure you are using Google Webmaster Tools and submit an accurate sitemap file. Monitor the pages that get indexed in Google and adjust your robots.txt and sitemap.xml accordingly. That should help you maintain a good index in Google and avoid duplicate content penalties.
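If it helps, a minimal sitemap.xml that lists only the canonical pages you want indexed looks something like this (the URL set here is just illustrative):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- List only the canonical version of the home page,
           not the ?p1=/?p2= variations. -->
      <url>
        <loc>http://www.polacynaswiecie.com/</loc>
        <changefreq>daily</changefreq>
      </url>
    </urlset>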
Hmm, I'm not really sure how to do this. The problem is not Google indexing pages I don't want it to index. The problem is Google indexing the home page as many times as there are rows of newest members and newest photos, because these are dynamic and can be browsed by appending ?p1=1 through ?p1=x to the index.php (home page) URL. So the content isn't entirely the same, since the names and locations of people differ, but everything else on the page is identical.
Has Google already indexed these 'additional' pages? Or is this something you are worried might happen in the future? If it isn't a problem yet, maybe Google is smart enough to figure it out on its own.

Update: I see it IS already a problem for you (the pages are indexed). You should be able to correct this with your robots.txt file.
That is going to be very difficult to do, and probably impossible as the site gains popularity. Currently there are 1,070 photos uploaded. With 9 photos per row displaying on the home page, there are actually 118 browsable instances of the home page, with URLs running from index.php?p2=0 through index.php?p2=118, and as users upload more pictures, more of these browsable pages are added. This is with only about 100 users. Imagine how quickly the number of browsable instances of the home page will grow with 100,000 members.
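You don't need to list every page. Google's crawler supports * wildcards in robots.txt (other bots may not, so treat this as a Google-specific sketch), so a couple of pattern rules should cover every paginated instance no matter how many there are:

    User-agent: Googlebot
    # Block every browsable instance of the home page that is
    # reached through the p1/p2 paging parameters.
    Disallow: /*?p1=
    Disallow: /*?p2=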
Ha, such a simple solution! But I have another question: do you think it would be a good idea to let the bot spider each instance, since each instance will contain different first names, last names, cities, countries, and sexes, and each photo will also have an album name and poster's name associated with it?
There's a chance that it could be beneficial. If you have enough unique info for each page, then go for it. Make sure the title tags are unique, and include as much unique text on each page as you can afford. Otherwise the pages will be seen as duplicate content and hence ignored.
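For example, even just numbering the paginated titles goes a long way. A quick sketch, assuming the page number comes in as the p1 parameter from the URLs above:

    <?php
    // Sketch: give each browsable instance of the home page its own
    // title, so page 3 of the newest members is not titled identically
    // to page 1 (parameter name assumed from the URLs above).
    $page = isset($_GET['p1']) ? (int) $_GET['p1'] : 0;
    $title = 'Newest members';
    if ($page > 0) {
        $title .= ' - page ' . ($page + 1);
    }
    ?>
    <title><?php echo htmlspecialchars($title); ?> - Polacy na Swiecie</title>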
So let's say there is some unique content, but also a majority of identical content. Will these pages only be ignored, or will we be penalized?