I have been running a blog for a year now and I noticed only 40% of the posts are indexed by Google. It's a PR3 blog, though.
The search engines don't typically index every page of a site unless it's small and well linked. 40% is not bad. The nature of blogs is that, over time, posts disappear from the home page and from the main category/tags/archives pages, and can only be found several clicks away from the home page. If the posts did not get links from other sites while they were on the home page, they typically fall out of the index because of a lack of inbound links. The same thing happens with forum posts.
Thanks for the info. Another question: when I run a site:mydomain.com query it says 2200 results, but when I browse through them there are only 850. Why is that?
Have you checked your robots.txt file? If it blocks spiders from reading your pages, your website content cannot be indexed by Google. That may be one of the reasons.
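If you want to rule robots.txt out quickly, something like the sketch below might help. It uses Python's standard urllib.robotparser and only approximates how a crawler reads the file. The robots.txt text, the example URLs and the is_allowed helper are made up for illustration, so paste in the contents of your own file and your own post URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; replace with the text of your own file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if these robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Example URLs are placeholders, not real pages.
    for url in (
        "http://mydomain.com/2009/01/some-post/",
        "http://mydomain.com/private/draft/",
    ):
        print(url, "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked")
```

If every post URL comes back as allowed, robots.txt is probably not what is keeping the posts out of the index.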
It's not an issue with robots.txt. From what I've figured out so far, I don't have enough backlinks, since Google seems to index blogs faster when they have a good number of backlinks.
If site: returns 2200 results then you should have 2200 indexed. When you browse through the results 10 or 100 at a time, Google typically only shows between 600 and 1000ish results. You have to put yourself in Google's shoes. I mean, how useful is it really for them to include 220 links to pages that each have 10 results so that you can see all 2200 results? Who searching at Google (other than maybe an SEO) is going to click through 220 pages of results? Definitely not the average user. It's a waste of their server resources to build those links. What if your site had 2,200,000 pages indexed? Would you expect them to allow you to scroll through 220,000 pages of results?
@canonical Thanks for the answer. I know Google only allows 1000 results for a specific query, but I was just wondering why I was not able to browse all the way to 1000 results for my site:domain.com query. Anyway, I appreciate your response!
One of my sites, asureimage.com, has 89 out of 91 pages indexed (97%). Like yours, it has a PR of 3. Most of those pages are articles not unlike your blog posts, and aren't linked directly from the site's homepage. It's nonsense to say it's typical for Google to index only 40% of a site. I suspect there are problems with the site structure that you can fix. Here are a few places you can start looking... Get yourself a Google (XML) sitemap. It's the best way to tell the major search engines about all the pages on your site, and a great way to inform Google that you've added new content. This free utility indexes your site and creates an XML sitemap for you. If the resulting sitemap is missing pages, this tells you there are pages that aren't linked (if it can't find them, neither can Google). Make sure your blog doesn't use different methods of linking to the same content. There should be a single unique URL for each page, otherwise Google may penalise your site for duplicate content. See Google's webmaster guidelines for lots more useful info direct from the horse's mouth.
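For anyone who would rather build the sitemap by hand, here is a rough sketch in Python of what such a file contains and how to spot unlinked pages. It is not the utility mentioned above. The build_sitemap and missing_pages helpers and the example URLs are made up for illustration, and it assumes you already have a list of your post URLs and a list of URLs that a crawl of your site found.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing each URL once."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in sorted(set(urls)):
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def missing_pages(known_urls, crawled_urls):
    """Pages you know exist but that a crawl of the site never reached."""
    return sorted(set(known_urls) - set(crawled_urls))


if __name__ == "__main__":
    # Placeholder URLs; substitute your own lists.
    known = ["http://mydomain.com/post-1/", "http://mydomain.com/post-2/"]
    crawled = ["http://mydomain.com/post-1/"]
    print(build_sitemap(crawled))
    print("Not reachable by links:", missing_pages(known, crawled))
```

Anything reported as not reachable is a page that no other page on the site links to, which is the kind of structural problem worth fixing first.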
^ I have about 635 posts, and about 290 of them are indexed. My other blogs, which are small and have little content (around 50 posts), are 100% indexed. Meanwhile, I am trying to clean duplicate content out of the Google index, which could result from tags etc., and seeing whether more of my blog's posts get indexed as a result. I will update this thread with results.
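One way to hunt for the "same content, several URLs" problem is to normalise your URLs and see which ones collapse together. The sketch below is only a rough heuristic: it assumes that www/non-www, letter case in the host, a trailing slash and a #fragment are the only differences that don't matter, which may not hold for every blog. The normalize and duplicate_groups helpers and the example URLs are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Reduce a URL to a comparison key (assumed rules, see comments)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # assumption: www and non-www serve the same page
    path = parts.path.rstrip("/") or "/"  # assumption: trailing slash is irrelevant
    # The query string is kept as-is; the fragment is dropped.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


def duplicate_groups(urls):
    """Map each normalised URL to the original URLs that collapse onto it."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


if __name__ == "__main__":
    # Placeholder URLs; substitute the URLs from your own index or sitemap.
    sample = [
        "http://www.mydomain.com/post-1/",
        "http://mydomain.com/post-1",
        "http://mydomain.com/post-2/",
    ]
    for key, found in duplicate_groups(sample).items():
        print(key, "<-", found)
```

Each group it prints is a candidate for picking one URL and linking only to that one. It won't catch tag or archive pages that repeat a post's text under a completely different path.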
Do you have many pages? If we are talking about a couple of hundred, Google should have indexed the whole site in 12 months, given that it is PageRank 3. If you want a specific post indexed, try linking to it in another blog post or on an external website. Also add a sitemap; if the navigation is bad, perhaps Google cannot find the post.
An XML sitemap should do the trick! Make a sitemap of your website. There are many websites that will generate a sitemap for your site for FREE! A sitemap will allow the spiders to index your website more easily.
I'm facing the same issue too. No, I don't have duplicate posts. I have a Blogger account and have added my feeds as the sitemap, since I can't add .xml files. Any solutions?