Hi guys. I have a medium-sized site, PR5, about 250k URLs, all linked, all with unique content, all in sitemaps, and my site is almost a year old. It does okay ranking-wise, I guess, so I am out of the sandbox. My Alexa rank is about 200,000. However, Google has not indexed all my pages, only a fraction in fact. The count goes up a little every day, but every so often it comes crashing down. Which is weird, because my actual content is growing by about 100-300 URLs a day (it's user-generated).

I thought I had fixed this problem with sitemaps, but when the count of indexed pages came crashing down again yesterday for the third time, it dawned on me that solving this problem has nothing to do with sitemaps. I am wondering if the problem is that I am on a shared, overloaded server and my crawl rate is set to "Normal". (As you know, Google Webmaster Tools lets you set the crawl speed.) Would jacking the setting up to "Faster" help Google crawl (and index) more pages? I'm hesitant because it might tip the server over the edge and get me suspended by my webhost. But maybe the risk is worth it...

Basically, what I am trying to figure out is: is there a link between crawl rate and pages indexed? Or is the number of pages indexed totally divorced from the crawl rate setting, and is the problem basically that I have a lame site without enough pagerank or trustrank or visitors, or whatever?

Advice appreciated from those who know.

nick
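p.s. Since the sitemap protocol caps each file at 50,000 URLs, mine are split into chunks under a sitemap index. In case it helps anyone reading, this is roughly the kind of thing I mean; it's only a simplified sketch with a placeholder domain and made-up filenames, not my actual code:

```python
# sketch: split a big URL list into sitemap files of <= 50,000 URLs each,
# plus a sitemap index that points to all of them.
# example.com and the filenames are placeholders.
from datetime import date
from xml.sax.saxutils import escape

MAX_URLS = 50_000  # protocol limit per sitemap file

def write_sitemaps(urls, base="http://example.com"):
    names = []
    for i in range(0, len(urls), MAX_URLS):
        name = f"sitemap-{i // MAX_URLS + 1}.xml"
        with open(name, "w") as f:
            f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
            f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
            for url in urls[i:i + MAX_URLS]:
                f.write(f"  <url><loc>{escape(url)}</loc></url>\n")
            f.write('</urlset>\n')
        names.append(name)
    # the index file is the one URL you actually submit to Google
    with open("sitemap-index.xml", "w") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for name in names:
            f.write(f"  <sitemap><loc>{base}/{name}</loc>"
                    f"<lastmod>{date.today().isoformat()}</lastmod></sitemap>\n")
        f.write('</sitemapindex>\n')
```

At 250k URLs that comes out to five files, and thanks to the index I only had to submit one URL to Webmaster Tools.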
Each site has a limit on the number of pages Google will index. The limit is based on the site's pagerank and the number of incoming links. To get more pages indexed, you need more high-quality links to your internal pages.
thanks for your responses, but i don't think that's all there is to it. why would my competitor, with the exact same pagerank, have 250k urls indexed while i have less than 10k? it can't just be about pagerank.
actually, my original post was too long. what i want to know is: IS THERE A LINK BETWEEN CRAWL SPEED AND PAGES INDEXED? yes or no?
Probably your competitor has stronger links. Try to add more strong links to your inner pages; that can help.
I don't think PR dictates how much of your site is crawled; I have a PR2 article directory and it has close to 10K pages cached. I would think it's more to do with backlinks and on-site SEO. If a lot of your pages look similar, they may not get into Google's index. How unique is the meta content on your site? (If you want a quick way to check, see the sketch below.)
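A rough sketch like this will flag groups of pages that share the same title and meta description. The urls.txt filename is made up; dump your URLs into it one per line:

```python
# rough sketch: flag groups of pages that share the same <title> and meta
# description. urls.txt is a made-up filename -- one URL per line.
import re
import urllib.request
from collections import defaultdict

title_re = re.compile(r"<title>(.*?)</title>", re.I | re.S)
# crude pattern: assumes name comes before content in the meta tag
desc_re = re.compile(
    r'<meta\s+name=["\']description["\']\s+content=["\'](.*?)["\']',
    re.I | re.S,
)

seen = defaultdict(list)
with open("urls.txt") as f:
    for url in (line.strip() for line in f):
        if not url:
            continue
        try:
            html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except Exception as e:
            print(f"fetch failed: {url} ({e})")
            continue
        title = title_re.search(html)
        desc = desc_re.search(html)
        key = (
            title.group(1).strip() if title else "",
            desc.group(1).strip() if desc else "",
        )
        seen[key].append(url)

# any key shared by more than one URL is a duplication candidate
for (title, desc), urls in seen.items():
    if len(urls) > 1:
        print(f"{len(urls)} pages share title={title!r}: {urls[:5]}")
```

It's crude and won't catch every meta tag variant, but if big clusters of pages turn out to share identical titles, you've found your duplicate-content problem.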
I think you are confusing toolbar pagerank with real pagerank. Your competitor might have the same toolbar pagerank as you, but your real pagerank will be a lot different. Toolbar PR is just an out-of-date estimate of your real PR, scaled down and stuck on a scale of 1 to 10.
pretty unique! not just the meta, the pages themselves are unique. i worked very hard on that, for three months. my pages are way, way, WAAAY more individualized than my competitor's.
a lot different? i don't see why it should be a lot different, unless he has been growing at a faster rate than me, which hasn't been the case. i have been getting new links, and so has he. i'd be very surprised if his pagerank has overtaken mine; in fact, i am hoping i will get a higher pagerank than him in the upcoming export. we do about equally in the serps, even though he has 250k indexed and i have less than 10k. he's getting more of the long tail though, resulting in higher traffic for him.
you mean, links from external sites to my deep internal pages? it's true, i don't have many of those. i have excellent incoming links from relevant content, but they all go to my home page. i guess that's something to work on. kinda hard though, because the inner pages aren't indexed. bit of a chicken-and-egg situation. i'll have to think of something clever. i guess it's either the deep links or the crawl rate, or both.
Other possible issues: does your server go down during a Google crawl, or do you have lots of dynamic URLs or session IDs? If you want to PM me with your URL and your competitor's URL, I will take a look for you.
yeah, i suspect my server isn't responding well. dynamic urls: yes, but they're rewritten in seo-friendly fashion. my server could well be creaking under the load, though the graphs google provides show that load times are getting faster. session ids: no.

that's mighty kind of you! i just might take you up on that. but i wouldn't want to trouble you unless i'm desperate, so i will first set the crawl rate to "faster" and see if it makes any difference. i am in the process of copying my stuff to a backup hosting account, just in case the faster googlebot onslaught is too much for my webhost to handle and i get booted off the server. if setting the crawl rate to "faster" doesn't help get my pages indexed, i will come begging for advice.

by the way, i just checked, and my competitor actually has lower pagerank than i do. he went from 5 to 4 during the last export, i guess; i hadn't noticed. but he now has 300k urls indexed. it's gotta be something other than pagerank in this particular case. anyway... i'll see if jacking up the crawl speed helps.

nick
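p.s. before i flip the switch, i'm going to time how the server holds up under a few parallel fetches. something like this rough sketch (example.com, the page paths, and the 8 workers are all placeholders/guesses, not my real setup):

```python
# rough sketch: time a few parallel fetches to see how the server copes
# before asking googlebot to crawl faster.
# example.com, the page paths, and 8 workers are placeholders/guesses.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URLS = [f"http://example.com/page/{i}" for i in range(1, 41)]  # sample pages

def fetch(url):
    start = time.time()
    try:
        urllib.request.urlopen(url, timeout=15).read()
        return time.time() - start
    except Exception:
        return None  # a timeout or error counts as a failure

with ThreadPoolExecutor(max_workers=8) as pool:  # ~8 concurrent "crawlers"
    times = list(pool.map(fetch, URLS))

ok = [t for t in times if t is not None]
if ok:
    print(f"{len(ok)}/{len(URLS)} succeeded, "
          f"avg {sum(ok) / len(ok):.2f}s, worst {max(ok):.2f}s")
else:
    print("all requests failed")
```

if the worst-case times are already several seconds on a quiet day, i'll take that as a sign the shared server can't handle "faster" and move to the backup host first.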
Maybe he has lower PR on his homepage but thousands of inbound links to his category pages and other internal pages? That helps get pages indexed.
Google changes the crawl rate for my site every 3 months. They allowed me to switch to the faster speed after the last update and clearly indicated that on July 15th my rate would go back to "normal". That told me they would do a major update on July 15th, which they eventually did (a backlinks update). Now my rate is back to normal.
How long does it take Google to re-cache a newly launched site? I have been making a lot of changes and want them reflected in the search results.