Hello, I've been asked by a friend to help with a new website that contains more than 100,000 pages. Since the site is new, it hasn't been fully indexed yet, and at the rate Googlebot is crawling, it will take a year or two to finish crawling the whole site, even if no new pages are added. I've tried the sitemaps option, but it's still too slow, and since the site is new, there is no option in Webmaster Tools to set the crawl speed manually. Is there a way to speed up crawling?
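Worth noting: the sitemaps.org protocol caps a single sitemap file at 50,000 URLs, so a 100,000+ page site needs several sitemap files tied together by a sitemap index, and the index is what gets submitted. Here's a rough sketch of how that split could be generated - the domain, file names and all_urls list are placeholders, not anything from the actual site:

# Split a large URL list into sitemap files of at most 50,000 URLs each,
# then write a sitemap index that points to them (sitemaps.org protocol).
from xml.sax.saxutils import escape

MAX_URLS = 50000  # protocol limit per sitemap file

def write_sitemaps(all_urls, base="https://example.com"):
    index_entries = []
    for i in range(0, len(all_urls), MAX_URLS):
        chunk = all_urls[i:i + MAX_URLS]
        name = f"sitemap-{i // MAX_URLS + 1}.xml"
        with open(name, "w", encoding="utf-8") as f:
            f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
            f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
            for url in chunk:
                f.write(f"  <url><loc>{escape(url)}</loc></url>\n")
            f.write("</urlset>\n")
        index_entries.append(f"{base}/{name}")

    # The index file is the one to submit in Webmaster Tools.
    with open("sitemap-index.xml", "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for loc in index_entries:
            f.write(f"  <sitemap><loc>{escape(loc)}</loc></sitemap>\n")
        f.write("</sitemapindex>\n")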
I'd love an answer too. I recently did a site with about 17,000 pages, but due to poor crawl rates I was forced to nofollow a lot of it, and it's down to about 800 pages now. You can raise the crawl speed in Google Webmaster Tools, but Google doesn't seem to obey it and still crawls really slowly. Any ideas? What's best practice here? I typically deal with sites of no more than 1-2k pages, so this is a little out of my depth...
The best approach for getting a large website indexed is social bookmarking on sites like Digg, Delicious, etc. - pages bookmarked there get picked up by Google within 1-2 days. Hope that helps.
@dimitar christoff It's up to you; tweeting the links is good too. @vmxer You can do it category by category: suppose you have an xyz category with 100 pages, just bookmark that category's URL.
Great - so which social bookmarking services would you recommend for a site that offers a service (domestic / office cleaning)? I had a look around Digg, StumbleUpon etc. and they don't have an appropriate category that can take my links! And no, I don't plan on bookmarking 16k pages, but some deep links to pages Google has failed to reach yet would help matters.
Do directory submissions for your site, and when you fill in the details, use the sitemap URL as the listing URL - that can be handy too.
Interesting. I presume you mean an HTML version of the sitemap - but won't that just add weight to that one local page, which then links to the inner pages, rather than to the inner pages themselves?
Iterations... For instance, I have a 'small' site with about 30 services or so. These services are applied across 1000+ postcodes (which are also broken down into postal towns and counties), all added as iterations of the URLs, titles, H-tags and body text, and we end up with 30k pages and more...
Is there an easy way to create the pages - an automated script, perhaps? I like the idea; I had a great idea for a city/state/town script. Also, what is the site? I'd like to take a look.
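For what it's worth, something like this rough sketch could stamp out those service × location iterations - the service names, postcode data and URL/title patterns below are made up for illustration, not the actual site's structure:

# Generate one page entry per (service, postcode) combination, varying the
# URL slug, title and H1 so each page is distinct. Placeholder data only.
from itertools import product

services = ["carpet-cleaning", "office-cleaning", "end-of-tenancy-cleaning"]
postcodes = [("SW1", "Westminster", "Greater London"),
             ("M1", "Manchester", "Greater Manchester")]

def build_pages():
    pages = []
    for service, (code, town, county) in product(services, postcodes):
        label = service.replace("-", " ").title()
        pages.append({
            "url": f"/{service}/{code.lower()}/",
            "title": f"{label} in {town} ({code}), {county}",
            "h1": f"{label} - {town}, {county}",
        })
    return pages

if __name__ == "__main__":
    for page in build_pages():
        print(page["url"], "-", page["title"])

Feeding the output into whatever templating the site already uses is the easy part; the hard part, as discussed below, is keeping the generated titles, descriptions and body text genuinely different so the pages don't look like duplicates.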
Sure, it's www.cleaning-4u.co.uk - I have disabled the subpages for postcodes and nofollowed everything below county level now, to reduce the sitemap to 800 pages for the time being (Google has currently spidered only 225 of them; before that it had spidered 1,200 of 17,000 with all postal areas... sigh).
I looked at your site, fragged - was that an old Quake site? I like the MooTools stuff. I bookmarked your site; it's a good one. I checked out the cleaning site and it's PR0 - is it new? You can set the crawl rate in Google Webmaster Tools.
Yeah, it was an old Quake site; I bought the domain 10 years ago when I was playing Quake 1 - and yes, MooTools is the path to joy, I do like coding for it. The cleaning site is a new domain, yeah. It's not about the crawl rate - Google has been to the site this many times so far:

[root@s15272300.onlinehome-server.info] /www/cleaning-4u.co.uk/statistics/logs > grep -c Googlebot access_log.processed
32152

Yet it still has only 225 indexed pages out of the sitemap (which is now reduced to 800 URLs)... go figure, really pissing me off...
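One way to make sense of the gap between 32,000+ Googlebot hits and 225 indexed pages is to check which URLs the bot is actually fetching - it may be re-crawling a handful of pages rather than going deep. A rough sketch, assuming a standard combined-format access log with the file name shown above:

# Count Googlebot requests per URL in a combined-format access log to see
# whether the bot is spread across the site or stuck re-fetching a few pages.
from collections import Counter

def googlebot_hits(log_path="access_log.processed", top=20):
    hits = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            if "Googlebot" not in line:
                continue
            parts = line.split('"')
            if len(parts) > 1:
                request = parts[1].split()  # e.g. 'GET /some/page/ HTTP/1.1'
                if len(request) >= 2:
                    hits[request[1]] += 1
    return hits.most_common(top)

if __name__ == "__main__":
    for url, count in googlebot_hits():
        print(f"{count:6d}  {url}")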
Was wondering the same thing myself. Can we get a link to the actual site, please? Are you able to break it down into subsites? I'm wondering whether breaking it down into subsites and running an offsite sitemap indexer on each separate subsite would be helpful.
Cool, I'm an old Quaker too - had a site that got over 1 million hits in its first year. A few things about Google crawling big sites: the age of the site is a big factor. The second thing is advertisements - too many and you're screwed. Off-site links pointing at internal pages will get the unspidered pages indexed. Content is also one of the biggest factors: have the same content on a lot of pages and boom, no indexing. Page titles and descriptions have to be different and relevant. PageRank and traffic also affect indexing - a PageRank 0 site will not have that many pages indexed. Lastly, if you run a sitemap, check that you don't get any parsing errors.

From one old Quaker to the next, I hope this helps. Carmack rules! Nothing like Quake 1, 2 or 3; 4 was OK but required too much of a machine. Need anything else, just let me know - the best way to contact me is via the website in my sig. See ya. P.S. My site had over 500 pages indexed; I took it down to 18. Bigger is not always better - too much PR gets distributed.
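On the parsing-errors point, a quick local sanity check of a sitemap file before submitting it might look something like this - a rough sketch, assuming a standard sitemaps.org urlset file named sitemap.xml in the current directory:

# Parse a sitemap locally to catch XML errors and obvious problems
# (duplicate or empty <loc> entries, over-limit files) before submitting it.
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def check_sitemap(path="sitemap.xml"):
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError as err:
        print(f"Parsing error: {err}")
        return
    locs = [el.text.strip() for el in root.iter(f"{NS}loc") if el.text]
    print(f"{len(locs)} URLs found")
    dupes = len(locs) - len(set(locs))
    if dupes:
        print(f"{dupes} duplicate URLs")
    if len(locs) > 50000:
        print("Over the 50,000-URL limit for a single sitemap file")

if __name__ == "__main__":
    check_sitemap()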
Cheers for the advice - and yes, I do try to vary content: the site description is dynamic and different, so are the keywords for most services, and the locations always go into them and vary them, etc. By the way, try www.quakelive.com - it's Quake 3 redone, and it runs in a browser now! Open beta at the moment.
I have to look into helping my own site, but my main problem is that my pages are being slotted into the supplemental index.