Today I came across a webpage that said the crawl-to-index ratio should be 3:1, so I want to ask: is there any difference between indexed and crawled pages, or are they the same? If they are different, what is a good ratio and how do I improve it?
Crawled means the spider has scanned your page. Indexed means the spider placed the page into the index (a.k.a. database). I would not worry about the ratio; instead, make sure your server structure, sitemap implementation, and backlinks to the various pages of your site are done properly, to ensure as many pages as possible are crawled and indexed.
I always thought it was funny that de-indexed (punished) pages/sites were still crawled regularly. It's almost as if they want the webmaster to keep their hopes up seeing all the spiders. On the other hand, I guess they have to keep crawling them to see if they are worthy of re-indexing. John
It might help to understand spiders a bit more. The spiders crawl servers, read the pages on the server, and "happen" to follow the links they find. I say "happen" because it is not a given that they will or can follow the links. A broken link won't get followed. A page with major coding flaws typically won't be read. Hope this helps!
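The link-discovery step described above can be sketched as a toy example. This is only an illustration of how a spider harvests anchors from a fetched page; a real crawler also honors robots.txt, nofollow, and redirects, none of which are shown here:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect href targets from <a> tags, the way a spider discovers new pages."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                # An anchor with a missing or empty href is a link the
                # spider simply cannot follow.
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    """Return the followable link targets found in an HTML snippet."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links
```

For example, `extract_links('<a href="/about">About</a><a>no target</a>')` yields only `/about`; the second anchor, having no href, is dropped, which is the code-level version of "a broken link won't get followed."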
Thanks for the guidance. Can you tell me how to check for crawled pages apart from server stats, the way we can check for indexed pages in Google?
My favorite ratio is 100%. And that is very easy if you have a one-page site. If you have a homepage and two sub pages, your homepage will share its importance with those sub pages; it gets split up, divided by two. So the two sub pages will be much less important than your homepage, and a bit less likely to be indexed, but it's still attainable if you have some links to your homepage. The more sub pages you have, the more the importance from your homepage gets divided among your sub pages. Make sense? So any stated ratio without knowing the total pages is meaningless. Like I said, I get 100% all the time, but that sounds like BS until you find out that my sites have only one page. Bompa
Once a crawler visits a page, it follows the links and the sitemap through to the other pages. If your site is indexed by a search engine, you can be fairly sure that all of the pages were crawled, except nofollow links and pages disallowed by robots.txt. Sorry, but I see no reason to need to know which pages were crawled. However, you can look for a traffic tracker if your server stats are not enough for your needs.
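If you do want to see which pages were crawled from your server stats, the raw access log already contains that information. Here is a minimal sketch, assuming the common Apache/Nginx combined log format and treating any hit whose user-agent contains "Googlebot" as a crawler visit (a real check should also verify the requesting IP via reverse DNS, since the user-agent string is trivially spoofed):

```python
import re
from collections import Counter

# Matches the request, status, and user-agent fields of a combined-format log line.
LOG_LINE = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def crawled_paths(log_lines, bot="Googlebot"):
    """Count how often each URL path was requested by the given bot."""
    hits = Counter()
    for line in log_lines:
        m = LOG_LINE.search(line)
        if m and bot in m.group("agent"):
            hits[m.group("path")] += 1
    return hits
```

Feed it the lines of your access log and you get a per-URL crawl count, which you can then compare against what `site:yourdomain.com` shows as indexed.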
When a spider visits a webpage, that is called crawling, but it is not necessary that a crawled page gets indexed. You can see the indexed pages with site:domain.com. Even if your control panel shows the bot visiting your website regularly, the indexed pages only show up after some time.