Hi, I am preparing a traffic analysis report. On the first page I need 5-6 values, like pageviews etc. Can you suggest which stats are really important, with links justifying the choice? Thanks.
IMO Google Analytics shows the most important values on its first snapshot, so you can use those values.
After many years of use, comparing now and then with my Google Analytics data, and a look at all the other known analyzers, I still prefer my updated/modified version of webalizer ( angolizer ). Besides being mega fast on huge log files, it gives me the most human-useful output I need as a webmaster. Besides that, it's free for those whose AdSense income is too low to pay for quality tools. Since the newest project to advance this fine tool is now hosted on code.google, I hope that new dynamic professional features may become available on a regular basis. One of the recently added features, for example, auto-creates a TOP list of the most-visited URLs in validated RSS 2.0 format ( see my example http://www.kriyayoga.com/rss/hot.rss ), which can be used as a promo tool on any site to display a "hot/top" list of your most popular files ... AdSense experts know how to use such SSI feeds to spice UP boring pages: the displayed top URLs change over the course of weeks, bringing CHANGING and interesting keywords and therefore changing AdSense ads ... Since this topic recurs every few days or weeks, I just spent an hour or so going through a bunch of links to various projects - just to see that I still love the updated version of webalizer best.
My question was never related to Google Analytics, AWStats or any other tool. I was referring to the top 5-6 items which would give you an overview.
OK, got your point - here are the data that matter most in a log analysis, and why:
Response Code: server security and proper functioning. How:
Code 404 - Page NOT found: if that error count is greater than 1% of your overall HITS, then YOU may have a problem, like wrong links, missing files or images, or pages you have moved/renamed that are still linked from outside your site. ( I currently have 0.136% 404 ) A small number of 404s will always exist due to malfunctioning BROWSERS or mistyped URLs, as you can verify manually in your access and error logs. That number normally is BELOW 1%! One additional source of 404s may be hotlinked pictures with incomplete URLs - it seems that some social network sites limit URL length, and a cut-off URL results in a 404. I had tens of thousands of such 404s per month a while ago.
Code 403 - Forbidden: these errors most likely come from hackers trying to access forbidden pages, using wrong passwords, etc. This should be considered an alarm and verified visually in your access/error logs to make sure NO successful login has followed a series of failed logins!
Code 500 - Internal Server Error: this server response almost always means you have a fatal error in your server configuration, or more likely in an .htaccess file. The smaller the number, the more likely it is a local .htaccess in a low-traffic sub/sub/folder; the higher the number, the higher up in the folder hierarchy the misconfigured .htaccess file sits. A configuration error in the root / top-level .htaccess would result in ALL surf / pageview attempts giving a 500 error. 500 means NO pages served to / viewed by visitors = disappointed site visitors. Unless you created a temporary 500 for a few seconds or minutes during your own configuration attempts, you should always research such errors - which folder, which error - until 500 disappears entirely from your logs.
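As a rough illustration of the response-code checks above, here is a minimal Python sketch. It assumes an access log in Apache Common/Combined Log Format; the function names and the 1% default are only illustrative, taken from the rule of thumb in this post, not from any particular stats tool.

```python
import re
from collections import Counter

# Assumption: Common/Combined Log Format, where the 3-digit status code
# directly follows the quoted request, e.g. "GET / HTTP/1.1" 200 512
STATUS_RE = re.compile(r'" (\d{3}) ')

def status_shares(lines):
    """Return {status_code: percent_of_all_hits} for the given log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {code: 100.0 * n / total for code, n in counts.items()}

def rule_of_thumb_notes(shares, limit_404=1.0):
    """Apply the rules of thumb from the post: 404 above ~1%, any 403, any 500."""
    notes = []
    if shares.get("404", 0.0) > limit_404:
        notes.append("404 share above %.1f%%: look for broken or moved links" % limit_404)
    if "403" in shares:
        notes.append("403 present: check the logs for series of failed logins")
    if "500" in shares:
        notes.append("500 present: check server config and .htaccess files")
    return notes
```

To try it on a real log, something like `rule_of_thumb_notes(status_shares(open("access.log")))` should work, where `access.log` is a placeholder for your own log path. The notes are only hints; the raw log still has to be read by hand.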
Daily usage (often shown as a graph) - usage / abuse of server: if you look at the monthly overview and see a single day MUCH higher than normal, that means someone was hotlinking or clickbombing, or bots or download software were looping repeatedly, etc. Research the log of that day and, if needed, block that agent or IP to avoid further abuse or server overload.
Hourly usage: the same applies. Over the monthly stats you see a pattern, and any extreme deviation means something: much lower than normal = server down over an extended period of time; much higher than normal = server abuse, as mentioned above for daily usage.
Total Referrers: referrers show you where your traffic comes from, but they also show you abuse by hotlinkers. Normal would be that your preferred SE brings most traffic; typically that might be Google, then Yahoo, Live, Ask.com etc. If you find referrers totally different from the MAJOR SEs - such as myspace.com, hi5.com, orkut.com, wikispaces.com, blogspot.com, spaces.live.com, multiply.com, blogfa.com, subdomains/profiles of stumbleupon.com OTHER than "www.stumbleupon.com/refer.php", or forums / other websites like new social networks, etc - most likely these are not real referrers but hotlinked pictures/graphics abused by members of those sites. You may want to hotlink-protect your images accordingly, as such abuse can very easily grow without limit into GB or tens of GB of monthly server load. Verify all such large-volume referrers by clicking on the referrer link in your stats to see the exact problem/cause of the referrer traffic.
Total Usernames: some stats have a separate printout of user names. This may show you access to admin or member sections by possible hackers, if one username has an excessive number of accesses/logins on your site.
User Agents: shows bots and browsers - including bots that overload your server without bringing any results.
Look for unknown bots OTHER than the known and usually justified bots and newsfeed bots like Googlebot, Googlebot-Image, Mediapartners-Google, Feedfetcher-Google, msnbot-media, Yahoo! Slurp, YahooSeeker, Ask Jeeves/Teoma, askpeter_bot, ia_archiver.
Other bots creating high traffic: a bot loads the full page in most cases. Most "friendly" bots leave a signature with their homepage in their log entry. Have a look at their homepage/project and decide whether that project is worth the extra load. Some bots other than those listed above may create many thousands of page loads per month for no benefit at all. Some agents show that people tried to download the full site - like wget, Offline Navigator or others. Even if you would allow such downloads of entire sites, keep in mind that nowadays people looking for easy money may download large parts of your site, set them up on a free hosting account, social network or picture-sharing host, then apply for AdSense and earn money with your stolen content ... It is also quite common that such site-mirroring tools loop at some point of your site by mistake and create tens of thousands of connections within hours.
Entry pages - Total top URLs served: these data may show you which pages are most frequently entered or used, but they may also show you abuse by automated traffic engines. Example: about a year ago I noticed that 4 of my blog posts suddenly ranked very high in the TOP URLs listed by my stats. Knowing those topics, it was clear that in NO way could these posts be attractive enough to suddenly rank TOP 10-20 on my site. Analysis of the raw log details showed that a network of Chinese bots, all with one and the same agent, repeatedly surfed just those 4 pages, causing many tens of thousands of page views per URL per month. Action: deny access to that agent, or collect the LARGE number of Chinese IPs used by that agent and block those IPs or networks with iptables.
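The user-agent check described above can be sketched in a few lines of Python. This assumes Combined Log Format, where the user agent is the last quoted field on each line; `KNOWN_BOTS` is just a hand-picked subset of the names listed in this post, and `min_hits` is an arbitrary illustrative threshold.

```python
import re
from collections import Counter

# Substrings of crawlers the post treats as known and usually justified.
KNOWN_BOTS = ("Googlebot", "Mediapartners-Google", "Feedfetcher-Google",
              "msnbot", "Slurp", "YahooSeeker", "Teoma", "ia_archiver")

# Assumption: Combined Log Format, user agent = last quoted field on the line.
AGENT_RE = re.compile(r'"([^"]*)"\s*$')

def heavy_unknown_agents(lines, min_hits=1000):
    """List (agent, hits) for agents with at least min_hits requests
    that do not match any name in KNOWN_BOTS, busiest first."""
    counts = Counter()
    for line in lines:
        match = AGENT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return [(agent, hits) for agent, hits in counts.most_common()
            if hits >= min_hits
            and not any(bot in agent for bot in KNOWN_BOTS)]
```

Note that the result also contains ordinary browsers, since anything not on the known list is reported. It is only a shortlist to look through by hand, for example to spot wget or a single agent with an implausible number of page loads.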
Search String: the ranking of search strings shows you the most popular topics of your site and how people search for those pages. This output may give you valuable input for your SEO, by showing which spelling surfers typically use most often - singular or plural, spellings other than the textbook spelling, etc. It may also give you feedback on the least-used search strings = possibly pages with weak SEO in need of correction?
Country: where your visitors come from may be of interest to you if you have a forum or advertisements ( geo-targeting ), or if you apply for certain advertising programs other than Google AdSense. Some require a certain percentage of site traffic to be from US / CA / UK or so. Here you see the origin of your visitors. Again, this data may also give you clues to abuse of your site. A while ago one tiny country suddenly was in the top 10 of the country list. A check of the raw log data showed that one single bot had downloaded tens of thousands of pages within a few days ... causing a fake country statistic. Action: deny the bot, block the IP. Another example was my earlier mentioned China traffic abuse. China suddenly ranked around No. 3 - such a sudden change of rank can never be organic and thus is cause for immediate research and action. However, the country list may also give useful feedback to multilingual webmasters. Certain countries have a higher general interest in certain topics you might offer. Hence, if you master that country's language at least partially, you may offer more content in that language to better serve an interested audience.
Total Sites, or IPs/ISPs from which your traffic comes: this again may give you valuable feedback if a NON-ISP IP ranks high. It might be a download or hotlink site that you may want to block with iptables.
There is other valuable feedback to be obtained from stats - like the least-visited URLs = lack of interest or lack of SEO ... or the screen resolution most often used by your site visitors, to allow you to optimize your page layout.
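For the search-string ranking mentioned above, a small Python sketch could look like this. It assumes you already have the referrer URLs extracted from the log, and that the search phrase sits in a query parameter named `q`, `p` or `query`; those parameter names are an assumption about how the major search engines build their result URLs, so adjust them to what you actually see in your referrers.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

def search_phrases(referrers, params=("q", "p", "query")):
    """Count search phrases found in referrer URLs.

    params lists the query-string parameter names assumed to carry
    the search phrase; the first one present in a URL is used.
    """
    counts = Counter()
    for ref in referrers:
        query = parse_qs(urlsplit(ref).query)
        for name in params:
            if name in query:
                # Normalise case and spacing so spelling variants are
                # compared on equal terms.
                counts[query[name][0].strip().lower()] += 1
                break
    return counts
```

`search_phrases(refs).most_common(20)` then gives the most popular phrases, and the tail of the same list gives the rarely used ones that may point to pages with weak SEO.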
In the end it depends on your skills, knowledge and experience, and on applying common sense with your own brain, to look over the TOTAL stats data and figure out whether something looks wrong and may require additional research or verification.
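One way to get a first hint that something looks wrong is the daily-usage check from earlier in this post: compare each day against the typical day of the month. A minimal Python sketch, assuming Common Log Format timestamps like `[10/Oct/2008:13:55:36 +0000]`; the factor of 3 is an arbitrary illustrative threshold, not a value from any stats tool.

```python
import re
from collections import Counter
from statistics import median

# Assumption: Common Log Format timestamp; captures the dd/Mon/yyyy part.
DAY_RE = re.compile(r'\[(\d{2}/\w{3}/\d{4}):')

def hits_per_day(lines):
    """Count log lines per calendar day."""
    counts = Counter()
    for line in lines:
        match = DAY_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def unusual_days(counts, factor=3.0):
    """Return days with hits above factor * median (possible abuse)
    or below median / factor (possible downtime)."""
    if not counts:
        return {}
    typical = median(counts.values())
    return {day: hits for day, hits in counts.items()
            if hits > typical * factor or hits < typical / factor}
```

The median is used rather than the average so that one extreme day does not shift the baseline it is compared against. Any day this returns is only a candidate; the raw log of that day still needs to be researched as described above.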
Yeah, I am doing most of what you suggested and a lot more, so I don't think I need to change anything. Thanks for your post, it will definitely help me answer the question "why?"