Q: Linux Traffic Log Analyzers (Answered, 5 out of 5 stars, 1 Comment)
Question  
Subject: Linux Traffic Log Analyzers
Category: Computers > Internet
Asked by: respree-ga
List Price: $10.00
Posted: 14 Oct 2002 11:14 PDT
Expires: 13 Nov 2002 10:14 PST
Question ID: 76475
I am looking for links outlining the most widely used (most popular)
Linux-based traffic log analyzer programs.  Specifically, we need
something that will run on a server running Debian.  A top 3 or
top 5 list would be sufficient.  Thanks.
Answer  
Subject: Re: Linux Traffic Log Analyzers
Answered By: bikerman-ga on 15 Oct 2002 07:08 PDT
Rated: 5 out of 5 stars
 
Hello,

This was an interesting question to research because reliable
popularity statistics for this kind of software are hard to come
by.  Different surveys produce different results depending on
exactly who is surveyed.  In particular, I'm very wary of surveys
conducted by a vendor that conclude its own product is the most
popular.  If I understood your question, you are simply trying to
determine the best log analyzer for your own use, and are assuming
that the most popular will probably be the easiest to use, have
the best features, etc.

I found one actual webmaster survey involving log analyzers.  The
survey was conducted by GVU ( http://www.gvu.gatech.edu/ ) back in
1998--that's a long time in the world of software.  I've included
more information on that survey below, but first I'd like to
present what I think is more useful and current information.

You are probably familiar with Freshmeat.net (
http://www.freshmeat.net ), the Web's largest single listing of
software and software-related projects.  The projects on Freshmeat
can be listed by rating and popularity, among other things.  I
listed the category "Topic::Internet::Log Analysis" by rating and
popularity and looked for log analysis tools.  Here are the top 5
analyzers (note that they are all free software):

Sorted by Rating:
http://freshmeat.net/browse/245/?filter=&orderby=rating_DESC&topic_id=245

AWStats
W3Perl
ModLogAn
Webalizer
analog

Sorted by Popularity:
http://freshmeat.net/browse/245/?filter=&orderby=popularity_percent_DESC&topic_id=245

AWStats
Webalizer
analog
ModLogAn
Lire

Based on these lists, one might conclude that the top 4 are
AWStats, Webalizer, ModLogAn, and analog, since those are the
four that appear in both.  Here are the homepages of all six
tools listed above:

AWStats
http://awstats.sourceforge.net/

Webalizer
http://www.mrunix.net/webalizer/

analog
http://www.analog.cx/

ModLogAn
http://jan.kneschke.de/projects/modlogan/

Lire
http://www.logreport.org/

W3Perl
http://www.w3perl.com/softs/
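
One way to arrive at the same four names is to keep only the tools
that appear in both Freshmeat lists and order them by the sum of
their positions.  The sketch below assumes that rule purely for
illustration; the function name combined_ranking is made up for the
example and is not anything Freshmeat provides.

```python
# The two top-5 lists exactly as given above.
by_rating = ["AWStats", "W3Perl", "ModLogAn", "Webalizer", "analog"]
by_popularity = ["AWStats", "Webalizer", "analog", "ModLogAn", "Lire"]


def combined_ranking(first, second):
    """Return the names found in both lists, ordered by summed 1-based rank.

    This scoring rule is an assumption chosen for the example.
    """
    common = set(first) & set(second)

    def rank_sum(name):
        # list.index() is 0-based, so add 1 for each list.
        return (first.index(name) + 1) + (second.index(name) + 1)

    return sorted(common, key=rank_sum)


top = combined_ranking(by_rating, by_popularity)
print(top)
```

With these inputs the rank sums are 2, 6, 7, and 8, so the result
lists AWStats, Webalizer, ModLogAn, and analog in that order.  W3Perl
and Lire drop out because each appears in only one list.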

Now, back to GVU's webmaster survey.  GVU didn't publish any
summary statistics on log analyzers, but the raw datasets for
their 10th WWW Survey can be found here:
http://www.gvu.gatech.edu/user_surveys/survey-1998-10/datasets/

The particular dataset of interest to us is here:
http://www.gvu.gatech.edu/user_surveys/survey-1998-10/datasets/num_webmaster.zip
http://www.gvu.gatech.edu/user_surveys/survey-1998-10/datasets/webmaster_key.table

I wrote a Python script to extract the information I wanted from
the dataset, and here are the results:

Log Analyzer        Percent*
----------------------------
Homemade             36.4%
Other                32.1%
analog               29.1%
WebTrends            23.7%
WebStat              11.9%
wwwstat               8.1%
http-analyze          6.7%
wusage                5.7%
WWWStat4Mac           5.1%
getstats              4.9%
3D Stats              3.5%
I/PRO                 1.6%
Net.Genesis           1.3%
Interse               0.8%
Don't Use             0.0%

*63 of the 434 webmasters surveyed didn't use log analyzers.  The
percentages are of the 371 who did.  They add up to more than 100%
because a respondent could select more than one analyzer.
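
For anyone who would rather see the figures as a share of all 434
respondents, the arithmetic behind the footnote can be sketched as
follows.  The helper name rebase is made up for the example, and
because it starts from the already-rounded percentages in the table,
the rebased values are approximations only.

```python
surveyed = 434   # webmasters in the dataset
non_users = 63   # of those, the ones who use no log analyzer
users = surveyed - non_users  # the base used for the table above


def rebase(percent_of_users):
    """Convert a percentage of analyzer users into a percentage of all
    respondents (approximate, since the input is already rounded)."""
    return round(percent_of_users * users / surveyed, 1)


share_without_analyzer = round(100 * non_users / surveyed, 1)

print(users)                   # 371
print(share_without_analyzer)  # 14.5
print(rebase(29.1))            # analog: 24.9
```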

Here are the URLs to the top 3:

analog
http://www.analog.cx/

WebTrends
http://www.netiq.com/webtrends/default.asp

WebSTAT  (I think this is the right one.)
http://hits.webstat.com/

At the risk of being ridiculed by real programmers, here's the
Python script I wrote :)

------Begin Script---------

#!/usr/bin/env python3

# Script to analyze the GVU webmaster survey results.

infile = "num_webmaster.dat"

# Column index in the tab-separated data file -> log analyzer name.
questions = {
    52: "3D Stats",
    53: "analog",
    54: "getstats",
    55: "http-analyze",
    56: "Interse",
    57: "I/PRO",
    58: "Net.Genesis",
    59: "WebStat",
    60: "WebTrends",
    61: "wusage",
    62: "wwwstat",
    63: "WWWStat4Mac",
    64: "Homemade",
    65: "Other",
    66: "Don't Use",
}

def get_field(line, num):
    """Given a line of input data from the survey, get the field for
    question number num.  num may be a list of field numbers, in which
    case a dictionary of fields will be returned (of the form
    {field_num: field, ...})."""

    if isinstance(num, list):
        res = {}
        for n in num:
            # Key by the individual field number, not the whole list.
            res[n] = get_field(line, n)
        return res
    else:
        # Drop the line ending so the last column compares cleanly.
        fields = line.rstrip("\r\n").split("\t")
        return fields[num]

def get_stats(lines, field_num):
    """Return a list of [yes, no, percent_yes, percent_no] for the given
    field number."""

    yes = 0
    no = 0
    # The first line is skipped; only the lines after it are counted.
    for line in lines[1:]:
        # Comment the next 2 lines out if you want statistics to include
        # webmasters who don't use any log analyzers.
        if get_field(line, 66) == "1":
            continue
        field = get_field(line, field_num)
        if field == "1":
            yes += 1
        elif field == "0":
            no += 1
        # Any other value is ignored.

    total = yes + no
    # Guard against an empty result set to avoid dividing by zero.
    pyes = round(100 * yes / total, 1) if total else 0.0
    pno = 100 - pyes

    return [yes, no, pyes, pno]

# Read in the file.
with open(infile) as f:
    lines = f.readlines()

# Get and sort the percentages.
stats = []
for n, name in questions.items():
    pyes = get_stats(lines, n)[2]
    stats.append((name, pyes))

# Highest percentage first.
stats.sort(key=lambda item: item[1], reverse=True)

for name, pyes in stats:
    print("%-20s\t%4.1f%%" % (name, pyes))

-------End script--------
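
To try the tallying logic without downloading the dataset, here is a
minimal sketch that runs on a tiny made-up sample.  The three-column
layout, the SAMPLE data, and the function name percent_yes are all
hypothetical; the real file has many more columns, and the analyzer
fields sit at indexes 52 through 66 as in the script above.

```python
import csv
import io

# Hypothetical miniature of the survey file: a header row, then one
# tab-separated row per respondent ("1" = uses it, "0" = does not).
SAMPLE = (
    "analog\tWebTrends\tDon't Use\n"
    "1\t0\t0\n"
    "1\t1\t0\n"
    "0\t0\t1\n"
    "0\t0\t0\n"
)


def percent_yes(text, column, skip_column):
    """Percentage of '1' answers in `column`, ignoring any row whose
    `skip_column` is '1' (a respondent who uses no analyzer)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the header row
    yes = no = 0
    for row in reader:
        if row[skip_column] == "1":
            continue
        if row[column] == "1":
            yes += 1
        elif row[column] == "0":
            no += 1
    total = yes + no
    return round(100 * yes / total, 1) if total else 0.0


analog_pct = percent_yes(SAMPLE, 0, 2)
webtrends_pct = percent_yes(SAMPLE, 1, 2)
print(analog_pct, webtrends_pct)
```

The third sample row is dropped because its "Don't Use" column is 1,
which leaves three respondents: two of them use analog and one uses
WebTrends.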


Additional Links:

Logreport.org: Your destination for log analysis on the web
(Also the homepage of Lire.)
http://www.logreport.org/

HTTPd Log Analyzers
http://www.hypernews.org/HyperNews/get/www/log-analyzers.html?inline=1&nogifs

Getting the Most from Your Log Files
http://www.serverwatch.com/tutorials/article.php/10825_1354851_6

Traffic Analysis Software - The Web Developer's Journal
http://webdevelopersjournal.com/columns/analysis.html

The Analog homepage claims that Analog is the most popular
analyzer based on the GVU survey.  Here are their comments on the
matter:
http://www.analog.cx/survey.html


Google Search Terms:

most popular linux traffic log analysers
http://www.google.com/search?q=most%20popular%20linux%20traffic%20log%20analysers&sourceid=opera&num=0

webtrends log analyzer
http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=webtrends+log+analyzer&btnG=Google+Search

WebStat
http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=WebStat&btnG=Google+Search


Good luck,
bikerman-ga
respree-ga rated this answer: 5 out of 5 stars
Awesome! Professional and very thorough. That was much more than I was
expecting. Thanks very much for your research!  Highly recommended.

Comments  
Subject: Linux-based traffic log analyzers...
From: denco-ga on 14 Oct 2002 11:46 PDT
 
http://www.scriptsearch.com/Perl/Scripts_and_Programs/Web_Traffic_Analysis/
http://theperlarchive.com/guide/Site_Access_Statistics/index.shtml
http://theperlarchive.com/guide/Site_Access_Statistics/Server_Logs/index.shtml
