Hello Blackrock,
The product that is probably closest to what you asked for (free,
Linux / Solaris, a single web page for several systems, history) is
Ganglia.
http://ganglia.sourceforge.net/
I use Ganglia to monitor about 50 systems in five clusters, but the
references on the Ganglia site show it scales well to several hundred
systems.
For background, Ganglia has three main parts:
- gmond - the data collection daemon, which runs on each system to be
monitored. A gmond can be set up to be "deaf" and only send data, or
several can act as redundant data collectors (in case a machine goes
down).
- gmetad - a daemon that polls the systems to be monitored and stores
the information it collects (using rrdtool). gmetad can poll gmond
daemons or other gmetad daemons, again for redundancy.
- web front end - a PHP application (I run it under Apache) that
generates the web pages.
This is described in far more detail at
http://ganglia.sourceforge.net/docs/
and in the documentation you get after download.
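To make the gmond / gmetad relationship a bit more concrete, here is a
minimal Python sketch of what a poller does: connect to a gmond, read
the XML it sends, and pull out the per-host metrics. It assumes a
default setup where gmond dumps its XML on TCP port 8649 and uses
HOST and METRIC elements with NAME and VAL attributes; check your own
gmond configuration before relying on either. The function names are
my own and are not part of Ganglia.

```python
import socket
import xml.etree.ElementTree as ET


def read_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump from a gmond TCP port until gmond closes it.

    Port 8649 is assumed to be the default; adjust to your config.
    Returns raw bytes so the XML parser can honor the declared encoding.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def parse_metrics(xml_data):
    """Return {host name: {metric name: value string}} from gmond XML.

    Assumes HOST elements containing METRIC elements, each carrying
    NAME and VAL attributes. Values are left as strings.
    """
    root = ET.fromstring(xml_data)
    result = {}
    for host in root.iter("HOST"):
        metrics = {m.get("NAME"): m.get("VAL") for m in host.iter("METRIC")}
        result[host.get("NAME")] = metrics
    return result
```

Typical use would be parse_metrics(read_gmond_xml("node1")), where
"node1" is a placeholder for one of your monitored hosts. For real
monitoring you would still let gmetad do this, since it also handles
the rrdtool history for you.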
Another alternative is Performance Co-Pilot (PCP)
http://oss.sgi.com/projects/pcp/index.html
which announced Solaris support in February 2003. The data collection
is pretty comprehensive on IRIX and Linux, but I haven't tried the
Solaris version. It does not have a web front end, but the GUI tools
on IRIX (if you have a system available) are VERY nice. The pmie
(inference engine) tool can also generate alarms or take various
actions when the conditions you specify are detected. The archive /
history mechanism is pretty complete as well.
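If the "conditions and actions" idea is unfamiliar, the following
Python sketch shows the general pattern: a rule pairs a condition on a
metric sample with an action to run when the condition holds. This is
only an illustration of the concept. It is NOT pmie's rule syntax, and
the Rule class, the evaluate function, and the metric name load_one
are all made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Sample = Dict[str, float]


@dataclass
class Rule:
    """A hypothetical alarm rule: run `action` when `condition` is true."""
    name: str
    condition: Callable[[Sample], bool]
    action: Callable[[str, Sample], None]


def evaluate(rules: List[Rule], sample: Sample) -> List[str]:
    """Check every rule against one sample and return the names that fired."""
    fired = []
    for rule in rules:
        if rule.condition(sample):
            rule.action(rule.name, sample)
            fired.append(rule.name)
    return fired


if __name__ == "__main__":
    # Example: print an alarm when the one-minute load average exceeds 4.
    rules = [
        Rule(
            name="high_load",
            condition=lambda s: s.get("load_one", 0.0) > 4.0,
            action=lambda name, s: print("ALARM", name, s),
        )
    ]
    evaluate(rules, {"load_one": 5.2})
```

A real tool would run the evaluation on a timer against live data and
would usually send mail or write to syslog instead of printing.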
For other possibilities, I suggest searches such as:
cluster performance measurement
performance measurement solaris
If this answer does not fit the bill in ANY way or you need some more
alternatives, please make a clarification request.
--Maniac