Instead of using the usual bizarre RRD graphs to spot load spikes on customer machines and the like, or digging through piles of logs that show nothing, here's a small script I quickly wrote to log excessive load. It is used in a production environment, and mostly captures ps data for tracking down runaway PHP processes and other silly things.
Requires:
cpan install Unix::Uptime
Code:
#!/usr/bin/env perl
# Basic Load watcher script
# Writes 'ps aux' into a logfile, useful for high-load-that-you-cannot-catch
# Requires Unix::Uptime perl module
#
# Written by: Gary Stanley <gary at DragonFlyBSD dot org>
use strict;
use warnings;
use Unix::Uptime;
use POSIX qw(strftime);
my $time = strftime("%a, %d %b %Y %H:%M:%S %z", localtime(time())); # no trailing newline, so the log marker stays on one line
# $> is the effective UID; check it directly rather than comparing the login name
die "Unable to run: UID of 0 is required, bailing out!\n" if $> != 0;
my ( $l1, $l5, $l15 ) = Unix::Uptime->load();
my $warnlevel = "2.00"; # Warn when the 5 minute load is at or above this value
my $logfile = "loadthingylogger.log"; # the default
my $sleeptime = "30"; # 30 seconds
# Begin sleep process
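while (1) {
    # Refresh the load averages and timestamp on every pass
    ( $l1, $l5, $l15 ) = Unix::Uptime->load();
    $time = strftime("%a, %d %b %Y %H:%M:%S %z", localtime(time()));
    if ($l5 >= $warnlevel) {
        print "Threshold Load Average Reached for $warnlevel:\n";
        print "1 Minute: $l1 5 Minute: $l5 15 Minute: $l15\n";
        open my $fh, '>>', $logfile or die "Cannot open $logfile: $!\n";
        print $fh "-------------- MARKER @ $time LOAD: $l5 -------------\n";
        # Append the full process listing under the marker
        print $fh qx(/bin/ps auwxf);
        close $fh;
    }
    print "zZzZ for $sleeptime : Current load: $l1 $l5 $l15 , Warning on 5 minute load >= $warnlevel\n";
    sleep($sleeptime);
}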
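Once the watcher has been running for a while, the quickest way to see when the spikes happened is to pull the marker lines back out of the log. Here is a minimal sketch of that, assuming the marker format written by the script above. Note that parse_markers is a hypothetical helper of mine, not part of the original script, and the logfile name is just the script's default.

```perl
#!/usr/bin/env perl
# Hypothetical companion helper (not part of the original script):
# lists the spike markers found in the watcher's logfile.
use strict;
use warnings;

# Return a list of [timestamp, load] pairs, one per marker in $text.
sub parse_markers {
    my ($text) = @_;
    my @markers;
    # \s+ before LOAD also tolerates a newline after the timestamp
    while ( $text =~ /MARKER \@ (.+?)\s+LOAD: ([\d.]+)/g ) {
        push @markers, [ $1, $2 ];
    }
    return @markers;
}

# Assumption: the log lives at the script's default name unless a path is given
my $logfile = @ARGV ? $ARGV[0] : 'loadthingylogger.log';
if ( -e $logfile ) {
    open my $fh, '<', $logfile or die "Cannot open $logfile: $!\n";
    my $text = do { local $/; <$fh> };    # slurp the whole file
    close $fh;
    printf "%s  load %s\n", @$_ for parse_markers($text);
}
```

Run it with the log path as the only argument, or from the directory holding the log. The ps output between markers is left alone, so you still open the log itself to see what was running at the time.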