I have a few suggestions to make squidanalyzer better and a little faster.
Please review them and implement whichever you find useful.
Thanks.
- Makefile.PL should use --skip-alias for which
i.e.
my $zcat = `which --skip-alias zcat`;
my $bzcat = `which --skip-alias bzcat`;
- squid-analyzer should ignore the HUP signal just before the new instance is created.
Please add:
$SIG{'HUP'} = 'IGNORE';
before:
my $sa = new SquidAnalyzer($configfile, $logfile, $debug);
This allows a (long) process to keep running even if your terminal closes or disconnects (otherwise the HUP signal kills the process).
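A minimal standalone sketch of the effect, assuming a Unix-like system. The script sends HUP to itself purely for demonstration; the $survived variable is only illustrative and is not part of squid-analyzer:

```perl
use strict;
use warnings;

# Ignore SIGHUP so the process is not killed when the terminal goes away.
$SIG{'HUP'} = 'IGNORE';

# Demonstration only: deliver HUP to this very process.
# Without the line above, the default action would terminate the script here.
kill 'HUP', $$;

# Reaching this point means the signal was ignored.
my $survived = 1;
print "still running after HUP\n";
```

Running the tool under nohup has a similar effect, but setting the handler in the script means users do not have to remember to do so.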
- A patch (SquidAnalyzer.pm) makes a few minor changes, as follows:
Link to patch:
https://gist.github.com/ammdispose/5459542
a) Pre-increment (++$i) can be slightly faster than post-increment ($i++) where the old value is not needed. Since the program processes lots of things recursively (possibly hundreds of thousands of increments), it is suggested to use pre-increment.
b) In some places, variables passed to functions are quoted, which is not necessary. For example:
$self->_save_data("$self->{last_year}", "$self->{last_month}", "$self->{last_day}");
should be
$self->_save_data($self->{last_year}, $self->{last_month}, $self->{last_day});
and
$self->_save_data("$1", "$2");
should be
$self->_save_data($1, $2);
etc.
c) Some print statements were writing to STDOUT instead of STDERR; this is fixed in the patch.
d) Use of WebUrl is eliminated; the patch adds a DirLevel counter to decide relative URLs (for JavaScript/CSS/images).
This allows you to download the squidanalyzer directory and view it later in offline mode without breaking links. With WebUrl, offline viewing was not possible.
It also allows you to migrate reports from one server to another (even if your IP/BaseURL changes) without breaking links.
e) Sacrifice LANG, locale, etc. and the date/iconv programs
In _print_header, date/iconv were spawning 300-600 processes (or maybe many more) when the number of users is high.
I think this was done just for one print statement in the report, and it made the program much slower. If we just use strftime, HTML pages are generated almost 4-5 times faster. So I think that's a good sacrifice to make.
f) localtime is an expensive function; it should be called as few times as possible.
g) Backticks consume more resources than system. Internally, for backticks Perl needs to create a pipe, read the output, and store it in memory. So system should be used if you don't need to read the program's output.
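To illustrate points d) to f), here is a rough sketch rather than the code from the patch. The names rel_prefix, format_stamp and style.css are hypothetical; only POSIX strftime and localtime are real Perl functions:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# d) Build a relative prefix from the directory depth instead of an
#    absolute WebUrl. rel_prefix() is a hypothetical helper name.
sub rel_prefix {
    my ($dir_level) = @_;
    return '../' x $dir_level;
}

# e) Format a timestamp in-process with strftime instead of spawning
#    the external date/iconv programs. format_stamp() is hypothetical.
sub format_stamp {
    my @tm = @_;    # fields in the order returned by localtime()
    return strftime('%Y-%m-%d %H:%M:%S', @tm);
}

# f) Call localtime once and reuse the result for every page.
my @now = localtime(time);
my $generated = format_stamp(@now);

# A page two directories deep (e.g. year/month) would link like this;
# style.css is a placeholder file name.
my $css_link = rel_prefix(2) . 'style.css';

print "Generated: $generated\n";
print "CSS link:  $css_link\n";
```

Because the links depend only on directory depth, the report tree can be copied to another host or opened from disk without editing them.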
Hope this makes sense.