A few days ago a colleague asked me about using Perl to analyse some ASA firewall logs, in order to work out how many public addresses are needed for NATting users towards the Internet. The regular expression that captures the bits of information he needs is quite straightforward; the interesting part is that the files he has to work on are gzipped, and he had already extracted a sample one to work on. I remembered that there is IO::Zlib, and this is what I did:
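use strict;
use warnings;
use English qw( -no_match_vars );    # provides $EVAL_ERROR and $OS_ERROR
use IO::Zlib;

for my $file (@ARGV) {
    eval {
        my $fh = _open($file);
        while (<$fh>) {
            # (\S+) instead of a trailing (.*?), which would always capture the empty string
            my ($inside, $outside) = /Built\ dynamic\ translation\ from\ inside:(\S+)\ to\ outside:(\S+)/mxs
                or next;
            # use $inside and $outside
        }
        close $fh;
        1;    # make sure a successful run returns true to the "or warn"
    } or warn "exception for '$file': $EVAL_ERROR";
}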
sub _open {
    my ($file) = @_;
    my $fh;
    if ($file =~ /\.gz \z/mxs) {
        $fh = IO::Zlib->new();
        $fh->open($file, 'rb')
            or die "IO::Zlib complained: $OS_ERROR";
    }
    else {
        open $fh, '<', $file
            or die "open(): $OS_ERROR";
    }
    return $fh;
}
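To get from the placeholder comment to an actual answer, something along these lines might do. It is only a sketch: count_outside is a made-up name, and the sample lines are invented around the phrase the regular expression looks for, not copied from real ASA output.

```perl
use strict;
use warnings;

# Hypothetical helper: for each outside (public) address, remember which
# inside addresses were translated to it.
sub count_outside {
    my (@lines) = @_;
    my %inside_for;
    for my $line (@lines) {
        my ($inside, $outside) = $line =~ m{
            Built\ dynamic\ translation\ from\ inside:(\S+)
            \ to\ outside:(\S+)
        }mxs or next;
        $inside_for{$outside}{$inside} = 1;
    }
    return \%inside_for;
}

# Made-up input, just to exercise the helper.
my @sample = (
    'Built dynamic translation from inside:10.0.0.1 to outside:192.0.2.10',
    'Built dynamic translation from inside:10.0.0.2 to outside:192.0.2.10',
    'Built dynamic translation from inside:10.0.0.3 to outside:192.0.2.11',
    'some unrelated line',
);
my $seen = count_outside(@sample);
printf "%d public addresses in use\n", scalar keys %{$seen};
# prints "2 public addresses in use"
```

In the real script the loop body would feed each line read from $fh into the same hash, and the number of keys at the end is the figure my colleague is after.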
It worked pretty well, so nothing to complain about. Just before blogging about it I paid a due visit to the documentation, and discovered that I had been more or less lucky: there are limitations in using the module, which basically boil down to $fh not being what you expect from a fully fledged filehandle. At least it should work out of the box if all you need is to read the file one line at a time.
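A quick way to convince yourself of the line-at-a-time case is a round trip through a temporary file. This is a minimal sketch, assuming IO::Zlib is installed; write_gz and read_gz_lines are names I made up for the occasion, and the file contents are dummy data.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );
use IO::Zlib;

# Hypothetical helper: write the given lines to a gzipped file.
sub write_gz {
    my ($file, @lines) = @_;
    my $out = IO::Zlib->new($file, 'wb')
        or die "cannot write '$file'";
    print {$out} @lines;
    $out->close();
    return;
}

# Hypothetical helper: read a gzipped file back, one line at a time.
sub read_gz_lines {
    my ($file) = @_;
    my $in = IO::Zlib->new($file, 'rb')
        or die "cannot read '$file'";
    my @lines;
    while (my $line = <$in>) {
        push @lines, $line;
    }
    $in->close();
    return @lines;
}

# Temporary file name; the file is removed when the program exits.
my (undef, $file) = tempfile(SUFFIX => '.gz', UNLINK => 1);
write_gz($file, "first\n", "second\n");
my @lines = read_gz_lines($file);
print scalar(@lines), " lines read\n";
```

It only exercises the plain while (<$fh>) loop, which is all the log-crunching script needs; anything fancier is where the documented limitations would start to matter.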
The module doesn’t seem to be in the core distribution, but it’s a common prerequisite, so chances are that you already have it in your distro. It’s a bit weird, though, that corelist reports it as included since 5.9.3:
IO::Zlib was first released with perl 5.009003
even though I can find no trace of it in 5.10. Go figure. Anyway, it should be a bit more common to find than the alternative, PerlIO::gzip, which would make the sub _open unnecessary by replacing it with this:
use PerlIO::gzip;
open my $fh, '<:gzip(autopop)', $file or die '...';
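Here is a slightly fuller sketch of the layer-based approach. It assumes the layer is spelled :gzip with autopop as its argument, and that autopop makes the layer step aside when the file turns out not to be gzipped, which is how I read the PerlIO::gzip documentation. open_log is a hypothetical helper name, and the fallback for machines without the module is my own addition.

```perl
use strict;
use warnings;

# Is the optional PerlIO::gzip module installed?
my $have_gzip_layer = eval { require PerlIO::gzip; 1 } ? 1 : 0;

# open_log() is a hypothetical helper, not something PerlIO::gzip provides.
sub open_log {
    my ($file) = @_;

    # Without the layer we can still read plain files, but not gzipped ones.
    die "PerlIO::gzip is needed to read '$file'\n"
        if !$have_gzip_layer && $file =~ /\.gz\z/;

    # With autopop the same open() call should cope with uncompressed files too.
    my $mode = $have_gzip_layer ? '<:gzip(autopop)' : '<';
    open my $fh, $mode, $file
        or die "open('$file'): $!";
    return $fh;
}

# Count the lines of every file named on the command line.
for my $file (@ARGV) {
    my $fh    = open_log($file);
    my $count = 0;
    $count++ while <$fh>;
    close $fh;
    print "$file: $count lines\n";
}
```

The nice part is that what comes back is an ordinary filehandle with a layer on it, not a tied object.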
I wonder how widely Perl IO layers are used out there.