#47387 improve logconv.pl performance with large access logs
Closed: wontfix. Opened 10 years ago by rmeggins.

Analysis of large access logs needs to be much faster. Some areas for improvement:

  • use db files for temp files

Specifically, use hashes and arrays tied to database files through the Perl DB_File interface:
- for simple arrays, use DB_RECNO
- for hashes where order is not important, use DB_HASH
- for hashes where order is important, use DB_BTREE
For example:

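use DB_File;   # exports $DB_BTREE, $DB_HASH and $DB_RECNO
use Fcntl;     # exports O_CREAT and O_RDWR

# $dbdir is assumed to name an existing, writable temp directory
my %h1;
tie %h1, "DB_File", "$dbdir/h1.db", O_CREAT|O_RDWR, 0666, $DB_BTREE
    or die "cannot tie $dbdir/h1.db: $!";
$h1{'e'} = 5;
$h1{'d'} = 4;
$h1{'c'} = 3;
$h1{'b'} = 2;
$h1{'a'} = 1;
# each() walks the btree, so keys come back sorted, not in insertion order
while (my ($k, $v) = each %h1) {
    print "$k = $v\n";
}
untie %h1;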

This prints the entries in key order, because DB_BTREE keeps its keys sorted:

a = 1
b = 2
c = 3
d = 4
e = 5

  • not sure what else; perhaps optimize the regular expressions?
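
The DB_RECNO and DB_HASH cases from the list above could be sketched in the same way. This is only an illustration, not code from logconv.pl: the `count_ops` helper, the file names, and the simplified log records are all made up for the example, and the scratch directory comes from File::Temp.

```perl
use strict;
use warnings;
use DB_File;                  # exports $DB_RECNO and $DB_HASH
use Fcntl;                    # exports O_CREAT and O_RDWR
use File::Temp qw(tempdir);

# Hypothetical helper: counts the last word of each record, keeping both
# the record list and the counters in database files under $dbdir.
sub count_ops {
    my ($dbdir, @records) = @_;

    # DB_RECNO: looks like a normal array, but elements live in a file.
    tie my @lines, "DB_File", "$dbdir/lines.db", O_CREAT|O_RDWR, 0666, $DB_RECNO
        or die "cannot tie $dbdir/lines.db: $!";
    push @lines, @records;

    # DB_HASH: counters on disk; iteration order is unspecified.
    tie my %op_count, "DB_File", "$dbdir/ops.db", O_CREAT|O_RDWR, 0666, $DB_HASH
        or die "cannot tie $dbdir/ops.db: $!";

    foreach my $line (@lines) {
        my ($op) = $line =~ /(\w+)$/;   # last word is the operation name
        next unless defined $op;
        $op_count{$op}++;
    }

    my %result = %op_count;   # copy into memory before untying
    untie %op_count;
    untie @lines;
    return \%result;
}

# Made-up, simplified records; real access log lines carry more fields.
my $counts = count_ops(tempdir(CLEANUP => 1),
    "conn=1 op=0 BIND", "conn=1 op=1 SRCH", "conn=2 op=0 BIND");

# Sort explicitly, because DB_HASH does not keep keys ordered.
foreach my $op (sort keys %$counts) {
    print "$op = $counts->{$op}\n";
}
```

Each call is given a fresh directory here because the files are opened with O_CREAT only, so an existing ops.db would carry its earlier counts forward.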

Looks good, but one indentation issue:

1009                                    printf "%10s - Connections\n", $ip_hash->{ $key };
1010                    my %counts;
1011                    map { $counts{$_} = $hashes->{$_}->{$key} if (defined($hashes->{$_}->{$key})) } @conncodes;
1012                                    foreach my $code (sort { $counts{$b} <=> $counts{$a} } keys %counts) {

New patch attached - fixes formatting issues - also fixes some other warnings found during testing with -m.

commit 0d97d63
Author: Rich Megginson <rmeggins@redhat.com>
Date: Mon Jun 10 20:04:20 2013 -0600

To ssh://git.fedorahosted.org/git/389/ds.git
989a30b..8a23f5e master -> master
commit 8a23f5e
Author: Rich Megginson <rmeggins@redhat.com>
Date: Tue Jul 2 14:38:49 2013 -0600

To ssh://git.fedorahosted.org/git/389/ds.git
fbece32..313dd8e 389-ds-base-1.2.11 -> 389-ds-base-1.2.11
commit 0163575
Author: Rich Megginson <rmeggins@redhat.com>
Date: Tue Jul 2 14:38:49 2013 -0600
commit b785fc2
Author: Rich Megginson <rmeggins@redhat.com>
Date: Mon Jun 10 20:04:20 2013 -0600

To ssh://git.fedorahosted.org/git/389/ds.git
c96eaa0..9103b3e 389-ds-base-1.3.0 -> 389-ds-base-1.3.0
commit 3446810
Author: Rich Megginson <rmeggins@redhat.com>
Date: Tue Jul 2 14:38:49 2013 -0600
commit 481f2e3
Author: Rich Megginson <rmeggins@redhat.com>
Date: Mon Jun 10 20:04:20 2013 -0600

To ssh://git.fedorahosted.org/git/389/ds.git
ff9a292..e38685b 389-ds-base-1.3.1 -> 389-ds-base-1.3.1
commit 7a107bd
Author: Rich Megginson <rmeggins@redhat.com>
Date: Tue Jul 2 14:38:49 2013 -0600
commit 8e35cc8
Author: Rich Megginson <rmeggins@redhat.com>
Date: Mon Jun 10 20:04:20 2013 -0600

Metadata Update from @rmeggins:
- Issue assigned to rmeggins
- Issue set to the milestone: 1.3.2 - 06/13 (June)

7 years ago

389-ds-base is moving from Pagure to GitHub. This means that new issues and pull requests
will be accepted only in 389-ds-base's GitHub repository.

This issue has been cloned to GitHub and is available here:
- https://github.com/389ds/389-ds-base/issues/724

If you want to receive further updates on the issue, please navigate to the GitHub issue
and click the subscribe button.

Thank you for understanding. We apologize for any inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

3 years ago
