#47428 Memory leak in 389-ds-base 1.2.11.15
Closed: wontfix None Opened 10 years ago by via.

We are experiencing a memory leak when 389 (compiled with openldap support) runs for a very long period of time with many searches and binds. Running valgrind (two valgrind traces attached), I see the leak appears in liblber. Specifically, connection_table_new calls ber_sockbuf_alloc, but each new connection also calls ber_sockbuf_add_io which appears to malloc memory. ber_sockbuf_remove_io is never called, and when 389 shuts down it calls ber_sockbuf_free. I hackily added a ber_sockbuf_remove_io call to connection_cleanup in connection.c and the memory leak no longer shows. The attached diff is meant to illustrate the change, definitely not the best way to fix the problem.

3266.log - a run with about 8k searches/binds
3479.log - a run with about 16k searches/binds

Note the:
==3479== 683,920 bytes in 17,098 blocks are still reachable in loss record 3,066 of 3,069
==3479== at 0x4C279EE: malloc (vg_replace_malloc.c:270)
==3479== by 0x5EDBFCB: ber_memalloc_x (in /lib64/liblber-2.4.so.2.5.6)
==3479== by 0x5EDE7C1: ber_sockbuf_add_io (in /lib64/liblber-2.4.so.2.5.6)
==3479== by 0x418A68: slapd_daemon (daemon.c:2636)
==3479== by 0x41F16E: main (main.c:1253)

4904.log - run of about 16k connections, no longer containing the above trace
sockbuf.patch - diff adding the remove_io call.


It's not a memory leak in the sense of "memory is allocated and never freed" - instead, what happens is that as Sockbufs are reused, each one gets pushed to the top of the stack of previous sockbufs - if you run gdb and put a break point in ber_sockbuf_add_io you will see that the linked list sb->sb_iod->sbiod_next gets longer and longer. Eventually, at shutdown time, all of these are freed, which is why we never see this in "regular" valgrind runs.

Looking at the api - although calling ber_sockbuf_remove_io will work in this case, the "proper" thing to do is to call ber_sockbuf_free() when the connection closes, and call ber_sockbuf_alloc to allocate a brand new Sockbuf. ber_sockbuf_free will call ber_int_sb_close to close all of the sockbuf layers, then call ber_int_sb_destroy to destroy all the sockbuf layers. However, in 389 code, we only ever have 1 layer - we use PRFileDesc for TLS/SSL and SASL layers - so it should be safe to use ber_sockbuf_remove_io

Yeah, the list growing longer and longer probably isn't a problem for most people, but we have multiple thousand binds per second and over weeks it will eventually be killed by the OOM killer. If you think the right way to deal with this is to free upon connection close and re-alloc on a new one, I can have a patch for that tomorrow. I just figured due to the way that 389 alloc's all of the sockbuf's upon initialization that it was ideal to add/remove_io on connection new/closed. Would you recommend changing connection_cleanup to free it and then the new connection re-alloc if necessary?

Replying to [comment:2 via]:

Yeah, the list growing longer and longer probably isn't a problem for most people, but we have multiple thousand binds per second and over weeks it will eventually be killed by the OOM killer. If you think the right way to deal with this is to free upon connection close and re-alloc on a new one, I can have a patch for that tomorrow. I just figured due to the way that 389 alloc's all of the sockbuf's upon initialization that it was ideal to add/remove_io on connection new/closed. Would you recommend changing connection_cleanup to free it and then the new connection re-alloc if necessary?

No, I think the way you implemented it is fine - would rather use ber_sockbuf_remove_io - I've tested the patch with gdb and it looks good.

I've cleaned up the patch slightly.

What name/email address do you want me to use for the patch attribution?

Matthew Via (matthew.via@mailtrust.com).

I was going to be sending a somewhat cleaner patch that would allow the struct to remain static (and use a helper function) -- if you still want that I can have it ready in a few minutes. If you're fine with your cleaned up version thats also fine.

Eliminate the need to make the local static structure globally accessible
sockbuf.2.patch

Replying to [comment:7 via]:

Matthew Via (matthew.via@mailtrust.com).

I was going to be sending a somewhat cleaner patch that would allow the struct to remain static (and use a helper function) -- if you still want that I can have it ready in a few minutes. If you're fine with your cleaned up version thats also fine.

I like your patch better - ack - do you want the Author in the git commit to be Matthew Via matthew.via@mailtrust.com ?

Yeah, thats fine. Thank you.

42abece..6e0a8ef 389-ds-base-1.2.11 -> 389-ds-base-1.2.11
commit 6e0a8ef
Author: Matthew Via matthew.via@mailtrust.com
Date: Wed Jul 10 11:30:57 2013 -0600
3108793..b18ee0b 389-ds-base-1.3.0 -> 389-ds-base-1.3.0
commit b18ee0b
Author: Matthew Via matthew.via@mailtrust.com
Date: Wed Jul 10 11:30:57 2013 -0600
ff7fb87..ed26da0 389-ds-base-1.3.1 -> 389-ds-base-1.3.1
commit ed26da0
Author: Matthew Via matthew.via@mailtrust.com
Date: Wed Jul 10 11:30:57 2013 -0600
dace0f1..df93b03 master -> master
commit df93b03
Author: Matthew Via matthew.via@mailtrust.com
Date: Wed Jul 10 11:30:57 2013 -0600

Metadata Update from @via:
- Issue assigned to rmeggins
- Issue set to the milestone: 1.2.11.22

7 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/765

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

3 years ago

Login to comment on this ticket.

Metadata