Our stack traces contain many threads, most of them idle. This makes it hard for a new user or a support engineer to identify issues quickly.
In addition, because of our access log buffering, there are often logs in memory related to a crash that are inaccessible. We give steps to display these on the wiki, but the process is not very friendly.
Instead, we should provide a Python script for gdb that allows dumping this information.
<img alt="0001-Ticket-49242-add-gdb-script-to-rpm.patch" src="/389-ds-base/issue/raw/files/0d99104f1772feeefe3cc4fdc6d2d85ee00a02bad893119d199099abc8d34aa1-0001-Ticket-49242-add-gdb-script-to-rpm.patch" />
Metadata Update from @firstyear: - Custom field reviewstatus adjusted to review - Custom field type adjusted to defect - Issue assigned to firstyear
I'm not sure we can always consider a thread in DS_Sleep as idle. It could, e.g., be a backend op sleeping between txn retries, and we should not lose these threads.
Metadata Update from @mreynolds: - Issue set to the milestone: 1.3.7.0
Yeah, that makes sense. I can remove DS_Sleep from the pattern match then.
<img alt="0001-Ticket-49242-add-gdb-script-to-rpm.patch" src="/389-ds-base/issue/raw/files/cadb8ed0d1132bc6bb8dc51651fab7942de250718d6191cc19858422939e73cc-0001-Ticket-49242-add-gdb-script-to-rpm.patch" />
I'm still thinking about the best approach; I do not really like removing threads. Couldn't we instead collapse identical stack traces and report each one once, saying:
thread xx and 22 other threads: wait_on_new_work ...
thread yy and 3 other threads: dblayer_lock ...
thread zz ...
thread a ...
...
<img alt="0001-Ticket-49242-add-gdb-script-to-rpm.patch" src="/389-ds-base/issue/raw/files/aba0f5745e15cfb12980478626a4b9eb76846c8c5b1a21c7e4f1b175b51cb194-0001-Ticket-49242-add-gdb-script-to-rpm.patch" />
@vashirov Is /usr/lib64/dirsrv/python/ds_gdb.py okay for the rpm?
@lkrispen This update changes how it works.
In a gdb session type:
source /usr/lib64/dirsrv/python/ds_gdb.py
This script does three things. First, it changes the back trace output to show what thread might be idle as:
#0 0x00007ffff3ff981b in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x00007ffff422e630 in PR_WaitCondVar () at /lib64/libnspr4.so
#2 0x0000000000424914 in [IDLE THREAD] connection_wait_for_new_work (pb=0x60800046c020, interval=4294967295) at /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:966
#3 0x0000000000428380 in connection_threadmain () at /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:1533
#4 0x00007ffff4233fcb in _pt_root () at /lib64/libnspr4.so
#5 0x00007ffff3ff3369 in start_thread () at /lib64/libpthread.so.0
#6 0x00007ffff38cbd0f in clone () at /lib64/libc.so.6
Note the [IDLE THREAD] addition in the stack.
Second, it adds a command called ds-access-log which will show the contents of the in memory access log.
Finally, it adds a command called ds-backtrace which prints a backtrace, but automatically detects threads that share the same stack and coalesces them together. For example:
Thread 58 (LWP 3372))
Thread 57 (LWP 3371))
Thread 56 (LWP 3370))
Thread 55 (LWP 3369))
Thread 54 (LWP 3368))
Thread 53 (LWP 3367))
Thread 52 (LWP 3366))
Thread 51 (LWP 3365))
Thread 50 (LWP 3364))
Thread 49 (LWP 3363))
Thread 48 (LWP 3362))
Thread 47 (LWP 3361))
Thread 46 (LWP 3360))
Thread 45 (LWP 3359))
Thread 44 (LWP 3358))
Thread 43 (LWP 3357))
Thread 42 (LWP 3356))
Thread 41 (LWP 3355))
Thread 40 (LWP 3354))
Thread 39 (LWP 3353))
Thread 38 (LWP 3352))
Thread 37 (LWP 3351))
Thread 36 (LWP 3350))
Thread 35 (LWP 3349))
#0 0x00007ffff3ff981b in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1 0x00007ffff422e630 in PR_WaitCondVar () at /lib64/libnspr4.so
#2 0x0000000000424914 in [IDLE THREAD] connection_wait_for_new_work (pb=0x60800046c020, interval=4294967295) at /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:966
#3 0x0000000000428380 in connection_threadmain () at /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:1533
#4 0x00007ffff4233fcb in _pt_root () at /lib64/libnspr4.so
#5 0x00007ffff3ff3369 in start_thread () at /lib64/libpthread.so.0
#6 0x00007ffff38cbd0f in clone () at /lib64/libc.so.6
Because of how these changes were made, the backtrace and other commands still work as usual without the coalescing.
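For anyone following along, the coalescing and idle-marking ideas can be sketched in plain Python, outside of gdb. This is not the code from the patch: the names IDLE_FUNCTIONS, label_frame, coalesce, render and SAMPLE_THREADS are made up for illustration, and the input is a simplified list of function names rather than real gdb frame objects. Only connection_wait_for_new_work is taken from the trace above as an idle marker.

```python
from collections import OrderedDict

# Assumption: a single idle marker, taken from the example trace above.
# The real script may match a different or larger set of functions.
IDLE_FUNCTIONS = {"connection_wait_for_new_work"}


def label_frame(function_name):
    """Prefix a frame's function name when it suggests an idle thread."""
    if function_name in IDLE_FUNCTIONS:
        return "[IDLE THREAD] " + function_name
    return function_name


def coalesce(threads):
    """Group threads whose stacks are identical.

    threads: iterable of (thread_label, [function names, innermost first]).
    Returns a list of (thread_labels, labelled_frames) in first-seen order.
    """
    groups = OrderedDict()
    for label, frames in threads:
        # The tuple of function names acts as the stack's identity.
        groups.setdefault(tuple(frames), []).append(label)
    return [(labels, [label_frame(f) for f in key])
            for key, labels in groups.items()]


def render(groups):
    """Print-style rendering: thread labels first, then the shared stack once."""
    lines = []
    for labels, frames in groups:
        lines.extend(labels)
        for index, frame in enumerate(frames):
            lines.append("#%d %s" % (index, frame))
    return "\n".join(lines)


# Hypothetical sample data, loosely modelled on the trace above.
IDLE_STACK = ["pthread_cond_wait", "PR_WaitCondVar",
              "connection_wait_for_new_work", "connection_threadmain"]
SAMPLE_THREADS = [
    ("Thread 58 (LWP 3372)", IDLE_STACK),
    ("Thread 57 (LWP 3371)", IDLE_STACK),
    ("Thread 1 (LWP 3300)", ["poll", "main"]),
]

print(render(coalesce(SAMPLE_THREADS)))
```

Note that no thread is dropped here: idle threads are only labelled and grouped, which is the behaviour asked for earlier in this ticket.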
If you want it to be auto loaded while running gdb against ns-slapd, file name should be objfile-gdb.py where objfile is ns-slapd and it should be in the auto-load path for gdb (in this case /usr/share/gdb/auto-load/usr/sbin/ns-slapd-gdb.py), see https://sourceware.org/gdb/onlinedocs/gdb/objfile_002dgdbdotext-file.html#objfile_002dgdbdotext-file
The path /usr/share/gdb/auto-load belongs to many packages according to yum provides "/usr/share/gdb/auto-load/*", so I think it's pretty safe to do the same trick as others do:
http://pkgs.fedoraproject.org/cgit/rpms/gcc.git/tree/gcc.spec#n1451
http://pkgs.fedoraproject.org/cgit/rpms/mono.git/tree/mono.spec#n356
http://pkgs.fedoraproject.org/cgit/rpms/libreoffice.git/tree/libreoffice.spec#n1355
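To make the naming rule concrete, here is a small sketch of how the auto-load script location is derived from the binary's path. The helper name auto_load_script_path is made up for this example, and the default root is the /usr/share/gdb/auto-load directory mentioned above; a given gdb build may be configured with a different auto-load directory. gdb works from the resolved (real) path of the objfile, while this sketch only normalises the string so that it gives the same answer on any machine.

```python
import posixpath

# Assumption: the auto-load root discussed in this ticket; check
# "show auto-load scripts-directory" in gdb for the value on a given system.
AUTO_LOAD_ROOT = "/usr/share/gdb/auto-load"


def auto_load_script_path(objfile, root=AUTO_LOAD_ROOT):
    """Return where an objfile-gdb.py script would be placed for objfile.

    objfile must be an absolute path. Symlinks are NOT resolved here,
    unlike in gdb, which uses the real path of the binary.
    """
    if not posixpath.isabs(objfile):
        raise ValueError("objfile must be an absolute path: %r" % objfile)
    normalized = posixpath.normpath(objfile)
    # root + absolute objfile path + "-gdb.py" suffix
    return root + normalized + "-gdb.py"


print(auto_load_script_path("/usr/sbin/ns-slapd"))
```

If /usr/sbin is a symlink on the target system, the script would need to sit under the resolved directory instead, so it is worth checking the result against "info auto-load python-scripts" in a live gdb session.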
Also I'm getting errors while I'm trying to use it
NameError: name 'DSErrorLog' is not defined
I guess you meant DSAccessLog()
Another thing, you have
+ $(srcdir)/ldap/admin/src/scripts/ds-replcheck \
which is not related to this change.
Thanks!
@vashirov Is /usr/lib64/dirsrv/python/ds_gdb.py okay for the rpm? If you want it to be auto loaded while running gdb against ns-slapd, file name should be objfile-gdb.py where objfile is ns-slapd and it should be in the auto-load path for gdb (in this case /usr/share/gdb/auto-load/usr/sbin/ns-slapd-gdb.py), see https://sourceware.org/gdb/onlinedocs/gdb/objfile_002dgdbdotext-file.html#objfile_002dgdbdotext-file Path /usr/share/gdb/auto-load belongs to many packages according to yum provides "/usr/share/gdb/auto-load/*", so I think it's pretty safe to do the same trick as other do: http://pkgs.fedoraproject.org/cgit/rpms/gcc.git/tree/gcc.spec#n1451 http://pkgs.fedoraproject.org/cgit/rpms/mono.git/tree/mono.spec#n356 http://pkgs.fedoraproject.org/cgit/rpms/libreoffice.git/tree/libreoffice.spec#n1355
Mate, I could not work out where it was meant to go. That's genius. I'll try and get it to go there.
Also I'm getting errors while I'm trying to use it NameError: name 'DSErrorLog' is not defined
Oops! Why was this working for me, I wonder...
I guess you meant DSAccessLog() Another thing, you have + $(srcdir)/ldap/admin/src/scripts/ds-replcheck \ which is not related to this change. Thanks!
Yep, I think I was trying to solve something and forgot to take that out.
<img alt="0001-Ticket-49242-add-gdb-script-to-rpm.patch" src="/389-ds-base/issue/raw/files/5ff4e26b52b3fa5cb4df7d4e19a435097b25967c7250330ab0e671009b095d49-0001-Ticket-49242-add-gdb-script-to-rpm.patch" />
@vashirov This adds the script to the gdb autoload path as you described. It adds two commands, ds-backtrace and ds-access-log. Finally, it has a stack filter that indicates on backtraces (both ds-backtrace and bt) if a stack may be idle.
Just one minor thing: File name changed, usage needs an update:
+# Usage: from within gdb call:
+# source ds_gdb.py
+#
The rest looks good to me, but I'll leave the acking to developers.
I updated the usage for you :)
Metadata Update from @vashirov: - Custom field reviewstatus adjusted to ack (was: review)
commit 5ecd8ec To ssh://git@pagure.io/389-ds-base.git c2c512e..5ecd8ec master -> master
Metadata Update from @firstyear: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/2301
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: fixed)