Today's nightly hasn't started because the build never completed. When I logged in to the machine, the build process was waiting on the cmocka tests. test_nuncstans was hanging:
[root@server 389-ds-base-1.4.0.16.20180925git4f118f4]# strace -p 9193
strace: Process 9193 attached
futex(0x7fefc78229d0, FUTEX_WAIT, 23669, NULL
pstack:
[root@server ~]# pstack 9193
Thread 3 (Thread 0x7fefc581e700 (LWP 9236)):
#0 0x00007fefc860218f in epoll_wait () from target:/lib64/libc.so.6
#1 0x00007fefc9af3e39 in epoll_dispatch () from target:/lib64/libevent-2.1.so.6
#2 0x00007fefc9ae9b18 in event_base_loop () from target:/lib64/libevent-2.1.so.6
#3 0x00007fefc9d21102 in ns_event_fw_loop (ns_event_fw_ctx=<optimized out>) at src/nunc-stans/ns/ns_event_fw_event.c:308
#4 0x00007fefc9d1fc19 in event_loop_thread_func (arg=0x55bf4ad75f40) at src/nunc-stans/ns/ns_thrpool.c:581
#5 0x00007fefc8f17594 in start_thread () from target:/lib64/libpthread.so.0
#6 0x00007fefc8601e6f in clone () from target:/lib64/libc.so.6
Thread 2 (Thread 0x7fefc7822700 (LWP 9232)):
#0 0x00007fefc8f1d51c in pthread_cond_wait@@GLIBC_2.3.2 () from target:/lib64/libpthread.so.0
#1 0x00007fefc9d1ff74 in work_q_wait (tp=0x55bf4ad75f40) at src/nunc-stans/ns/ns_thrpool.c:344
#2 worker_thread_func (arg=<optimized out>) at src/nunc-stans/ns/ns_thrpool.c:396
#3 0x00007fefc8f17594 in start_thread () from target:/lib64/libpthread.so.0
#4 0x00007fefc8601e6f in clone () from target:/lib64/libc.so.6
Thread 1 (Thread 0x7fefca344d00 (LWP 9193)):
#0 0x00007fefc8f18a2d in __pthread_timedjoin_ex () from target:/lib64/libpthread.so.0
#1 0x00007fefc9d20d17 in ns_thrpool_wait (tp=0x55bf4ad75f40) at src/nunc-stans/ns/ns_thrpool.c:1619
#2 0x000055bf49ff1b18 in ns_test_teardown (state=<optimized out>) at src/nunc-stans/test/test_nuncstans.c:124
#3 0x00007fefc9f2897e in cmocka_run_one_test_or_fixture () from target:/lib64/libcmocka.so.0
#4 0x00007fefc9f292c8 in _cmocka_run_group_tests () from target:/lib64/libcmocka.so.0
#5 0x000055bf49ff1767 in main () at src/nunc-stans/test/test_nuncstans.c:532
Test output:
[root@server 389-ds-base-1.4.0.16.20180925git4f118f4]# cat /var/lib/mock/fedora-28-x86_64/root/builddir/build/BUILD/389-ds-base-1.4.0.16.20180925git4f118f4/test_nuncstans.log
[==========] Running 10 test(s).
[ RUN ] ns_init_test
ns_thrpool_new(): max threads, (4) stacksize (0), event q size (unbounded), work q size (unbounded)
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
[ OK ] ns_init_test
[ RUN ] ns_set_data_test
ns_thrpool_new(): max threads, (4) stacksize (0), event q size (unbounded), work q size (unbounded)
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
[ OK ] ns_set_data_test
[ RUN ] ns_job_done_cb_test
ns_thrpool_new(): max threads, (4) stacksize (0), event q size (unbounded), work q size (unbounded)
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
[ OK ] ns_job_done_cb_test
[ RUN ] ns_job_persist_rearm_ignore_test
ns_thrpool_new(): max threads, (4) stacksize (0), event q size (unbounded), work q size (unbounded)
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!
[ OK ] ns_job_persist_rearm_ignore_test
[ RUN ] ns_job_persist_disarm_test
389-ds-base-1.4.0.16.20180925git4f118f4
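In the pstack output, Thread 1 sits in a join inside ns_thrpool_wait with no deadline, so a worker that never exits stalls the whole build. One possible mitigation on the test side is a bounded join that turns the hang into a test failure. The following is only a sketch under assumptions: join_with_deadline and demo_blocked_worker are hypothetical names, not part of nunc-stans or the existing test, and pthread_timedjoin_np is a glibc extension rather than portable POSIX.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for pthread_timedjoin_np (glibc extension) */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <time.h>

/* Hypothetical helper: join tid, but give up after timeout_sec seconds.
 * Returns 0 if the thread was joined, ETIMEDOUT if it is still running,
 * or another errno value on failure. After ETIMEDOUT the thread is still
 * joinable, so the caller can dump diagnostics and fail the test. */
int join_with_deadline(pthread_t tid, time_t timeout_sec)
{
    struct timespec deadline;

    assert(timeout_sec >= 0);
    /* pthread_timedjoin_np expects an absolute CLOCK_REALTIME deadline. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0) {
        return errno;
    }
    deadline.tv_sec += timeout_sec;
    return pthread_timedjoin_np(tid, NULL, &deadline);
}

/* Hypothetical stand-in for a worker that never sees its shutdown request:
 * it blocks until someone posts the semaphore passed in arg. */
void *demo_blocked_worker(void *arg)
{
    sem_t *gate = arg;

    while (sem_wait(gate) == -1 && errno == EINTR) {
        /* retry if interrupted by a signal */
    }
    return NULL;
}
```

A teardown using something like this could report ETIMEDOUT as a failure and let the remaining tests (and the build) continue, at the cost of leaking the stuck thread for the rest of the test process.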
I hit this again today during the copr build.
I'll have a look at this soon :)
From the traces, it looks like the shutdown event message isn't being handled or acked correctly: the worker is still in pthread_cond_wait while the main thread waits to join it. Has anything changed in NS recently that could have affected the memory fencing?
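For reference, the kind of shutdown handshake being discussed can be sketched as below. This is not the nunc-stans code: struct demo_pool, demo_worker and demo_pool_shutdown are hypothetical names, and the sketch assumes a plain mutex plus condition variable. The point it illustrates is that the shutdown flag is written and read under the same mutex the workers wait on, so a wakeup cannot be lost between a worker's check and its pthread_cond_wait, and no separate fencing is needed.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical pool state; NOT the real ns_thrpool_t layout. */
struct demo_pool {
    pthread_mutex_t lock;
    pthread_cond_t wake;
    bool shutdown; /* guarded by lock */
    int exited;    /* guarded by lock: workers that saw the shutdown */
};

static void *demo_worker(void *arg)
{
    struct demo_pool *p = arg;

    pthread_mutex_lock(&p->lock);
    /* Re-check the predicate in a loop. This covers spurious wakeups and
     * the case where shutdown was requested before this thread waited. */
    while (!p->shutdown) {
        pthread_cond_wait(&p->wake, &p->lock);
    }
    p->exited++;
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Start n workers, request shutdown, join them all.
 * Returns how many workers acknowledged the shutdown, or -1 on error. */
int demo_pool_shutdown(int n)
{
    enum { MAX_WORKERS = 16 };
    struct demo_pool p;
    pthread_t tids[MAX_WORKERS];
    int started = 0;
    int acked;

    if (n < 0 || n > MAX_WORKERS) {
        return -1;
    }
    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.wake, NULL);
    p.shutdown = false;
    p.exited = 0;

    for (int i = 0; i < n; i++) {
        if (pthread_create(&tids[i], NULL, demo_worker, &p) != 0) {
            break;
        }
        started++;
    }

    /* Set the flag and broadcast while holding the lock. The mutex orders
     * the write to p.shutdown against every worker's read of it. */
    pthread_mutex_lock(&p.lock);
    p.shutdown = true;
    pthread_cond_broadcast(&p.wake);
    pthread_mutex_unlock(&p.lock);

    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }

    acked = p.exited; /* safe to read: all workers have been joined */
    pthread_cond_destroy(&p.wake);
    pthread_mutex_destroy(&p.lock);
    return (started == n) ? acked : -1;
}
```

If the real shutdown path sets its flag or signals outside the lock the workers hold around their predicate check, a worker could block after the signal has already been sent, which would match Thread 2 in the pstack output. Whether that is what happens here is an open question, not something this sketch demonstrates.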
Metadata Update from @mreynolds: - Issue set to the milestone: 1.4.1
Got another hang today in 1.4.0.20. @firstyear, did you have a chance to look at this?
When this was raised 6 months ago, I stared at the shutdown code pretty intently hoping to spot a problem, and couldn't find one. I'm curious about the hardware of the machine that hit this issue. Can you send me its lscpu output?
Metadata Update from @vashirov: - Issue close_status updated to: wontfix - Issue status updated to: Closed (was: Open)
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/3021
If you want to receive further updates on the issue, please navigate to the GitHub issue and click the subscribe button.
Thank you for understanding. We apologize for any inconvenience.