#49962 test_nuncstans hangs during the build in ns_job_persist_disarm_test
Closed: wontfix 4 years ago by vashirov. Opened 5 years ago by vashirov.

Issue Description

Today's nightly didn't start because the build never completed. When I logged in to the machine, the build process was waiting on the cmocka tests, and test_nuncstans was hanging:

[root@server 389-ds-base-1.4.0.16.20180925git4f118f4]# strace -p 9193
strace: Process 9193 attached                                                                                   
futex(0x7fefc78229d0, FUTEX_WAIT, 23669, NULL

pstack:

[root@server ~]# pstack 9193            
Thread 3 (Thread 0x7fefc581e700 (LWP 9236)):
#0  0x00007fefc860218f in epoll_wait () from target:/lib64/libc.so.6
#1  0x00007fefc9af3e39 in epoll_dispatch () from target:/lib64/libevent-2.1.so.6
#2  0x00007fefc9ae9b18 in event_base_loop () from target:/lib64/libevent-2.1.so.6
#3  0x00007fefc9d21102 in ns_event_fw_loop (ns_event_fw_ctx=<optimized out>) at src/nunc-stans/ns/ns_event_fw_event.c:308
#4  0x00007fefc9d1fc19 in event_loop_thread_func (arg=0x55bf4ad75f40) at src/nunc-stans/ns/ns_thrpool.c:581
#5  0x00007fefc8f17594 in start_thread () from target:/lib64/libpthread.so.0
#6  0x00007fefc8601e6f in clone () from target:/lib64/libc.so.6
Thread 2 (Thread 0x7fefc7822700 (LWP 9232)):                    
#0  0x00007fefc8f1d51c in pthread_cond_wait@@GLIBC_2.3.2 () from target:/lib64/libpthread.so.0
#1  0x00007fefc9d1ff74 in work_q_wait (tp=0x55bf4ad75f40) at src/nunc-stans/ns/ns_thrpool.c:344
#2  worker_thread_func (arg=<optimized out>) at src/nunc-stans/ns/ns_thrpool.c:396
#3  0x00007fefc8f17594 in start_thread () from target:/lib64/libpthread.so.0
#4  0x00007fefc8601e6f in clone () from target:/lib64/libc.so.6
Thread 1 (Thread 0x7fefca344d00 (LWP 9193)):
#0  0x00007fefc8f18a2d in __pthread_timedjoin_ex () from target:/lib64/libpthread.so.0
#1  0x00007fefc9d20d17 in ns_thrpool_wait (tp=0x55bf4ad75f40) at src/nunc-stans/ns/ns_thrpool.c:1619
#2  0x000055bf49ff1b18 in ns_test_teardown (state=<optimized out>) at src/nunc-stans/test/test_nuncstans.c:124
#3  0x00007fefc9f2897e in cmocka_run_one_test_or_fixture () from target:/lib64/libcmocka.so.0
#4  0x00007fefc9f292c8 in _cmocka_run_group_tests () from target:/lib64/libcmocka.so.0
#5  0x000055bf49ff1767 in main () at src/nunc-stans/test/test_nuncstans.c:532
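
The stack above shows the main thread blocked in a join inside ns_thrpool_wait while a worker is still parked in pthread_cond_wait, so the test never returns and the build waits forever. As a rough sketch only (this is not the nunc-stans code; `demo_join_with_timeout`, `demo_stuck_thread` and `demo_detect_hang` are hypothetical names), a test teardown could use the glibc extension `pthread_timedjoin_np` to fail after a deadline instead of hanging:

```c
#define _GNU_SOURCE /* pthread_timedjoin_np is a glibc extension, not POSIX */
#include <errno.h>
#include <pthread.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: join `tid`, but give up after `seconds`.
 * Returns 0 if the thread exited, ETIMEDOUT if it is still running. */
int demo_join_with_timeout(pthread_t tid, time_t seconds)
{
    struct timespec deadline;

    /* pthread_timedjoin_np takes an absolute CLOCK_REALTIME deadline. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0) {
        return errno;
    }
    deadline.tv_sec += seconds;
    return pthread_timedjoin_np(tid, NULL, &deadline);
}

/* Stand-in for a worker that never sees its shutdown request. */
static void *demo_stuck_thread(void *arg)
{
    (void)arg;
    for (;;) {
        pause(); /* blocks until a signal arrives; also a cancellation point */
    }
}

/* Demonstration: returns the result of the bounded join on a stuck thread. */
int demo_detect_hang(void)
{
    pthread_t tid;
    int rc;

    if (pthread_create(&tid, NULL, demo_stuck_thread, NULL) != 0) {
        return -1;
    }
    rc = demo_join_with_timeout(tid, 1);

    /* Clean up the stuck thread so the demo itself does not leak it. */
    pthread_cancel(tid);
    pthread_join(tid, NULL);
    return rc;
}
```

Under these assumptions `demo_detect_hang()` should return ETIMEDOUT after about one second. A bounded join like this would not fix the underlying bug, but it would turn a silent hang into a test failure that names the stuck thread.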

Test output:

[root@server 389-ds-base-1.4.0.16.20180925git4f118f4]# cat /var/lib/mock/fedora-28-x86_64/root/builddir/build/BUILD/389-ds-base-1.4.0.16.20180925git4f118f4/test_nuncstans.log                                                                                                 
[==========] Running 10 test(s).
[ RUN      ] ns_init_test                                               
ns_thrpool_new():  max threads, (4)
stacksize (0), event q size (unbounded), work q size (unbounded)                      
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!                                                 
worker_thread_func notified to shutdown!                                                            
worker_thread_func shutdown complete!                                                                         
worker_thread_func notified to shutdown!                                                     
worker_thread_func shutdown complete!                                                 
worker_thread_func notified to shutdown!                                     
worker_thread_func shutdown complete!
[       OK ] ns_init_test     
[ RUN      ] ns_set_data_test
ns_thrpool_new():  max threads, (4)                
stacksize (0), event q size (unbounded), work q size (unbounded)
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!                                                                                                                                                                                                                                          
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!       
worker_thread_func notified to shutdown!                            
worker_thread_func shutdown complete!                                           
worker_thread_func notified to shutdown!                                         
worker_thread_func shutdown complete!                                                                                    
[       OK ] ns_set_data_test                                                                              
[ RUN      ] ns_job_done_cb_test                                            
ns_thrpool_new():  max threads, (4)                            
stacksize (0), event q size (unbounded), work q size (unbounded)
worker_thread_func notified to shutdown!                                                      
worker_thread_func shutdown complete!                                                          
worker_thread_func notified to shutdown!                                          
worker_thread_func shutdown complete!                                       
worker_thread_func notified to shutdown!                       
worker_thread_func shutdown complete!       
worker_thread_func notified to shutdown!                                              
worker_thread_func shutdown complete!                                                               
[       OK ] ns_job_done_cb_test                                                                              
[ RUN      ] ns_job_persist_rearm_ignore_test                                                
ns_thrpool_new():  max threads, (4)                                                   
stacksize (0), event q size (unbounded), work q size (unbounded)             
worker_thread_func notified to shutdown!
worker_thread_func shutdown complete!                                                                                      
worker_thread_func notified to shutdown!                                                                                                                                                                                                                                       
worker_thread_func shutdown complete!                                                                                                                                                                                                                                          
worker_thread_func notified to shutdown!                                                                                                                                                                                                                                       
worker_thread_func shutdown complete!                                                                                                  
worker_thread_func notified to shutdown!                                                                                                                                                                                                                                       
worker_thread_func shutdown complete!
[       OK ] ns_job_persist_rearm_ignore_test
[ RUN      ] ns_job_persist_disarm_test                               

Package Version and Platform

389-ds-base-1.4.0.16.20180925git4f118f4


I hit this again today during the copr build.

5 years ago

I'll have a look at this soon :)

Looking at this, it seems the shutdown event message isn't being handled or acked correctly. Has anything changed in nunc-stans (NS) recently that could have affected the memory fencing?
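
For reference, here is a minimal sketch of the kind of shutdown handshake being discussed. It is not the nunc-stans implementation: `struct demo_pool`, `demo_worker`, `demo_request_shutdown` and `demo_run` are hypothetical names. The point is that when the shutdown flag is written and read under the same mutex that guards the condition variable, and the worker re-checks the flag in a loop, the wakeup cannot be lost and no separate memory fence is needed:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a thread pool; not ns_thrpool_t. */
struct demo_pool {
    pthread_mutex_t lock;
    pthread_cond_t cv;
    bool shutdown; /* guarded by lock */
    int exited;    /* workers that have returned; guarded by lock */
};

static void *demo_worker(void *arg)
{
    struct demo_pool *p = arg;

    pthread_mutex_lock(&p->lock);
    /* Re-check the predicate in a loop: this covers spurious wakeups and
     * the case where shutdown was requested before we started waiting. */
    while (!p->shutdown) {
        pthread_cond_wait(&p->cv, &p->lock);
    }
    p->exited++;
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

static void demo_request_shutdown(struct demo_pool *p)
{
    /* Setting the flag under the lock orders it against the worker's check. */
    pthread_mutex_lock(&p->lock);
    p->shutdown = true;
    pthread_cond_broadcast(&p->cv); /* wake every waiting worker */
    pthread_mutex_unlock(&p->lock);
}

/* Start up to 16 workers, request shutdown, join them all,
 * and return how many workers exited. */
int demo_run(int n)
{
    struct demo_pool p = {
        PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false, 0
    };
    pthread_t tids[16];
    int started = 0;

    if (n > 16) {
        n = 16;
    }
    for (int i = 0; i < n; i++) {
        if (pthread_create(&tids[started], NULL, demo_worker, &p) == 0) {
            started++;
        }
    }
    demo_request_shutdown(&p);
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }
    return p.exited;
}
```

If thread creation succeeds, `demo_run(4)` should return 4 regardless of scheduling. A hang like the one reported would be consistent with the flag being set or checked outside the lock, or with a wakeup sent before the worker starts waiting without the predicate being re-checked, but that is speculation until the actual shutdown path is traced.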

Metadata Update from @mreynolds:
- Issue set to the milestone: 1.4.1

5 years ago

Got another hang today in 1.4.0.20. @firstyear, did you have a chance to look at this?

When this was raised 6 months ago, I stared at the shutdown code pretty intently hoping to spot an issue, and couldn't find one. I'm curious about the hardware of the machine that hit this. Can you send me its lscpu output?

Metadata Update from @vashirov:
- Issue close_status updated to: wontfix
- Issue status updated to: Closed (was: Open)

4 years ago

389-ds-base is moving from Pagure to GitHub. This means that new issues and pull requests
will be accepted only in 389-ds-base's GitHub repository.

This issue has been cloned to GitHub and is available here:
- https://github.com/389ds/389-ds-base/issues/3021

If you want to receive further updates on the issue, please navigate to the GitHub issue
and click the subscribe button.

Thank you for understanding. We apologize for any inconvenience.
