If a connection needs to be closed, we schedule ns_handle_closure. The problem is that if the connection is still on the active list (refcnt != 0), the event thread yields.
I guess it was done like this because in that case we schedule ns_handle_closure again, so the event thread will be called back immediately; the yield was presumably meant to prevent the event thread from looping and consuming CPU.
The side effect is that the event thread cannot immediately process another event on another connection.
I think we need to reevaluate the need for the yield in the event thread.
PS: I think that if we fail to close a connection (because it is busy) we could also reschedule it with some kind of delay, as sketched below.
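A minimal sketch of that idea, not an actual patch: it assumes the nunc-stans ns_add_timeout_job() and ns_job_get_data() calls and a c_tp thread-pool pointer on the connection, and the 100 ms back-off value is purely illustrative.

```c
/* Sketch only: re-arm the closure as a timer job instead of yielding.
 * Assumes ns_add_timeout_job(), ns_job_get_data() and a c->c_tp
 * thread-pool field; the 100 ms back-off is an arbitrary example. */
static void
ns_handle_closure(struct ns_job_t *job)
{
    Connection *c = (Connection *)ns_job_get_data(job);
    int do_yield;

    /* ... lock c->c_mutex, try the closure, unlock, as today ... */
    do_yield = ns_handle_closure_nomutex(c);

    if (do_yield) {
        /* closure not done - another reference still outstanding.
         * Instead of PR_Sleep(PR_INTERVAL_NO_WAIT), schedule a retry a
         * little later so the event thread returns to its loop and can
         * serve other connections while the reference is released. */
        struct timeval tv = {0, 100 * 1000}; /* retry in ~100 ms */
        ns_add_timeout_job(c->c_tp, &tv, NS_JOB_TIMER,
                           ns_handle_closure, c, NULL);
    }
}
```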
Observed in 1.3.7.5-24. I do not know whether it could be a consequence of Bug 1597530 - Async operations can hang when the server is running nunc-stans.
I just noticed this kind of pstack in many customer cases:
The event thread may yield.
The event thread should not yield; it should process events immediately.
```
#0  0x00007f54e55cf9d7 in sched_yield () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f54e659e50d in PR_Sleep (ticks=0) at ../../../nspr/pr/src/pthreads/ptthread.c:787
#2  0x00007f54e8a7cc89 in work_job_execute (job=0x55682d10ed20) at src/nunc-stans/ns/ns_thrpool.c:291
#3  0x00007f54e8a7dbe3 in event_cb (fd=<optimized out>, event=<optimized out>, arg=<optimized out>) at src/nunc-stans/ns/ns_event_fw_event.c:118
#4  0x00007f54e5acda14 in event_process_active_single_queue (activeq=0x556816474ff0, base=0x5568162c6c80) at event.c:1350
#5  0x00007f54e5acda14 in event_process_active (base=<optimized out>) at event.c:1420
#6  0x00007f54e5acda14 in event_base_loop (base=0x5568162c6c80, flags=flags@entry=1) at event.c:1621
#7  0x00007f54e8a7deae in ns_event_fw_loop (ns_event_fw_ctx=<optimized out>) at src/nunc-stans/ns/ns_event_fw_event.c:308
#8  0x00007f54e8a7cac9 in event_loop_thread_func (arg=0x55681628ba40) at src/nunc-stans/ns/ns_thrpool.c:581
#9  0x00007f54e5f3ddd5 in start_thread (arg=0x7f54cb094700) at pthread_create.c:308
#10 0x00007f54e55eab3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

(gdb) frame 2
(gdb) print *job
func = 0x556814ee4190 <ns_handle_closure>, data = 0x556817ddfd00, job_type = 16, fd = 0x0,
tv = {tv_sec = 0, tv_usec = 0}, signal = 0, ns_event_fw_fd = 0x0,
ns_event_fw_time = 0x55682016fe60, ns_event_fw_sig = 0x0, output_job_type = 16,
state = NS_JOB_NEEDS_DELETE, ns_event_fw_ctx = 0x5568162c6c80,
alloc_event_context = 0x7f54e8a7c600 <alloc_event_context>,
```

```c
static void
ns_handle_closure(struct ns_job_t *job)
{
    ...
    do_yield = ns_handle_closure_nomutex(c);
    ...
    if (do_yield) {
        /* closure not done - another reference still outstanding */
        /* yield thread after unlocking conn mutex */
        PR_Sleep(PR_INTERVAL_NO_WAIT); /* yield to allow other thread to release conn */
    }
    return;
}
```
Metadata Update from @tbordaz:
- Custom field component adjusted to None
- Custom field origin adjusted to None
- Custom field reviewstatus adjusted to None
- Custom field type adjusted to None
- Custom field version adjusted to None
The event thread can be looping and consuming 100% CPU. It consumes CPU even though it always appears to be sleeping, because each time it is processing a different job. Likely the job that fails to remove the connection from the active list dispatches a new ns_handle_closure job, which is then dispatched immediately.
The proof is that taking several pstacks shows a different job each time. The standalone sketch below illustrates the pattern.
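A self-contained illustration of that loop (not 389-ds-base code): a callback that cannot finish, only yields, and is immediately re-dispatched keeps the thread 100% busy, yet every stack sample catches it inside sched_yield(), and each iteration runs as a "new" job.

```c
#include <sched.h>
#include <stdbool.h>

/* Stand-in for "the connection is still on the active list" (refcnt != 0) */
static bool still_referenced(void)
{
    return true;
}

/* Stand-in for ns_handle_closure: it cannot finish, so it just yields */
static void handle_closure(void)
{
    if (still_referenced()) {
        sched_yield(); /* what PR_Sleep(PR_INTERVAL_NO_WAIT) amounts to */
    }
}

int main(void)
{
    /* Stand-in for the event loop: the job is re-armed and dispatched
     * again right away, so the loop never blocks and the core stays at
     * 100% CPU even though every pstack shows the thread "sleeping"
     * in sched_yield(). */
    for (;;) {
        handle_closure();
    }
}
```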
Metadata Update from @mreynolds: - Issue set to the milestone: 1.4.0
Metadata Update from @tbordaz: - Issue assigned to tbordaz
Metadata Update from @tbordaz:
- Assignee reset
- Issue close_status updated to: duplicate
- Issue status updated to: Closed (was: Open)
Duplicate of https://pagure.io/389-ds-base/issue/49815
Metadata Update from @tbordaz: - Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1605554
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/2930
If you want to receive further updates on the issue, please navigate to the github issue and click on the subscribe button.
Thank you for understanding. We apologize for any inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: duplicate)