#7924 `_CrossProcessLock` is not released on owner kill
Opened a year ago by slev. Modified a year ago

_CrossProcessLock does not check whether its owner is still alive. So if the lock owner process is killed, every other process trying to acquire the lock will wait for 1h (the default lock duration) or until the lock file is deleted manually.
Consider the following synthetic example.

cat test_lock.py

from ipaserver.install import certs
import time

with certs.renewal_lock:
    while True:
        time.sleep(1)  # hold the lock until the process is killed

systemctl stop certmonger

./test_lock.py &
[2] 15308

cat /var/run/ipa/renewal.lock 
locked = 1
owner = test_lock.py[15308]
expire = 20190424220838587266

kill -9 $!

./test_lock.py &
[2] 15332

cat /var/run/ipa/renewal.lock 
locked = 1
owner = test_lock.py[15308]
expire = 20190424220838587266

So the second and every subsequent process will wait up to 1h to acquire the lock.
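
To illustrate the missing owner check, here is a minimal sketch of how a waiting process could detect that the recorded owner is gone. It assumes the `owner = name[pid]` layout shown in the lock file dump above and a POSIX system; `parse_owner_pid` and `owner_is_alive` are hypothetical helper names, not part of `_CrossProcessLock`.

```python
import os
import re

# Hypothetical helpers; not part of ipaserver's _CrossProcessLock.
# Assumption: the owner line looks like "owner = test_lock.py[15308]".
_OWNER_RE = re.compile(
    r"^owner[ \t]*=[ \t]*(?P<name>.*)\[(?P<pid>\d+)\][ \t]*$",
    re.MULTILINE,
)


def parse_owner_pid(lock_text):
    """Return the PID recorded in the 'owner = name[pid]' line, or None."""
    match = _OWNER_RE.search(lock_text)
    return int(match.group("pid")) if match else None


def owner_is_alive(pid):
    """Probe a PID with signal 0, which checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process: the lock looks stale
    except PermissionError:
        return True   # process exists but belongs to another user
    return True
```

A waiter could treat the lock as stale when `owner_is_alive()` returns False. Since PIDs can be reused, a real check would probably also need to compare the process name or start time.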
A real-life example:

sed -i '/with certs\.renewal_lock:/a\        import time;time.sleep(5)' /usr/libexec/certmonger/ipa-server-guard

reqid="$(getcert list -c 'IPA' | grep -m1 'Request ID' | cut -d\' -f2)"
getcert resubmit -i "$reqid" && sleep 4 ; systemctl restart certmonger
getcert list -i "$reqid"

The result is similar: the request status stays 'SUBMITTING' for 1h.
This mainly happens during an upgrade.

I've checked this problem on FreeIPA 4.3.3 and 4.7.2.

The problem is that the admin/user sees the request status ('SUBMITTING') and assumes that something has gone wrong. Having to wait it out or reboot the machine is not normal.

In order to investigate, could you post more information:
- which certificate is stuck in SUBMITTING (a renewal can take a few minutes, especially if Dogtag system certificates all need to be renewed because it requires a restart of Dogtag for each renewed cert)
- who is the lock owner at that time
- what step of the upgrade is causing this situation? /var/log/ipaupgrade.log should display the steps already done

Please take a look at the second reproducer above.

@frenaud, am I wrong?
I understand that the severity is very low, but it can happen and actually does happen.
If this is the expected behavior, then at least some information should be written to the log.
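
As a rough sketch of the kind of log output meant here, a waiter could report who holds the lock and when it expires. This assumes the `expire` value is a timestamp in `%Y%m%d%H%M%S%f` form, which is inferred only from the value `20190424220838587266` in the lock file dump above. `describe_wait` and `log_wait` are hypothetical names, and the timezone of `expire` is not known from this ticket.

```python
import logging
from datetime import datetime

logger = logging.getLogger(__name__)

# Assumption: 'expire' is YYYYmmddHHMMSS followed by microseconds,
# inferred from the value 20190424220838587266 in the lock file dump.
EXPIRE_FORMAT = "%Y%m%d%H%M%S%f"


def describe_wait(owner, expire, now):
    """Build a message saying who holds the lock and when it expires.

    `now` must be in the same (unspecified) timezone as `expire`.
    """
    expires_at = datetime.strptime(expire, EXPIRE_FORMAT)
    remaining = max((expires_at - now).total_seconds(), 0.0)
    return "waiting for lock held by %s; it expires in %.0f s" % (
        owner, remaining)


def log_wait(owner, expire, now=None):
    """Hypothetical hook a waiter could call before each retry."""
    message = describe_wait(owner, expire, now or datetime.now())
    logger.warning(message)
    return message
```

Even without stale-owner detection, a message like this in the log would tell the admin that the 'SUBMITTING' state is a lock wait and how long it can last.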

Metadata Update from @pcech:
- Issue tagged with: Falcon

a year ago
