testing_master_latest Nightly PR #305 failed in test_cert. logs
def wait_for_request(host, request_id, timeout=120): for _i in range(0, timeout, 5): result = host.run_command( "getcert list -i %s | grep status: | awk '{ print $2 }'" % request_id ) state = result.stdout_text.strip() logger.info("certmonger request is in state %s", state) if state in ('CA_REJECTED', 'CA_UNREACHABLE', 'CA_UNCONFIGURED', 'NEED_GUIDANCE', 'NEED_CA', 'MONITORING'): break time.sleep(5) else: > raise RuntimeError("request timed out") E RuntimeError: request timed out
INFO ipatests.pytest_ipa.integration.host.Host.master.IPAOpenSSHTransport:transport.py:391 RUN getcert list -i test-request | grep status: | awk '{ print $2 }' DEBUG ipatests.pytest_ipa.integration.host.Host.master.cmd128:transport.py:513 RUN getcert list -i test-request | grep status: | awk '{ print $2 }' DEBUG ipatests.pytest_ipa.integration.host.Host.master.cmd128:transport.py:557 SUBMITTING DEBUG ipatests.pytest_ipa.integration.host.Host.master.cmd128:transport.py:217 Exit code: 0 INFO ipatests.pytest_ipa.integration.tasks:tasks.py:2233 certmonger request is in state SUBMITTING INFO ipatests.pytest_ipa.integration.host.Host.master.IPAOpenSSHTransport:transport.py:391 RUN getcert list -i test-request | grep status: | awk '{ print $2 }' DEBUG ipatests.pytest_ipa.integration.host.Host.master.cmd129:transport.py:513 RUN getcert list -i test-request | grep status: | awk '{ print $2 }' DEBUG ipatests.pytest_ipa.integration.host.Host.master.cmd129:transport.py:557 SUBMITTING DEBUG ipatests.pytest_ipa.integration.host.Host.master.cmd129:transport.py:217 Exit code: 0 INFO ipatests.pytest_ipa.integration.tasks:tasks.py:2233 certmonger request is in state SUBMITTING INFO ipatests.pytest_ipa.integration.host.Host.master.IPAOpenSSHTransport:transport.py:391 RUN getcert list -i test-request | grep status: | awk '{ print $2 }' DEBUG ipatests.pytest_ipa.integration.host.Host.master.cmd130:transport.py:513 RUN getcert list -i test-request | grep status: | awk '{ print $2 }' DEBUG ipatests.pytest_ipa.integration.host.Host.master.cmd130:transport.py:557 SUBMITTING DEBUG ipatests.pytest_ipa.integration.host.Host.master.cmd130:transport.py:217 Exit code: 0 INFO ipatests.pytest_ipa.integration.tasks:tasks.py:2233 certmonger request is in state SUBMITTING
Also happened in [testing_master_latest] in PR #317, see logs.
The test creates the subCA [Thu Jul 30 01:05:38.750262 2020] [wsgi:error] [pid 23852:tid 140053211711232] [remote 192.168.122.104:46110] ipa: INFO: [jsonserver_session] admin@IPA.TEST: caacl_add_ca/1('hosts_services_caIPAserviceCert', version='2.239', ca=('mysubca',)): SUCCESS
then certmonger makes the request: Jul 30 01:05:39 master.ipa.test certmonger[24999]: 2020-07-30 01:05:39 [24999] Running enrollment helper "/usr/libexec/certmonger/ipa-server-guard".
and IPA never gets the request.
This suggest that ipa-server-guard couldn't get the lock.
This is also reported with each getcert execution, not sure if it is related: Jul 30 01:05:44 master.ipa.test getcert[25013]: usr/lib/api/apiutil.c Could not open /run/lock/opencryptoki/LCK..APIlock
Metadata Update from @rcritten: - Issue assigned to rcritten
@frenaud did some investigation in this, or another related ticket, and determined that indeed the lock is getting hung up because certmonger is in some cases issuing a KILL to make child processes go away!?
The way to duplicate this behavior is:
while [ true ]; do getcert modify-ca -c IPA -e "/usr/libexec/certmonger/ipa-server-guard /usr/libexec/certmonger/ipa-submit"; done
If you start certmonger with a strace -e signal you'll eventually see a ton of KILLs.
Interestingly in certmonger there is a proper waidpid() immediately after the SIGKILL call. The side effect of removing the kill is that operations are actually done more serially because of the lock, so therefore much slower. certmonger runs in tight loop so any one call can affect others. Try running the above while loop then try getcert list to see what I mean.
getcert list
I'm going to propose some changes to ipa-server-guard that will reduce its execution time by half for unsupported operations and by about 25% otherwise by removing unnecessary imports (six, print_statement) and moving others to be JIT (certs and traceback).
master:
ipa-4-8:
Metadata Update from @abbra: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
Login to comment on this ticket.