#9123 Random nightly test failure in test_ipahealthcheck.py::TestIpaHealthCheck::test_ipa_healthcheck_expiring
Closed: fixed 2 years ago by frenaud. Opened 2 years ago by frenaud.

The nightly test test_ipahealthcheck.py::TestIpaHealthCheck::test_ipa_healthcheck_expiring is randomly failing in its cleanup phase, see for instance the following logs and report:

        finally:
            # Uninstall the master here so that the certs don't try
            # to renew after the CA is running again.
>           tasks.uninstall_master(self.master)
E           subprocess.CalledProcessError: Command '['ipa-server-install', '--uninstall', '-U', '--ignore-topology-disconnect', '--ignore-last-of-role']' returned non-zero exit status 1.

pytest_ipa/integration/host.py:202: CalledProcessError
----------------------------- Captured stderr call -----------------------------
ipa: ERROR: stderr: Failed to delete master: Operations error: 
org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
The ipa-server-install command failed. See /var/log/ipaserver-uninstall.log for more information

The test moves the date into the future, checks that certmonger correctly warns about certificates about to expire, and then uninstalls the server.
The issue is similar to https://pagure.io/freeipa/issue/8506 and can be fixed with the same strategy
(quote from @rcritten's commit message for the #8506 fix):

There is some contention between certmonger starting during the
uninstallation process in order to stop the tracking and activity
going on within certmonger helpers.

As near as I can tell certmonger is not running, then IPA is
stopped in order to uninstall, then certmonger is started to stop
the tracking. certmonger checks cert status on startup but since
IPA isn't running it can't get a host ticket. During this time any
request over DBus may time out, causing a test to fail when we're
just trying to clean up.

The proposed fix doesn't work: even when the uninstall gets through the untracking of certs, it fails a bit later while restoring the original IPA CA helper (in httpinstance.py#_539). I'm wondering whether the test should simply restart pki and wait for the renewal to complete.

master:

  • 52ec9cc ipatests: remove certmonger tracking before uninstall
  • e32bfd4 ipatests: Fix a call to run_command with wildcard

ipa-4-9:

  • 12785a3 ipatests: remove certmonger tracking before uninstall
  • 85b2c81 ipatests: Fix a call to run_command with wildcard
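
Going only by the commit titles above, the cleanup can be sketched roughly as follows: stop certmonger, delete its tracking requests, and only then uninstall. This is a minimal illustration, not the code from the commits. The helper name `cleanup_before_uninstall`, its parameters, and the requests directory path are assumptions made for the example.

```python
from typing import Callable, Sequence, Union

Command = Union[str, Sequence[str]]

# Assumption: certmonger keeps one file per tracking request in this directory.
CERTMONGER_REQUESTS_DIR = "/var/lib/certmonger/requests/"


def cleanup_before_uninstall(
    run_command: Callable[[Command], object],
    uninstall: Callable[[], object],
    requests_dir: str = CERTMONGER_REQUESTS_DIR,
) -> None:
    """Drop all certmonger tracking, then run the uninstaller.

    Hypothetical helper: run_command and uninstall are injected so the
    ordering can be exercised without a real host.
    """
    # Stop certmonger so it is not busy checking certificate status
    # while the tracking requests are being removed.
    run_command(["systemctl", "stop", "certmonger"])
    # A wildcard is only expanded when the command goes through a shell,
    # so this command is passed as one string, not as an argument list.
    run_command("rm -fv " + requests_dir + "*")
    # With nothing left to track, the uninstaller has no requests to stop.
    uninstall()
```

In the test's `finally:` block this would presumably be wired up as `cleanup_before_uninstall(self.master.run_command, lambda: tasks.uninstall_master(self.master))`, assuming `run_command` accepts both a string and a list, as the title of the second commit suggests.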

Metadata Update from @frenaud:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

2 years ago

Metadata Update from @frenaud:
- Custom field on_review adjusted to https://github.com/freeipa/freeipa/pull/6202

2 years ago
