#8506 Nightly failure in ipa-server-install --uninstall: org.freedesktop.DBus.Error.NoReply
Closed: fixed 2 years ago by frenaud. Opened 3 years ago by frenaud.

Issue

The nightly tests are failing on rawhide during ipa-server-install --uninstall. See PR #416, for instance the automember test (report, logs):

RUN ['ipa-server-install', '--uninstall', '-U', '--ignore-topology-disconnect', '--ignore-last-of-role']
Updating DNS system records
unable to resolve host name master.ipa.test. to IP address, ipa-ca DNS record will be incomplete
Forcing removal of master.ipa.test
Ignoring topology connectivity errors.
------------------------------------
Deleted IPA server "master.ipa.test"
------------------------------------
Shutting down all IPA services
Unconfiguring CA
org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
The ipa-server-install command failed. See /var/log/ipaserver-uninstall.log for more information
Exit code: 1

Metadata Update from @rcritten:
- Issue close_status updated to: duplicate
- Issue status updated to: Closed (was: Open)

3 years ago

Metadata Update from @frenaud:
- Issue status updated to: Open (was: Closed)

3 years ago

The issue is still happening, see for instance PR #496 with the test test_caless_TestReplicaCALessToCAFull (report, logs):

[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] RUN ['ipa-server-install', '--uninstall', '-U', '--ignore-topology-disconnect', '--ignore-last-of-role']
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] Updating DNS system records
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] Forcing removal of replica0.ipa.test
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] Ignoring topology connectivity errors.
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] Failed to cleanup replica0.ipa.test DNS entries: no matching entry found
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] You may need to manually remove them from the tree
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] --------------------------------------
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] Deleted IPA server "replica0.ipa.test"
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] --------------------------------------
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] Shutting down all IPA services
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] Unconfiguring CA
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] org.freedesktop.DBus.Error.NoReply: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] The ipa-server-install command failed. See /var/log/ipaserver-uninstall.log for more information
[ipatests.pytest_ipa.integration.host.Host.replica0.cmd106] Exit code: 1

Similar issue observed in PR 526 with test test_integration/test_advise.py::TestAdvice::test_advice_config_server_for_smart_card_auth (logs) and test_cert (logs).

Also on master branch with selinux enabled: PR #536

I have a duplicate of this bug in #8586. I have a VM in the state left by the failed uninstall; it's currently suspended (not powered off). If anyone wants to use it to debug, let me know.

The replica install fails here while trying to start tracking a cert:

http://freeipa-org-pr-ci.s3-website.eu-central-1.amazonaws.com/jobs/73433c76-243f-11eb-afef-fa163e46012f/test_integration-test_cert.py-TestCertmongerInterruption-install/

What's strange is I see almost no certmonger output at all in the log. IPA sets it to -d2 which should be somewhat chatty. This leads me to believe that the problem is in the installer trying to make the D-Bus request to certmonger and that timing out. It doesn't seem to have gotten to certmonger at all.

Similarly the related test uninstall (separate directory) is failing to start certmonger because it cannot connect to D-Bus. The uninstaller verifies that the dbus service is active immediately prior to this.

My theory is that dbus has gone out to lunch.
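The distinction above (the dbus service reports active, yet a request never gets a reply) can be checked with a probe that has its own deadline. The following is only a sketch and not part of the IPA code: the helper name probe and the choice of probe command are assumptions made for illustration.

```python
import subprocess


def probe(cmd, timeout):
    """Run cmd and classify the outcome as 'ok', 'failed' or 'timeout'.

    A service can be reported as active and still never answer, so the
    probe enforces its own deadline instead of trusting the unit state.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run() before this is raised.
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"
```

On a host with systemd, something like probe(["busctl", "--system", "list"], timeout=10) could serve as the probe command (an assumption, not what the installer does). A "timeout" result next to an active dbus unit would be consistent with the theory that the bus itself has stopped answering.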

Similar issue observed in [testing_master_pki] Nightly PR #535 (report).

This issue is happening very regularly and needs to be investigated.

The problem is now blocking https://pagure.io/freeipa/issue/8544 . The PR depends on a new DS feature that is only available in the most recent update in Fedora 33.

Metadata Update from @cheimes:
- Custom field blocking adjusted to 8544
- Issue marked as blocking: #8544

3 years ago

So it reproducibly fails? I've never been able to reproduce it outside of PR-CI myself.

It reproducibly fails on PR-CI...

Failure also observed in PR 680: test_cert_fix, and also test_commands, test_forced_client_enrolment and test_pwpolicy.

master:

  • 71047f6 Remove the option stop_certmonger from stop_tracking_*

ipa-4-9:

  • 9872610 Remove the option stop_certmonger from stop_tracking_*

Metadata Update from @abbra:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

3 years ago

Similar failure observed in PR 738 (logs).

Issue still present in [testing_master_pki] Nightly PR #762, test_ipahealthcheck (logs).

Metadata Update from @rcritten:
- Custom field affects_doc adjusted to on
- Custom field knownissue adjusted to on
- Issue status updated to: Open (was: Closed)

2 years ago

Metadata Update from @rcritten:
- Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1930038

2 years ago

Metadata Update from @rcritten:
- Issue assigned to rcritten

2 years ago

master:

  • fb58b76 Uninstall without starting the CA in cert expiration test
  • 8c93e2f Increase timeout for TestIpaHealthCheck to 5400s

ipa-4-9:

  • b70e30d Uninstall without starting the CA in cert expiration test
  • d15e577 Increase timeout for TestIpaHealthCheck to 5400s

Metadata Update from @frenaud:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

2 years ago

Issue resurfaced in test_integration/test_ipa_cert_fix.py::TestIpaCertFix::test_missing_csr, [testing_ipa-4.9_latest_selinux] Nightly PR #965 , report

It's the same symptom but a different broken test. Tests that move time around are problematic for certmonger, since it will aggressively try to renew certificates. Certmonger also needs to be running during uninstallation so that the certificates can be untracked. In this case it is trying to do that, but IPA isn't running, so it's getting blocked up long enough that the D-Bus timeout is triggered.

One fix, since we don't care at all about the certs, would be to rm -f paths.CERTMONGER_REQUESTS_DIR/* while ipa and certmonger are shut down. Then there is nothing to do.
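A minimal sketch of that cleanup, assuming it runs on the host itself while ipa and certmonger are stopped. The function name clear_tracking_requests is hypothetical, and this is not the code from the commits listed below.

```python
import glob
import os


def clear_tracking_requests(requests_dir):
    """Delete every regular file directly under requests_dir.

    Returns the sorted list of removed paths.
    """
    removed = []
    # Expand the wildcard here: an argument list such as
    # ['rm', '-f', requests_dir + '/*'] is passed to rm literally
    # when no shell is involved, so nothing would be deleted.
    for path in sorted(glob.glob(os.path.join(requests_dir, "*"))):
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

The directory would be whatever paths.CERTMONGER_REQUESTS_DIR resolves to on the platform (/var/lib/certmonger/requests is assumed here). With no request files left, certmonger has nothing to renew or untrack when it is started again for the uninstall.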

master:

  • 46ccf00 ipatests: Remove certmonger tracking before uninstall in cert tests

ipa-4-9:

  • cc2348a ipatests: Remove certmonger tracking before uninstall in cert tests

master:

  • 52ec9cc ipatests: remove certmonger tracking before uninstall
  • e32bfd4 ipatests: Fix a call to run_command with wildcard

ipa-4-9:

  • 12785a3 ipatests: remove certmonger tracking before uninstall
  • 85b2c81 ipatests: Fix a call to run_command with wildcard
