Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1515759
Original description
Version-Release number of selected component (if applicable): ipa-server-4.5.0-22.el7_4.x86_64
How reproducible: always
Steps to Reproduce:
1. ipa-server-install --setup-dns --forwarder=FORWARDER --ip-address=12.13.14.15 -r TESTRELM.TEST -p Secret123 -a Secret123 -U --allow-zone-overlap
NOTE: 12.13.14.15 is an IP address that is not present on the system.
Actual results:
[1/29]: configuring certificate server instance
ipa.ipaserver.install.cainstance.CAInstance: CRITICAL Failed to configure CA instance: Command '/usr/sbin/pkispawn -s CA -f /tmp/tmpn6lClp' returned non-zero exit status 1
ipa.ipaserver.install.cainstance.CAInstance: CRITICAL See the installation logs and the following files/directories for more information:
ipa.ipaserver.install.cainstance.CAInstance: CRITICAL /var/log/pki/pki-tomcat
[error] RuntimeError: CA configuration failed.
ipa.ipapython.install.cli.install_tool(CompatServerMasterInstall): ERROR CA configuration failed.
ipa.ipapython.install.cli.install_tool(CompatServerMasterInstall): ERROR The ipa-server-install command failed. See /var/log/ipaserver-install.log for more information
Initial reproduction notes and analysis by @frenaud:
Issue reproducible.
In order to reproduce, you need to add the machine IP address to /etc/hosts (the existing one, not the fake one).
In this case, pkispawn fails with:
2017-11-23 09:37:48 pkispawn : INFO ....... executing 'systemctl daemon-reload'
2017-11-23 09:37:48 pkispawn : INFO ....... executing 'systemctl start pki-tomcatd@pki-tomcat.service'
2017-11-23 09:37:49 pkispawn : DEBUG ........... pki_protocol https hostname vm-110.abc.idm.lab.eng.brq.redhat.com port 8443 subsystem ca
2017-11-23 09:39:56 pkispawn : DEBUG ........... No connection - server may still be down
2017-11-23 09:39:56 pkispawn : DEBUG ........... No connection - exception thrown: ('Connection aborted.', error(110, 'Connection timed out'))
2017-11-23 09:39:57 pkispawn : ERROR ....... server failed to restart
2017-11-23 09:39:57 pkispawn : DEBUG ....... Error Type: Exception
2017-11-23 09:39:57 pkispawn : DEBUG ....... Error Message: server failed to restart
2017-11-23 09:39:57 pkispawn : DEBUG ....... File "/usr/sbin/pkispawn", line 533, in main
    scriptlet.spawn(deployer)
  File "/usr/lib/python2.7/site-packages/pki/server/deployment/scriptlets/configuration.py", line 374, in spawn
    raise Exception("server failed to restart")
The code shows that pkispawn checks whether the server is running (https://github.com/dogtagpki/pki/blob/DOGTAG_10_4_BRANCH/base/server/python/pki/server/deployment/pkihelper.py#L1020) by connecting to the URL https://hostname:8443/ca/admin/ca/getStatus. Note that only one connection attempt is made.
Wireshark demonstrates that the fake IP address is used.
I noticed that if the timeout for waiting for the server to come up is raised, for instance to 200s, pkispawn finishes successfully (https://github.com/dogtagpki/pki/blob/DOGTAG_10_4_BRANCH/base/server/python/pki/server/deployment/scriptlets/configuration.py#L369).
So there are probably 2 timeouts that interact here:
- the timeout set in deployer.instance.wait_for_startup(60), which allows a get on https://hostname:8443/ca/admin/ca/getStatus to be performed multiple times (until the timeout is exhausted)
- the timeout used to establish the connection when get(url) is called, probably defined at the system level.
When the first timeout is shorter than the second, get(url) can be performed only once and fails. If the first timeout is longer than the second, get(url) can be performed a second time, and that second attempt succeeds.
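The interaction between the two timeouts can be modelled with a small simulation. This is only a sketch and not the pkispawn code: the function name `simulate_wait_for_startup`, the 1-second retry delay, and the assumption that only the first attempt blocks and fails are all illustrative. The 127-second blocking time comes from the log timestamps in this report (09:37:49 to 09:39:56).

```python
def simulate_wait_for_startup(overall_timeout, connect_timeout,
                              failing_attempts=1, retry_delay=1):
    """Model a status poll loop without real networking or sleeping.

    Assumptions (not taken from pkispawn): the first `failing_attempts`
    calls to get(url) block for `connect_timeout` seconds and then fail,
    every later call succeeds at once, and the loop only starts a new
    attempt while the simulated elapsed time is below `overall_timeout`.
    Returns (succeeded, attempts_made).
    """
    elapsed = 0
    attempts = 0
    while elapsed < overall_timeout:
        attempts += 1
        if attempts > failing_attempts:
            return True, attempts
        # The failed attempt consumed the whole connection timeout.
        elapsed += connect_timeout + retry_delay
    return False, attempts


# Overall timeout 60s, connection attempt blocks about 127s: one try, failure.
print(simulate_wait_for_startup(60, 127))   # (False, 1)
# Overall timeout raised to 200s: a second try fits and succeeds.
print(simulate_wait_for_startup(200, 127))  # (True, 2)
```

In this model, raising the overall timeout only helps because it leaves room for a second attempt after the first one has spent the full connection timeout.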
Metadata Update from @ftweedal: - Custom field component adjusted to None - Custom field feature adjusted to None - Custom field origin adjusted to None - Custom field proposedmilestone adjusted to None - Custom field proposedpriority adjusted to None - Custom field reviewer adjusted to None - Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1515759 - Custom field type adjusted to None - Custom field version adjusted to None
Gerrit reviews:
Metadata Update from @ftweedal: - Custom field proposedmilestone adjusted to 10.5 (was: None)
Metadata Update from @mharmsen: - Issue priority set to: critical - Issue set to the milestone: 10.5
Pushed: - master: 1671d9c3b3b2bdd48fd74c3229c2869e5cfac80c - DOGTAG_10_5_BRANCH: abe1b5b0850115008a29ad54d82a25971293f932
Metadata Update from @ftweedal: - Issue close_status updated to: fixed
Metadata Update from @mharmsen: - Issue set to the milestone: 10.5.6 (was: 10.5)
Metadata Update from @mharmsen: - Issue priority set to: blocker (was: critical)
The patch apparently causes installation with HSM to fail since the timeout is too short. It probably should be implemented as a configurable parameter.
It has been reverted in the following commits:
Metadata Update from @edewata: - Issue set to the milestone: 10.5 (was: 10.5.6) - Issue status updated to: Open (was: Closed)
New gerrit review (take II): https://review.gerrithub.io/#/c/401523/
Pushed to master (132ddb4fcadcaa6eb8cbfc0e1f2a37b8b5654119)
Metadata Update from @mharmsen: - Issue set to the milestone: 10.6.0 (was: 10.5)
Dogtag PKI is moving from Pagure issues to GitHub issues. This means that existing or new issues will be reported and tracked through Dogtag PKI's GitHub Issue tracker.
This issue has been cloned to GitHub and is available here: https://github.com/dogtagpki/pki/issues/3057
If you want to receive further updates on the issue, please navigate to the GitHub issue and click on Subscribe button.
Thank you for understanding, and we apologize for any inconvenience.