#5758 Replica installation crashes on certmonger timeout
Closed: Fixed None Opened 8 years ago by mkubik.

The problem occurs in topology that has at least two replicas.
Installation of first replica succeeds, second replica fails on timeout when waiting for certmonger.

Installation output:

[2016-03-24T02:40:07Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>: Update succeeded
[2016-03-24T02:40:07Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>: 
[2016-03-24T02:40:07Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [29/43]: adding sasl mappings to the directory
[2016-03-24T02:40:09Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [30/43]: updating schema
[2016-03-24T02:40:09Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [31/43]: setting Auto Member configuration
[2016-03-24T02:40:09Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [32/43]: enabling S4U2Proxy delegation
[2016-03-24T02:40:09Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [33/43]: importing CA certificates from LDAP
[2016-03-24T02:40:10Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [34/43]: initializing group membership
[2016-03-24T02:40:10Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [35/43]: adding master entry
[2016-03-24T02:40:10Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [36/43]: initializing domain level
[2016-03-24T02:40:10Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [37/43]: configuring Posix uid/gid generation
[2016-03-24T02:40:10Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [38/43]: adding replication acis
[2016-03-24T02:40:10Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [39/43]: enabling compatibility plugin
[2016-03-24T02:40:12Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [40/43]: activating sidgen plugin
[2016-03-24T02:40:12Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [41/43]: activating extdom plugin
[2016-03-24T02:40:12Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [42/43]: tuning directory server
[2016-03-24T02:40:23Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>:   [43/43]: configuring directory to start on boot
[2016-03-24T02:40:23Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>: Done configuring directory server (dirsrv).
[2016-03-24T02:40:49Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>: Your system may be partly configured.
[2016-03-24T02:40:49Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>: Run /usr/sbin/ipa-server-install --uninstall to clean up.
[2016-03-24T02:40:49Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>: 
[2016-03-24T02:40:49Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>: ipa.ipapython.install.cli.install_tool(Replica): ERROR    request timed out
[2016-03-24T02:40:49Z ipa.ipatests.test_integration.host.Host.vm-244.cmd22] <DEBUG>: ipa.ipapython.install.cli.install_tool(Replica): ERROR    The ipa-replica-install command failed. See /var/log/ipareplica-install.log for more information
ipa: ERROR: Exit code: 1

Relevant part of installation log:
http://fpaste.org/344607/14588273/


From systemd journal on a replica experiencing this issue:

Apr 22 10:36:24 vm-035.idm.lab.eng.brq.redhat.com ipa-submit[4628]: GSSAPI client step 1
Apr 22 10:36:24 vm-035.idm.lab.eng.brq.redhat.com ipa-submit[4628]: GSSAPI client step 1
Apr 22 10:36:24 vm-035.idm.lab.eng.brq.redhat.com ipa-submit[4628]: GSSAPI client step 1
Apr 22 10:36:24 vm-035.idm.lab.eng.brq.redhat.com ipa-submit[4628]: GSSAPI client step 1
Apr 22 10:36:24 vm-035.idm.lab.eng.brq.redhat.com ipa-submit[4628]: GSSAPI client step 2
Apr 22 10:36:36 vm-035.idm.lab.eng.brq.redhat.com ipa-submit[4723]: GSSAPI client step 1
Apr 22 10:36:36 vm-035.idm.lab.eng.brq.redhat.com ipa-submit[4723]: GSSAPI client step 1
Apr 22 10:36:36 vm-035.idm.lab.eng.brq.redhat.com ipa-submit[4723]: GSSAPI client step 1
Apr 22 10:36:36 vm-035.idm.lab.eng.brq.redhat.com ipa-submit[4723]: GSSAPI client step 1
Apr 22 10:36:36 vm-035.idm.lab.eng.brq.redhat.com ipa-submit[4723]: GSSAPI client step 2
Apr 22 10:36:36 vm-035.idm.lab.eng.brq.redhat.com certmonger[4625]: 2016-04-22 10:36:36 [4625] Server at https://<master_fqdn>/ipa/xml failed request, will retry: 4001 (RPC failed at server. CA is not configured

I've run into the same or nearly identical problem trying to setup my first replica. I have installed FreeIPA 4.3.1-1 using the copr source @freeipa / freeipa-4-3. When I attempted to setup the replica of the domain 1 primary It failed on step 26 of 43 with a timeout. This errors in the ipareplica-install.log are almost identical to the already attached file. For some reason my line numbers are off by 1 in the next to last stack trace.

I tried running some of the commands and I think the problem is related to incorrect parameter passing to certmonger. For example, give the following log information:

2016-05-04T20:32:43Z DEBUG   [26/43]: retrieving DS Certificate
2016-05-04T20:32:43Z DEBUG Loading Index file from '/var/lib/ipa/sysrestore/sysrestore.index'
2016-05-04T20:32:43Z DEBUG Starting external process
2016-05-04T20:32:43Z DEBUG args=/usr/bin/certutil -d /etc/dirsrv/slapd-IPA-SUNBIRDDCIM-COM/ -L -n IPA.SUNBIRDDCIM.COM IPA CA -a
2016-05-04T20:32:43Z DEBUG Process finished, return code=255
2016-05-04T20:32:43Z DEBUG stdout=
2016-05-04T20:32:43Z DEBUG stderr=certutil: Could not find cert: IPA.SUNBIRDDCIM.COM IPA CA
: PR_FILE_NOT_FOUND_ERROR: File not found

2016-05-04T20:32:43Z DEBUG Starting external process
2016-05-04T20:32:43Z DEBUG args=/usr/bin/certutil -d /etc/dirsrv/slapd-IPA-SUNBIRDDCIM-COM/ -N -f /etc/dirsrv/slapd-IPA-SUNBIRDDCIM-COM//pwdfile.txt
2016-05-04T20:32:43Z DEBUG Process finished, return code=0
2016-05-04T20:32:43Z DEBUG stdout=
2016-05-04T20:32:43Z DEBUG stderr=
2016-05-04T20:32:43Z DEBUG Starting external process
2016-05-04T20:32:43Z DEBUG args=/usr/bin/certutil -d /etc/dirsrv/slapd-IPA-SUNBIRDDCIM-COM/ -A -n IPA.SUNBIRDDCIM.COM IPA CA -t CT,C,C -a
2016-05-04T20:32:43Z DEBUG Process finished, return code=0
2016-05-04T20:32:43Z DEBUG stdout=
2016-05-04T20:32:43Z DEBUG stderr=
2016-05-04T20:32:43Z DEBUG certmonger request is in state dbus.String(u'NEWLY_ADDED_READING_CERT', variant_level=1)
2016-05-04T20:32:48Z DEBUG certmonger request is in state dbus.String(u'SUBMITTING', variant_level=1)
2016-05-04T20:32:53Z DEBUG certmonger request is in state dbus.String(u'SUBMITTING', variant_level=1)
2016-05-04T20:32:58Z DEBUG Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 447, in start_creation
    run_step(full_msg, method)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 437, in run_step
    method()
  File "/usr/lib/python2.7/site-packages/ipaserver/install/dsinstance.py", line 1237, in __get_ds_cert
    db.request_service_cert(self.nickname, self.principal, self.fqdn)
  File "/usr/lib/python2.7/site-packages/ipaserver/install/certs.py", line 658, in request_service_cert
    certmonger.wait_for_request(reqid, timeout=15)
  File "/usr/lib/python2.7/site-packages/ipapython/certmonger.py", line 538, in wait_for_request
    raise RuntimeError("request timed out")
RuntimeError: request timed out

If I try running the first certutil command from a root shell it will succeed or fail depending on the arguments are passed:

[root@ipa-server1 ~]# /usr/bin/certutil -d /etc/dirsrv/slapd-IPA-SUNBIRDDCIM-COM/ -L -n 'IPA.SUNBIRDDCIM.COM IPA CA' -a
-----BEGIN CERTIFICATE-----
MIIDqzCCApOgAwIBAgIBATANBgkqhkiG9w0BAQsFADA+MRwwGgYDVQQKDBNJUEEu
U1VOQklSRERDSU0uQ09NMR4wHAYDVQQDDBVDZXJ0aWZpY2F0ZSBBdXRob3JpdHkw
HhcNMTYwNDI5MTc0NTIwWhcNMzYwNDI5MTc0NTIwWjA+MRwwGgYDVQQKDBNJUEEu
U1VOQklSRERDSU0uQ09NMR4wHAYDVQQDDBVDZXJ0aWZpY2F0ZSBBdXRob3JpdHkw
ggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDRRxQmF7BqE6wpCwpBhQlK
AlUZl8z9xpawqQhpx7R9DLBSp9gyejl8emlvr6teGGEHUURXECxpICYcb1LFffT6
LPh7+qCcy3PXjWcwnfk+knfF49GgEZMxaUsGZB33UEKiUvet9HzJuWavp7RIek9X
srdSB/ZZWB+QYDTIKfRqSv7YJnG8At/jtndzYZNbBAN1WZLppy/C4T8ejOdKZjoE
SacDnAl4vdcXD4b+qEGRZERdMzFgJouDansWoWH0HCZHeYYm/LsILYdpswitHNIl
R8cQdpL1QDmcW8GwecS8VESD0ch7TqjlEOP4uVninZ9DuXetzoQonP/8F7+xN2HV
AgMBAAGjgbMwgbAwHwYDVR0jBBgwFoAUExkWe5GM2qwsgInlwqOel69sgMMwDwYD
VR0TAQH/BAUwAwEB/zAOBgNVHQ8BAf8EBAMCAcYwHQYDVR0OBBYEFBMZFnuRjNqs
LICJ5cKjnpevbIDDME0GCCsGAQUFBwEBBEEwPzA9BggrBgEFBQcwAYYxaHR0cDov
L2lwYS1zZXJ2ZXIwLmlwYS5zdW5iaXJkZGNpbS5jb206ODAvY2Evb2NzcDANBgkq
hkiG9w0BAQsFAAOCAQEAw9WRoVN6okADmO7vxO5+htGFYAO8Gj5LPCn6F+VEAcoF
zK+8tO4aVPthSpYbSFW3a3oqZM8RtDBJX3FKUaGWoOonh4elhTaBfUzfLQsEliEm
sM9zTiGuarW7hOyF6NKo/02f5gB/JqHIyelL0Gspzd63c+Xglq/CEihXI2uyoU77
Q47fauA2CAr6VxH0+pMvjiVntE5tlQVC1DEQZeFp+3wPqSFId6B8RNTVe//U3RI8
dxQQNpbWpWzjVsBxmRD3hDk7fVvBBIbhj9O2ebY+kPy4P1cKTGl5YVXyq1OFQupy
kvO5e3KZu2u22LY1gR9WmXr2kEzg92K3WWlO/Cw3AQ==
-----END CERTIFICATE-----
[root@ipa-server1 ~]# /usr/bin/certutil -d /etc/dirsrv/slapd-IPA-SUNBIRDDCIM-COM/ -L -n IPA.SUNBIRDDCIM.COM IPA CA -a
certutil: Could not find cert: IPA.SUNBIRDDCIM.COM
: PR_FILE_NOT_FOUND_ERROR: File not found

By using single quotes to ensure the shell passes IPA.SUNBIRDDCIM.COM IPA CA as a single argument the command will complete. If I don't use the single quotes then the command fails as the quoted argument is turned into three arguments.

Well I did some additional digging. It looks as if the scripts were configured for a wait time of just 15 seconds when it was taking about 25 to 30 seconds in my environment. See the following for the changes I made to correct the issue in my environment.

freeipa$ git diff
diff --git a/ipapython/ipautil.py b/ipapython/ipautil.py
index 4597f1c..8a596ab 100644
--- a/ipapython/ipautil.py
+++ b/ipapython/ipautil.py
@@ -394,7 +394,7 @@ def run(args, stdin=None, raiseonerr=True, nolog=(), env=None,
         # to execute the requested command.
         args[0:0] = [paths.BIN_TIMEOUT, str(timeout)]

-    arg_string = nolog_replace(' '.join(_log_arg(a) for a in args), nolog)
+    arg_string = nolog_replace(' '.join(_log_arg(shell_quote(a)) for a in args), nolog)
     root_logger.debug('Starting external process')
     root_logger.debug('args=%s' % arg_string)

diff --git a/ipaserver/install/certs.py b/ipaserver/install/certs.py
index 76f6275..2d6ca09 100644
--- a/ipaserver/install/certs.py
+++ b/ipaserver/install/certs.py
@@ -654,7 +654,7 @@ class CertDB(object):
                                         subject=host,
                                         passwd_fname=self.passwd_fname)
         # Now wait for the cert to appear. Check three times then abort
-        certmonger.wait_for_request(reqid, timeout=15)
+        certmonger.wait_for_request(reqid)


 class _CrossProcessLock(object):

master:

  • 0ba9e72 Increased certmonger timeout

ipa-4-3:

  • 084340b Increased certmonger timeout

Metadata Update from @mkubik:
- Issue assigned to someone
- Issue set to the milestone: FreeIPA 4.3.2

7 years ago

Login to comment on this ticket.

Metadata