Sometimes installer cannot determine if dogtag is running, because there is no timeout on socket and if dogtag doesn't reply it stucks forever.
Log:
tail /var/log/ipaserver-kra-install.log 2017-03-15T05:16:21Z DEBUG Starting external process 2017-03-15T05:16:21Z DEBUG args=/bin/systemctl is-active pki-tomcatd@pki-tomcat.service 2017-03-15T05:16:21Z DEBUG Process finished, return code=0 2017-03-15T05:16:21Z DEBUG stdout=active 2017-03-15T05:16:21Z DEBUG stderr= 2017-03-15T05:16:21Z DEBUG wait_for_open_ports: localhost [8080, 8443] timeout 300 2017-03-15T05:16:23Z DEBUG Waiting until the CA is running 2017-03-15T05:16:23Z DEBUG request POST http://vm-239.example.com:8080/ca/admin/ca/getStatus 2017-03-15T05:16:23Z DEBUG request body ''
Traceback:
Traceback (most recent call first): File "/usr/lib64/python2.7/socket.py", line 480, in readline data = self._sock.recv(self._rbufsize) File "/usr/lib64/python2.7/httplib.py", line 394, in _read_status line = self.fp.readline(_MAXLINE + 1) File "/usr/lib64/python2.7/httplib.py", line 438, in begin version, status, reason = self._read_status() File "/usr/lib64/python2.7/httplib.py", line 1121, in getresponse response.begin() File "/usr/lib/python2.7/site-packages/ipapython/dogtag.py", line 200, in _httplib_request res = conn.getresponse() File "/usr/lib/python2.7/site-packages/ipapython/dogtag.py", line 170, in http_request 'http', host, port, url, httplib.HTTPConnection, body) File "/usr/lib/python2.7/site-packages/ipapython/dogtag.py", line 117, in ca_status ca_host, 8080, '/ca/admin/ca/getStatus') File "/usr/lib/python2.7/site-packages/ipaplatform/redhat/services.py", line 203, in wait_until_running status = dogtag.ca_status(api.env.host) File "/usr/lib/python2.7/site-packages/ipaplatform/redhat/services.py", line 224, in restart self.wait_until_running() File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 349, in restart self.service.restart(instance_name, capture_output=capture_output, wait=wait) File "/usr/lib/python2.7/site-packages/ipaserver/install/dogtaginstance.py", line 185, in restart_instance self.restart('pki-tomcat') File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 439, in run_step method() File "/usr/lib/python2.7/site-packages/ipaserver/install/service.py", line 449, in start_creation run_step(full_msg, method) File "/usr/lib/python2.7/site-packages/ipaserver/install/krainstance.py", line 126, in configure_instance self.start_creation(runtime=126) File "/usr/lib/python2.7/site-packages/ipaserver/install/kra.py", line 67, in install options.dm_password, subject_base=subject) File "/usr/lib/python2.7/site-packages/ipaserver/install/ipa_kra_install.py", line 225, in run kra.install(api, config, self.options) File "/usr/lib/python2.7/site-packages/ipapython/admintool.py", line 172, in execute return_value = self.run() File "/usr/lib/python2.7/site-packages/ipapython/admintool.py", line 149, in main return command.execute() File "/usr/lib/python2.7/site-packages/ipapython/admintool.py", line 125, in run_cli sys.exit(cls.main(sys.argv)) File "/usr/sbin/ipa-kra-install", line 23, in <module> KRAInstall.run_cli()
NSSConnection did not support timeouts but since we're using httplib.HTTP[S]Connection now, the timeouts can be added.
NSSConnection
httplib.HTTP[S]Connection
Metadata Update from @pvoborni: - Issue priority set to: critical - Issue set to the milestone: FreeIPA 4.5.1
I have the same issue on replica-install, it hangs every time. So this is blocking me from creating a replica on 4.4
+1
Metadata Update from @stlaz: - Custom field external_tracker adjusted to https://pagure.io/dogtagpki/issue/2646
Seems to be this issue:
https://pagure.io/389-ds-base/issue/49209
It's hanging in ?
===================================================================== [...] Done configuring directory server (dirsrv). Configuring certificate server (pki-tomcatd). Estimated time: 3 minutes 30 seconds [1/27]: creating certificate server user [2/27]: configuring certificate server instance [3/27]: stopping certificate server instance to update CS.cfg [4/27]: backing up CS.cfg [5/27]: disabling nonces [6/27]: set up CRL publishing [7/27]: enable PKIX certificate path discovery and validation [8/27]: starting certificate server instance =================================================================
We should take a pstack of ns-slapd process and confirm we see these sort of stacktraces:
Thread 32 (Thread 0x7f3e70ff9700 (LWP 3717)): #0 0x00007f3e9784e6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f3e97ea5473 in PR_EnterMonitor () from /lib64/libnspr4.so #2 0x00007f3e8c7e0321 in replica_get_generation () from /usr/lib64/dirsrv/plugins/libreplication-plugin.so #3 0x00007f3e8c7d9b90 in copy_operation_parameters () from /usr/lib64/dirsrv/plugins/libreplication-plugin.so #4 0x00007f3e8c7db1ea in multimaster_preop_modify () from /usr/lib64/dirsrv/plugins/libreplication-plugin.so #5 0x00007f3e99cdcdb8 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0 #6 0x00007f3e99cdd043 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0 #7 0x00007f3e99cca359 in op_shared_modify () from /usr/lib64/dirsrv/libslapd.so.0 #8 0x00007f3e99ccb91f in do_modify () from /usr/lib64/dirsrv/libslapd.so.0 #9 0x00007f3e9a1b49d0 in connection_threadmain () #10 0x00007f3e97eaa9bb in _pt_root () from /lib64/libnspr4.so #11 0x00007f3e9784adc5 in start_thread () from /lib64/libpthread.so.0 #12 0x00007f3e9757973d in clone () from /lib64/libc.so.6 Thread 31 (Thread 0x7f3e6bfff700 (LWP 3718)): #0 0x00007f3e9784e6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f3e97ea5473 in PR_EnterMonitor () from /lib64/libnspr4.so #2 0x00007f3e8c7dfe50 in replica_is_updatedn () from /usr/lib64/dirsrv/plugins/libreplication-plugin.so #3 0x00007f3e8c7c6e2a in multimaster_extop_StartNSDS50ReplicationRequest () from /usr/lib64/dirsrv/plugins/libreplication-plugin.so #4 0x00007f3e9a1bbec4 in do_extended () #5 0x00007f3e9a1b4aca in connection_threadmain () #6 0x00007f3e97eaa9bb in _pt_root () from /lib64/libnspr4.so #7 0x00007f3e9784adc5 in start_thread () from /lib64/libpthread.so.0 #8 0x00007f3e9757973d in clone () from /lib64/libc.so.6 Thread 38 (Thread 0x7f3e80e9b700 (LWP 3710)): #0 0x00007f3e9784e6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00007f3e97ea5473 in PR_EnterMonitor () from /lib64/libnspr4.so #2 0x00007f3e8c7e2e58 in replica_update_state () from /usr/lib64/dirsrv/plugins/libreplication-plugin.so #3 0x00007f3e99ca2e9a in eq_loop () from /usr/lib64/dirsrv/libslapd.so.0 #4 0x00007f3e97eaa9bb in _pt_root () from /lib64/libnspr4.so #5 0x00007f3e9784adc5 in start_thread () from /lib64/libpthread.so.0 #6 0x00007f3e9757973d in clone () from /lib64/libc.so.6
If it confirms, we have already a patch and test packages to deliver (389-ds-base) to fix this.
Approximately when will the patch land in the code? This issue blocks large part of freeipa integration tests execution during tests with replicas.
Metadata Update from @mbasti: - Custom field on_review adjusted to https://github.com/freeipa/freeipa/pull/753
@mkubik It might be better to ask at https://pagure.io/dogtagpki/issue/2646, we can't do anything to unblock you here.
edit: except for maybe failing sooner instead of endlessly
https://github.com/freeipa/freeipa/pull/753 does exactly that
Metadata Update from @pvoborni: - Issue assigned to mbasti
master:
ipa-4-5:
Metadata Update from @mbasti: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
Metadata Update from @stlaz: - Issue status updated to: Open (was: Closed)
The installation will get stuck at a later step, during migration of cert profiles to LDAP. This is because we only log critical failure when the CA does not start in 5 minutes but don't actually fail. This should change.
Also, please, keep this ticket as a tracker to the Dogtag issue.
Metadata Update from @stlaz: - Custom field on_review adjusted to https://github.com/freeipa/freeipa/pull/763 (was: https://github.com/freeipa/freeipa/pull/753)
Metadata Update from @mbasti: - Issue tagged with: tracker
Is there a way to avoid this error to install the ca in the replica?
Metadata Update from @mbasti: - Issue set to the milestone: FreeIPA 4.5.2 (was: FreeIPA 4.5.1)
FreeIPA 4.5.1 has been released, moving to FreeIPA 4.5.2 milestone
Metadata Update from @tkrizek: - Issue set to the milestone: FreeIPA 4.5.3 (was: FreeIPA 4.5.2)
Metadata Update from @mbasti: - Assignee reset
Metadata Update from @tkrizek: - Issue set to the milestone: FreeIPA 4.5.4 (was: FreeIPA 4.5.3)
Metadata Update from @tkrizek: - Issue set to the milestone: FreeIPA 4.5.5 (was: FreeIPA 4.5.4)
The dogtag issue was marked as a duplicate of https://pagure.io/dogtagpki/issue/2644 which is now fixed. Marking this ticket as fixed.
Metadata Update from @rcritten: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
Log in to comment on this ticket.