The ipactl status command does more than just reporting the status of all IPA services. It also attempts to restart failed services. Since it sometimes takes a couple of seconds until a service process fails, the status command does not report broken services. I noticed the problem while I was looking into #7240.
ipactl status
ipa-dnskeysyncd
# date; ipactl status; date; systemctl status ipa-dnskeysyncd; date; sleep 5; systemctl status ipa-dnskeysyncd; date Tue Nov 7 13:37:53 CET 2017 Directory Service: RUNNING krb5kdc Service: RUNNING kadmin Service: RUNNING named Service: RUNNING httpd Service: RUNNING ipa-custodia Service: RUNNING ntpd Service: RUNNING pki-tomcatd Service: RUNNING ipa-otpd Service: RUNNING ipa-dnskeysyncd Service: RUNNING ipa: INFO: The ipactl command was successful Tue Nov 7 13:38:47 CET 2017 ● ipa-dnskeysyncd.service - IPA key daemon Loaded: loaded (/usr/lib/systemd/system/ipa-dnskeysyncd.service; disabled; vendor preset: disabled) Active: active (running) since Tue 2017-11-07 13:38:47 CET; 387ms ago Main PID: 72303 (ipa-dnskeysyncd) Tasks: 1 (limit: 19660) CGroup: /system.slice/ipa-dnskeysyncd.service └─72303 /usr/bin/python2 /usr/libexec/ipa/ipa-dnskeysyncd Nov 07 13:38:47 vm-193.abc.idm.lab.eng.brq.redhat.com systemd[1]: Started IPA key daemon. Tue Nov 7 13:38:47 CET 2017 ● ipa-dnskeysyncd.service - IPA key daemon Loaded: loaded (/usr/lib/systemd/system/ipa-dnskeysyncd.service; disabled; vendor preset: disabled) Active: activating (auto-restart) (Result: exit-code) since Tue 2017-11-07 13:38:51 CET; 1s ago Process: 72303 ExecStart=/usr/libexec/ipa/ipa-dnskeysyncd (code=exited, status=1/FAILURE) Main PID: 72303 (code=exited, status=1/FAILURE) Nov 07 13:38:51 vm-193.abc.idm.lab.eng.brq.redhat.com systemd[1]: ipa-dnskeysyncd.service: Failed with result 'exit-code'. Tue Nov 7 13:38:52 CET 2017
I don't think that ipactl status starts or restarts a service. Rather it waits for a service to start when status is 'activating'
in ipactl:
for svc in svc_list: svchandle = services.service(svc, api=api) try: if svchandle.is_running():
in services.py
def is_running(self, instance_name="", wait=True): instance = self.service_instance(instance_name, 'is-active') while True: try: result = ipautil.run( [paths.SYSTEMCTL, "is-active", instance], capture_output=True ) except ipautil.CalledProcessError as e: if e.returncode == 3 and 'activating' in str(e.output): time.sleep(SERVICE_POLL_INTERVAL) continue return False else: # activating if result.returncode == 3 and 'activating' in result.output: time.sleep(SERVICE_POLL_INTERVAL) continue # active if result.returncode == 0: return True # not active return False
/usr/lib/systemd/system/ipa-dnskeysyncd.service shows that the service ipa-dnskeysyncd will be restarted automatically on failure: Restart=on-failure
Sometimes systemctl is-active returns "active" before the service crashes (after numerous calls to is-active), and this allows ipactl status to exit without noticing any error, but after a long delay. We could add a maximum number of retries to exit the loop (this would ensure that ipactl status does not block for too long) but this would not fix the issue of a service crashing right after it has been seen as active.
Metadata Update from @rcritten: - Issue priority set to: normal - Issue set to the milestone: FreeIPA 4.8
Login to comment on this ticket.