#7241 'ipactl status' restarts service and fails to report broken service
Opened 2 years ago by cheimes. Modified 2 years ago

The ipactl status command does more than just reporting the status of all IPA services. It also attempts to restart failed services. Since it sometimes takes a couple of seconds until a service process fails, the status command does not report broken services. I noticed the problem while I was looking into #7240.

Example

  • ipactl status runs for almost 55 seconds
  • from the process timestamp it is obvious that ipactl status has attempted to start ipa-dnskeysyncd at least once
  • since it takes two, three seconds for ipa-dnskeysyncd to fail, the service is never shown as failing. This makes it hard to debug issues.
# date; ipactl status; date; systemctl status ipa-dnskeysyncd; date; sleep 5; systemctl status ipa-dnskeysyncd; date
Tue Nov  7 13:37:53 CET 2017
Directory Service: RUNNING
krb5kdc Service: RUNNING
kadmin Service: RUNNING
named Service: RUNNING
httpd Service: RUNNING
ipa-custodia Service: RUNNING
ntpd Service: RUNNING
pki-tomcatd Service: RUNNING
ipa-otpd Service: RUNNING
ipa-dnskeysyncd Service: RUNNING
ipa: INFO: The ipactl command was successful
Tue Nov  7 13:38:47 CET 2017
● ipa-dnskeysyncd.service - IPA key daemon
   Loaded: loaded (/usr/lib/systemd/system/ipa-dnskeysyncd.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-11-07 13:38:47 CET; 387ms ago
 Main PID: 72303 (ipa-dnskeysyncd)
    Tasks: 1 (limit: 19660)
   CGroup: /system.slice/ipa-dnskeysyncd.service
           └─72303 /usr/bin/python2 /usr/libexec/ipa/ipa-dnskeysyncd

Nov 07 13:38:47 vm-193.abc.idm.lab.eng.brq.redhat.com systemd[1]: Started IPA key daemon.
Tue Nov  7 13:38:47 CET 2017
● ipa-dnskeysyncd.service - IPA key daemon
   Loaded: loaded (/usr/lib/systemd/system/ipa-dnskeysyncd.service; disabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Tue 2017-11-07 13:38:51 CET; 1s ago
  Process: 72303 ExecStart=/usr/libexec/ipa/ipa-dnskeysyncd (code=exited, status=1/FAILURE)
 Main PID: 72303 (code=exited, status=1/FAILURE)

Nov 07 13:38:51 vm-193.abc.idm.lab.eng.brq.redhat.com systemd[1]: ipa-dnskeysyncd.service: Failed with result 'exit-code'.
Tue Nov  7 13:38:52 CET 2017

I don't think that ipactl status starts or restarts a service. Rather it waits for a service to start when status is 'activating'

in ipactl:

    for svc in svc_list:
        svchandle = services.service(svc, api=api)
        try:
            if svchandle.is_running():

in services.py

    def is_running(self, instance_name="", wait=True):
        instance = self.service_instance(instance_name, 'is-active')

        while True:
            try:
                result = ipautil.run(
                    [paths.SYSTEMCTL, "is-active", instance],
                    capture_output=True
                )
            except ipautil.CalledProcessError as e:
                if e.returncode == 3 and 'activating' in str(e.output):
                    time.sleep(SERVICE_POLL_INTERVAL)
                    continue
                return False
            else:
                # activating
                if result.returncode == 3 and 'activating' in result.output:
                    time.sleep(SERVICE_POLL_INTERVAL)
                    continue
                # active
                if result.returncode == 0:
                    return True
                # not active
                return False

/usr/lib/systemd/system/ipa-dnskeysyncd.service shows that the service ipa-dnskeysyncd will be restarted automatically on failure:
Restart=on-failure

Sometimes systemctl is-active returns "active" before the service crashes (after numerous calls to is-active), and this allows ipactl status to exit without noticing any error, but after a long delay.
We could add a maximum number of retries to exit the loop (this would ensure that ipactl status does not block for too long) but this would not fix the issue of a service crashing right after it has been seen as active.

Metadata Update from @rcritten:
- Issue priority set to: normal
- Issue set to the milestone: FreeIPA 4.8

2 years ago

Login to comment on this ticket.

Metadata