The nightly backup/restore tests sometimes fail while checking that id admin finds the admin user after a restore. This happens because the SSSD backend may still be offline at that point. For an example, see PR 4368, with logs here:
self = <ipatests.test_integration.test_backup_and_restore.TestBackupReinstallRestoreWithDNS object at 0x7f575263f090>

    def test_full_backup_reinstall_restore_with_DNS_zone(self):
        """backup, uninstall, reinstall, restore"""
>       self._full_backup_restore_with_DNS_zone(reinstall=True)

test_integration/test_backup_and_restore.py:352:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
test_integration/test_backup_and_restore.py:340: in _full_backup_restore_with_DNS_zone
    tasks.resolve_record(self.master.ip, self.example2_test_zone)
/usr/lib64/python3.7/contextlib.py:119: in __exit__
    next(self.gen)
test_integration/test_backup_and_restore.py:158: in restore_checker
    got = check(host)
test_integration/test_backup_and_restore.py:89: in check_admin_in_id
    result = host.run_command(['id', 'admin'])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <ipatests.pytest_ipa.integration.host.Host master.ipa.test (master)>
argv = ['id', 'admin'], set_env = True, stdin_text = None, log_stdout = True
raiseonerr = True, cwd = None, bg = False, encoding = 'utf-8', ok_returncode = 0

    def run_command(self, argv, set_env=True, stdin_text=None,
                    log_stdout=True, raiseonerr=True,
                    cwd=None, bg=False, encoding='utf-8', ok_returncode=0):
        """Wrapper around run_command to log stderr on raiseonerr=True

        :param ok_returncode: return code considered to be correct,
                              you can pass an integer or sequence of integers
        """
        result = super().run_command(
            argv, set_env=set_env, stdin_text=stdin_text,
            log_stdout=log_stdout, raiseonerr=False, cwd=cwd, bg=bg,
            encoding=encoding
        )
        # in FIPS mode SSH may print noise to stderr, remove the string
        # "FIPS mode initialized" + optional newline.
        result.stderr_bytes = FIPS_NOISE_RE.sub(b'', result.stderr_bytes)
        try:
            result_ok = result.returncode in ok_returncode
        except TypeError:
            result_ok = result.returncode == ok_returncode
        if not result_ok and raiseonerr:
            result.log.error('stderr: %s', result.stderr_text)
            raise subprocess.CalledProcessError(
                result.returncode, argv,
>               result.stdout_text, result.stderr_text
            )
E           subprocess.CalledProcessError: Command '['id', 'admin']' returned non-zero exit status 1.
Metadata Update from @frenaud: - Issue tagged with: test-failure, tests
Note: the issue has been happening since commit 1eb6a9b ("ipa-restore: restart services at the end"). A restore now performs a double ipactl restart, so the LDAP service is still restarting at the end and SSSD does not reconnect to LDAP immediately.
@frenaud I commented on a different nightly test about it and provided a solution:
The following three failed tests seem to have a race condition with the 389-ds restart:
fedora-latest/test_backup_and_restore_TestBackupReinstallRestoreWithDNS
fedora-latest/test_backup_and_restore_TestBackupReinstallRestoreWithDNSSEC
fedora-latest/test_backup_and_restore_TestBackupReinstallRestoreWithKRA
All of them have a sequence that tries to ensure certain operations continue working after a backup has been restored. However, nothing waits for the IPA services to actually be running properly before the checks start. It might take some time to get 389-ds into a functioning state, and then SSSD will take more time to mark the IPA domain as online.
The best way to solve it is to have each check wait for the service it depends on to be operational. Right now, it seems, we are hitting SSSD not recovering from the offline LDAP server 'soon enough'. This could be ensured by calling sssctl domain-status ipa.test -o and waiting until it returns
Online status: Online
The loop might make 10 attempts, with one or two seconds in between.
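The polling loop suggested above could look roughly like the following. This is only a minimal sketch, not the fix that was merged: the names wait_for_sssd_domain_online and run_local are hypothetical, and the only details taken from this ticket are the sssctl domain-status ipa.test -o call, the "Online status: Online" output, and the 10 attempts with a short pause.

```python
import subprocess
import time


def run_local(argv):
    """Hypothetical runner: execute argv locally and return its stdout.

    Returns an empty string if the command fails or cannot be started.
    """
    try:
        proc = subprocess.run(
            argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        return ""
    return proc.stdout if proc.returncode == 0 else ""


def wait_for_sssd_domain_online(domain, run=run_local, attempts=10,
                                delay=1.0, sleep=time.sleep):
    """Poll `sssctl domain-status <domain> -o` until the domain is online.

    Returns True as soon as the output contains "Online status: Online",
    or False if that never happens within `attempts` tries.
    """
    argv = ["sssctl", "domain-status", domain, "-o"]
    for attempt in range(attempts):
        if "Online status: Online" in run(argv):
            return True
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

In the integration tests, the run argument could presumably be something like lambda argv: host.run_command(argv, raiseonerr=False).stdout_text, called before check_admin_in_id; that wiring is an assumption and has not been checked against the test suite.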
In short, I believe commit 1eb6a9b actually uncovered a real issue in the test.
Metadata Update from @frenaud: - Issue assigned to frenaud
Metadata Update from @frenaud: - Custom field on_review adjusted to https://github.com/freeipa/freeipa/pull/4383
master:
ipa-4-8:
Metadata Update from @abbra: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
ipa-4-7: