The nightly test test_idp.py::TestIDPKeycloak::test_idp_backup_restore is failing when executed with the nightly build of 389ds that is using LDMB instead of Berkeley DB (389-ds-base-3.0.1-202402051504git8fe758664.fc39.x86_64). Seen in PR #3373, with the following logs and report:
test_idp.py::TestIDPKeycloak::test_idp_backup_restore
self = <ipatests.test_integration.test_idp.TestIDPKeycloak object at 0x7f4f277d8a10> def test_idp_backup_restore(self): """ Test case to check that after restore data is retrieved with related idp configuration. """ tasks.kinit_admin(self.master) user = "backupuser" cmd = ["ipa", "idp-add", "testidp", "--provider=keycloak", "--client-id=ipa_oidc_client", "--org=master", "--base-url={0}:8443/auth".format(self.client.hostname)] self.master.run_command(cmd, stdin_text="{0}\n{0}".format( self.client.config.admin_password)) tasks.user_add(self.master, user, extra_args=["--user-auth-type=idp", "--idp-user-id=testuser1@ipa.test", "--idp=testidp"] ) backup_path = tasks.get_backup_dir(self.master) # change data after backup self.master.run_command(['ipa', 'user-del', user]) self.master.run_command(['ipa', 'idp-del', 'testidp']) dirman_password = self.master.config.dirman_password > self.master.run_command(['ipa-restore', backup_path], stdin_text=dirman_password + '\nyes') test_integration/test_idp.py:300: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = <ipatests.pytest_ipa.integration.host.Host master.ipa.test (master)> argv = ['ipa-restore', '/var/lib/ipa/backup/ipa-full-2024-02-05-22-30-40'] set_env = True, stdin_text = 'Secret.123\nyes', log_stdout = True raiseonerr = True, cwd = None, bg = False, encoding = 'utf-8', ok_returncode = 0 def run_command(self, argv, set_env=True, stdin_text=None, log_stdout=True, raiseonerr=True, cwd=None, bg=False, encoding='utf-8', ok_returncode=0): """Wrapper around run_command to log stderr on raiseonerr=True :param ok_returncode: return code considered to be correct, you can pass an integer or sequence of integers """ result = super().run_command( argv, set_env=set_env, stdin_text=stdin_text, log_stdout=log_stdout, raiseonerr=False, cwd=cwd, bg=bg, encoding=encoding ) # in FIPS mode SSH may print noise to stderr, remove the string # "FIPS mode initialized" + optional newline. result.stderr_bytes = FIPS_NOISE_RE.sub(b'', result.stderr_bytes) try: result_ok = result.returncode in ok_returncode except TypeError: result_ok = result.returncode == ok_returncode if not result_ok and raiseonerr: result.log.error('stderr: %s', result.stderr_text) > raise subprocess.CalledProcessError( result.returncode, argv, result.stdout_text, result.stderr_text ) E subprocess.CalledProcessError: Command '['ipa-restore', '/var/lib/ipa/backup/ipa-full-2024-02-05-22-30-40']' returned non-zero exit status 1. pytest_ipa/integration/host.py:202: CalledProcessError ----------------------------- Captured stderr call ----------------------------- ipa: ERROR: stderr: Preparing restore from /var/lib/ipa/backup/ipa-full-2024-02-05-22-30-40 on master.ipa.test Performing FULL restore from FULL backup Temporary setting umask to 022 Each master will individually need to be re-initialized or re-created from this one. The replication agreements on masters running IPA 3.1 or earlier will need to be manually re-enabled. See the man page for details. Disabling all replication. Disabling replication agreement on replica1.ipa.test to master.ipa.test Disabling CA replication agreement on replica1.ipa.test to master.ipa.test Stopping IPA services Systemwide CA database updated. Restoring files Systemwide CA database updated. Restoring from userRoot in IPA-TEST ldif2db failed: Restoring from ipaca in IPA-TEST Restarting GSS-proxy Starting IPA services Restoring umask to 18 CalledProcessError(Command ['/usr/sbin/ipactl', 'start'] returned non-zero exit status 1: "Failed to read data from service file: Failed to get list of services to probe status!\nConfigured hostname 'master.ipa.test' does not match any master server in LDAP:\nNo master found because of error: no such entry\nShutting down\n") The ipa-restore command failed. See /var/log/iparestore.log for more information
Test scenario: - install master, client, replica - configure an external idp, add a user that can connect with the external idp - backup with ipa-backup -v --disable-role-check - delete user and external idp - call ipa-restore with ipa-restore /path/to/backup
ipa-backup -v --disable-role-check
ipa-restore /path/to/backup
The call to restore is failing and the iparestore.log show an error with ldif2db:
2024-02-05T22:31:02Z DEBUG Starting external process 2024-02-05T22:31:02Z DEBUG args=['/usr/sbin/dsctl', 'IPA-TEST', 'ldif2db', 'userRoot', '/var/lib/dirsrv/slapd-IPA-TEST/ldif/IPA-TEST-userRoot.ldif'] 2024-02-05T22:31:05Z DEBUG Process finished, return code=1 2024-02-05T22:31:05Z DEBUG stdout=ldif2db failed 2024-02-05T22:31:05Z DEBUG stderr= 2024-02-05T22:31:05Z CRITICAL ldif2db failed:
The error does not seem to be handled by ipa-restore which goes on an finally fails on a call to ipactl start:
ipactl start
2024-02-05T22:31:10Z INFO Starting IPA services 2024-02-05T22:31:10Z DEBUG Starting external process 2024-02-05T22:31:10Z DEBUG args=['/usr/sbin/ipactl', 'start'] 2024-02-05T22:31:39Z DEBUG Process finished, return code=1 2024-02-05T22:31:39Z DEBUG stdout=Starting Directory Service debugging enabled, suppressing output. 2024-02-05T22:31:39Z DEBUG stderr=Failed to read data from service file: Failed to get list of services to probe status! Configured hostname 'master.ipa.test' does not match any master server in LDAP: No master found because of error: no such entry Shutting down 2024-02-05T22:31:39Z INFO Restoring umask to 18 2024-02-05T22:31:39Z DEBUG File "/usr/lib/python3.12/site-packages/ipapython/admintool.py", line 180, in execute return_value = self.run() ^^^^^^^^^^ File "/usr/lib/python3.12/site-packages/ipaserver/install/ipa_restore.py", line 468, in run run([paths.IPACTL, 'start']) File "/usr/lib/python3.12/site-packages/ipapython/ipautil.py", line 599, in run raise CalledProcessError( 2024-02-05T22:31:39Z DEBUG The ipa-restore command failed, exception: CalledProcessError: CalledProcessError(Command ['/usr/sbin/ipactl', 'start'] returned non-zero exit status 1: "Failed to read data from service file: Failed to get list of services to probe status!\nConfigured hostname 'master.ipa.test' does not match any master server in LDAP:\nNo master found because of error: no such entry\nShutting down\n") 2024-02-05T22:31:39Z ERROR CalledProcessError(Command ['/usr/sbin/ipactl', 'start'] returned non-zero exit status 1: "Failed to read data from service file: Failed to get list of services to probe status!\nConfigured hostname 'master.ipa.test' does not match any master server in LDAP:\nNo master found because of error: no such entry\nShutting down\n") 2024-02-05T22:31:39Z ERROR The ipa-restore command failed. See /var/log/iparestore.log for more information
@progier and @tbordaz FYI our tests detected a new regression with LMDB
System run out of memory but it is a bit weird ( Why / where are we trying to allocate so much memory (i.e nearly 32 Gb) (aka 150% of the db max size)
From: http://freeipa-org-pr-ci.s3-website.eu-central-1.amazonaws.com/jobs/330198a2-c472-11ee-9d77-fa163e0bd381/test_integration-test_idp.py-TestIDPKeycloak-test_idp_backup_restore/master.ipa.test/var/log/dirsrv/slapd-IPA-TEST/errors.gz
[05/Feb/2024:22:31:04.631472267 +0000] - INFO - dbmdb_make_env - MBD environment created with maxsize=21474836480. ... [05/Feb/2024:22:31:05.207590755 +0000] - INFO - dbmdb_import_producer - import userRoot: Finished scanning file "/var/lib/dirsrv/slapd-IPA-TEST/ldif/IPA-TEST-userRoot.ldif" (572 entries) [05/Feb/2024:22:31:05.250849775 +0000] - ERR - memory allocator - malloc of 34281996290 bytes failed; OS error 12 (Cannot allocate memory) The server has probably allocated all available virtual memory. To solve this problem, make more virtual memory available to your server, or reduce one or more of the following server configuration settings: nsslapd-cachesize (Database Settings - Maximum entries in cache) nsslapd-cachememsize (Database Settings - Memory available for cache) nsslapd-dbcachesize (LDBM Plug-in Settings - Maximum cache size) nsslapd-import-cachesize (LDBM Plug-in Settings - Import cache size). Can't recover; calling exit(1).
Reviewed the slapi_ch_malloc calls from import but it has not helped me to understand the problem. Created https://github.com/389ds/389-ds-base/issues/6073 to improve the logs and help us to identify the root cause if the problem reoccurs
FYI we didn't hit this issue in the recent runs.
Closing as it happened only once and was not seen since
Metadata Update from @frenaud: - Issue close_status updated to: worksforme - Issue status updated to: Closed (was: Open)
Log in to comment on this ticket.