#9525 Nightly test failure (389ds) in test_idp.py::TestIDPKeycloak::test_idp_backup_restore
Closed: worksforme a year ago by frenaud. Opened a year ago by frenaud.

The nightly test test_idp.py::TestIDPKeycloak::test_idp_backup_restore fails when executed against the nightly build of 389ds that uses LMDB instead of Berkeley DB (389-ds-base-3.0.1-202402051504git8fe758664.fc39.x86_64).
Seen in PR #3373, with the following logs and report:

self = <ipatests.test_integration.test_idp.TestIDPKeycloak object at 0x7f4f277d8a10>

    def test_idp_backup_restore(self):
        """
        Test case to check that after restore data is retrieved
        with related idp configuration.
        """
        tasks.kinit_admin(self.master)
        user = "backupuser"
        cmd = ["ipa", "idp-add", "testidp", "--provider=keycloak",
               "--client-id=ipa_oidc_client", "--org=master",
               "--base-url={0}:8443/auth".format(self.client.hostname)]
        self.master.run_command(cmd, stdin_text="{0}\n{0}".format(
            self.client.config.admin_password))

        tasks.user_add(self.master, user,
                       extra_args=["--user-auth-type=idp",
                                   "--idp-user-id=testuser1@ipa.test",
                                   "--idp=testidp"]
                       )

        backup_path = tasks.get_backup_dir(self.master)
        # change data after backup
        self.master.run_command(['ipa', 'user-del', user])
        self.master.run_command(['ipa', 'idp-del', 'testidp'])
        dirman_password = self.master.config.dirman_password
>       self.master.run_command(['ipa-restore', backup_path],
                                stdin_text=dirman_password + '\nyes')

test_integration/test_idp.py:300: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <ipatests.pytest_ipa.integration.host.Host master.ipa.test (master)>
argv = ['ipa-restore', '/var/lib/ipa/backup/ipa-full-2024-02-05-22-30-40']
set_env = True, stdin_text = 'Secret.123\nyes', log_stdout = True
raiseonerr = True, cwd = None, bg = False, encoding = 'utf-8', ok_returncode = 0

    def run_command(self, argv, set_env=True, stdin_text=None,
                    log_stdout=True, raiseonerr=True,
                    cwd=None, bg=False, encoding='utf-8', ok_returncode=0):
        """Wrapper around run_command to log stderr on raiseonerr=True

        :param ok_returncode: return code considered to be correct,
                              you can pass an integer or sequence of integers
        """
        result = super().run_command(
            argv, set_env=set_env, stdin_text=stdin_text,
            log_stdout=log_stdout, raiseonerr=False, cwd=cwd, bg=bg,
            encoding=encoding
        )
        # in FIPS mode SSH may print noise to stderr, remove the string
        # "FIPS mode initialized" + optional newline.
        result.stderr_bytes = FIPS_NOISE_RE.sub(b'', result.stderr_bytes)
        try:
            result_ok = result.returncode in ok_returncode
        except TypeError:
            result_ok = result.returncode == ok_returncode
        if not result_ok and raiseonerr:
            result.log.error('stderr: %s', result.stderr_text)
>           raise subprocess.CalledProcessError(
                result.returncode, argv,
                result.stdout_text, result.stderr_text
            )
E           subprocess.CalledProcessError: Command '['ipa-restore', '/var/lib/ipa/backup/ipa-full-2024-02-05-22-30-40']' returned non-zero exit status 1.

pytest_ipa/integration/host.py:202: CalledProcessError

----------------------------- Captured stderr call -----------------------------
ipa: ERROR: stderr: Preparing restore from /var/lib/ipa/backup/ipa-full-2024-02-05-22-30-40 on master.ipa.test
Performing FULL restore from FULL backup
Temporary setting umask to 022
Each master will individually need to be re-initialized or
re-created from this one. The replication agreements on
masters running IPA 3.1 or earlier will need to be manually
re-enabled. See the man page for details.
Disabling all replication.
Disabling replication agreement on replica1.ipa.test to master.ipa.test
Disabling CA replication agreement on replica1.ipa.test to master.ipa.test
Stopping IPA services
Systemwide CA database updated.
Restoring files
Systemwide CA database updated.
Restoring from userRoot in IPA-TEST
ldif2db failed: 
Restoring from ipaca in IPA-TEST
Restarting GSS-proxy
Starting IPA services
Restoring umask to 18
CalledProcessError(Command ['/usr/sbin/ipactl', 'start'] returned non-zero exit status 1: "Failed to read data from service file: Failed to get list of services to probe status!\nConfigured hostname 'master.ipa.test' does not match any master server in LDAP:\nNo master found because of error: no such entry\nShutting down\n")
The ipa-restore command failed. See /var/log/iparestore.log for more information

Test scenario:
- install master, client, replica
- configure an external idp, add a user that can connect with the external idp
- backup with ipa-backup -v --disable-role-check
- delete user and external idp
- restore with ipa-restore /path/to/backup
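The steps above can be sketched in Python, mirroring how the integration test drives the CLI via command lists and stdin. This is an illustrative sketch only: the backup path and password are placeholders, and actually running the commands requires a provisioned IPA master.

```python
import subprocess

def build_restore_scenario(backup_path, dirman_password):
    """Return (argv, stdin) pairs mirroring the post-backup steps.

    backup_path and dirman_password are illustrative placeholders;
    running these commands requires a provisioned IPA master.
    """
    return [
        # delete the data created before the backup
        (["ipa", "user-del", "backupuser"], None),
        (["ipa", "idp-del", "testidp"], None),
        # ipa-restore reads the Directory Manager password and a
        # yes/no confirmation from stdin
        (["ipa-restore", backup_path], dirman_password + "\nyes"),
    ]

def run_scenario(steps):
    # check=True raises CalledProcessError on a non-zero exit,
    # which is what the integration test observed for ipa-restore
    for argv, stdin_text in steps:
        subprocess.run(argv, input=stdin_text, text=True, check=True)
```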

The restore fails, and iparestore.log shows an error from ldif2db:

2024-02-05T22:31:02Z DEBUG Starting external process
2024-02-05T22:31:02Z DEBUG args=['/usr/sbin/dsctl', 'IPA-TEST', 'ldif2db', 'userRoot', '/var/lib/dirsrv/slapd-IPA-TEST/ldif/IPA-TEST-userRoot.ldif']
2024-02-05T22:31:05Z DEBUG Process finished, return code=1
2024-02-05T22:31:05Z DEBUG stdout=ldif2db failed

2024-02-05T22:31:05Z DEBUG stderr=
2024-02-05T22:31:05Z CRITICAL ldif2db failed: 

The error does not seem to be handled by ipa-restore, which continues and finally fails on a call to ipactl start:

2024-02-05T22:31:10Z INFO Starting IPA services
2024-02-05T22:31:10Z DEBUG Starting external process
2024-02-05T22:31:10Z DEBUG args=['/usr/sbin/ipactl', 'start']
2024-02-05T22:31:39Z DEBUG Process finished, return code=1
2024-02-05T22:31:39Z DEBUG stdout=Starting Directory Service
    debugging enabled, suppressing output.

2024-02-05T22:31:39Z DEBUG stderr=Failed to read data from service file: Failed to get list of services to probe status!
Configured hostname 'master.ipa.test' does not match any master server in LDAP:
No master found because of error: no such entry
Shutting down

2024-02-05T22:31:39Z INFO Restoring umask to 18
2024-02-05T22:31:39Z DEBUG   File "/usr/lib/python3.12/site-packages/ipapython/admintool.py", line 180, in execute
    return_value = self.run()
                   ^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/ipaserver/install/ipa_restore.py", line 468, in run
    run([paths.IPACTL, 'start'])
  File "/usr/lib/python3.12/site-packages/ipapython/ipautil.py", line 599, in run
    raise CalledProcessError(

2024-02-05T22:31:39Z DEBUG The ipa-restore command failed, exception: CalledProcessError: CalledProcessError(Command ['/usr/sbin/ipactl', 'start'] returned non-zero exit status 1: "Failed to read data from service file: Failed to get list of services to probe status!\nConfigured hostname 'master.ipa.test' does not match any master server in LDAP:\nNo master found because of error: no such entry\nShutting down\n")
2024-02-05T22:31:39Z ERROR CalledProcessError(Command ['/usr/sbin/ipactl', 'start'] returned non-zero exit status 1: "Failed to read data from service file: Failed to get list of services to probe status!\nConfigured hostname 'master.ipa.test' does not match any master server in LDAP:\nNo master found because of error: no such entry\nShutting down\n")
2024-02-05T22:31:39Z ERROR The ipa-restore command failed. See /var/log/iparestore.log for more information
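A defensive pattern for the unhandled failure above would be to check the dsctl exit status and abort the restore immediately rather than proceeding to `ipactl start` against an empty backend. The sketch below is not the actual ipa-restore code; the `dsctl` path is parameterized purely for illustration and testing.

```python
import subprocess

def ldif2db(instance, backend, ldif_path, dsctl="/usr/sbin/dsctl"):
    """Run 'dsctl <instance> ldif2db <backend> <ldif>' and abort on failure.

    Illustrative sketch only -- not the ipa-restore implementation;
    the dsctl path can be overridden for testing.
    """
    argv = [dsctl, instance, "ldif2db", backend, ldif_path]
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        # fail the restore here instead of continuing to 'ipactl start'
        raise RuntimeError(
            f"ldif2db failed (rc={result.returncode}): {result.stderr.strip()}"
        )
    return result
```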

@progier and @tbordaz
FYI our tests detected a new regression with LMDB

The system ran out of memory, but it is a bit weird: why/where are we trying to allocate so much memory (nearly 32 GB, i.e. about 150% of the db max size)?

From: http://freeipa-org-pr-ci.s3-website.eu-central-1.amazonaws.com/jobs/330198a2-c472-11ee-9d77-fa163e0bd381/test_integration-test_idp.py-TestIDPKeycloak-test_idp_backup_restore/master.ipa.test/var/log/dirsrv/slapd-IPA-TEST/errors.gz

[05/Feb/2024:22:31:04.631472267 +0000] - INFO - dbmdb_make_env - MBD environment created with maxsize=21474836480.
...
[05/Feb/2024:22:31:05.207590755 +0000] - INFO - dbmdb_import_producer - import userRoot: Finished scanning file "/var/lib/dirsrv/slapd-IPA-TEST/ldif/IPA-TEST-userRoot.ldif" (572 entries)
[05/Feb/2024:22:31:05.250849775 +0000] - ERR - memory allocator - malloc of 34281996290 bytes failed; OS error 12 (Cannot allocate memory)
The server has probably allocated all available virtual memory. To solve
this problem, make more virtual memory available to your server, or reduce
one or more of the following server configuration settings:
nsslapd-cachesize (Database Settings - Maximum entries in cache)
nsslapd-cachememsize (Database Settings - Memory available for cache)
nsslapd-dbcachesize (LDBM Plug-in Settings - Maximum cache size)
nsslapd-import-cachesize (LDBM Plug-in Settings - Import cache size).
Can't recover; calling exit(1).
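Putting the two figures from the log side by side (plain arithmetic, nothing IPA-specific) shows the failed allocation is roughly 1.6x the configured 20 GiB maxsize, in line with the remark above:

```python
# figures taken verbatim from the errors log above
maxsize = 21474836480          # MDB environment maxsize (20 GiB)
failed_alloc = 34281996290     # bytes that malloc failed to allocate

GiB = 1 << 30
print(f"maxsize      = {maxsize / GiB:.1f} GiB")
print(f"failed alloc = {failed_alloc / GiB:.1f} GiB")
print(f"ratio        = {failed_alloc / maxsize:.0%}")
```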

I reviewed the slapi_ch_malloc calls in the import code, but that did not help me understand the problem.
I created https://github.com/389ds/389-ds-base/issues/6073 to improve the logs and help identify the root cause if the problem reoccurs.

FYI, we have not hit this issue in recent runs.

Closing as it happened only once and has not been seen since.

Metadata Update from @frenaud:
- Issue close_status updated to: worksforme
- Issue status updated to: Closed (was: Open)
