#8264 Nightly test failure in test_integration.test_commands.TestIPACommand.test_hbac_systemd_user
Closed: fixed 3 years ago by cheimes. Opened 4 years ago by frenaud.

The Azure nightly test failed in test_integration.test_commands.TestIPACommand.test_hbac_systemd_user in PR #4511 [testing_master_latest].
Logs are available at the following location: link.
The test failed with the following output while calling ipa-server-upgrade:

self = <ipatests.test_integration.test_commands.TestIPACommand object at 0x7f3cc62e1450>

    def test_hbac_systemd_user(self):
        # https://pagure.io/freeipa/issue/7831
        tasks.kinit_admin(self.master)
        # check for presence
        self.master.run_command(
            ['ipa', 'hbacsvc-show', 'systemd-user']
        )
        result = self.master.run_command(
            ['ipa', 'hbacrule-show', 'allow_systemd-user', '--all']
        )
        lines = set(l.strip() for l in result.stdout_text.split('\n'))
        assert 'User category: all' in lines
        assert 'Host category: all' in lines
        assert 'Enabled: TRUE' in lines
        assert 'HBAC Services: systemd-user' in lines
        assert 'accessruletype: allow' in lines

        # delete both
        self.master.run_command(
            ['ipa', 'hbacrule-del', 'allow_systemd-user']
        )
        self.master.run_command(
            ['ipa', 'hbacsvc-del', 'systemd-user']
        )

        # run upgrade
>       result = self.master.run_command(['ipa-server-upgrade'])

test_integration/test_commands.py:721: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <ipatests.pytest_ipa.integration.host.Host master1.ipa.test (master)>
argv = ['ipa-server-upgrade'], set_env = True, stdin_text = None
log_stdout = True, raiseonerr = True, cwd = None, bg = False, encoding = 'utf-8'
ok_returncode = 0

    def run_command(self, argv, set_env=True, stdin_text=None,
                    log_stdout=True, raiseonerr=True,
                    cwd=None, bg=False, encoding='utf-8', ok_returncode=0):
        """Wrapper around run_command to log stderr on raiseonerr=True

        :param ok_returncode: return code considered to be correct,
                              you can pass an integer or sequence of integers
        """
        result = super().run_command(
            argv, set_env=set_env, stdin_text=stdin_text,
            log_stdout=log_stdout, raiseonerr=False, cwd=cwd, bg=bg,
            encoding=encoding
        )
        # in FIPS mode SSH may print noise to stderr, remove the string
        # "FIPS mode initialized" + optional newline.
        result.stderr_bytes = FIPS_NOISE_RE.sub(b'', result.stderr_bytes)
        try:
            result_ok = result.returncode in ok_returncode
        except TypeError:
            result_ok = result.returncode == ok_returncode
        if not result_ok and raiseonerr:
            result.log.error('stderr: %s', result.stderr_text)
            raise subprocess.CalledProcessError(
                result.returncode, argv,
>               result.stdout_text, result.stderr_text
            )
E           subprocess.CalledProcessError: Command '['ipa-server-upgrade']' returned non-zero exit status 1.

pytest_ipa/integration/host.py:196: CalledProcessError
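The wrapper's failure path is easy to reproduce in isolation. Below is a minimal sketch of the two pieces visible in the traceback, with the caveat that the FIPS_NOISE_RE pattern is an assumption about what ipatests compiles, not a copy of it:

import re

# Assumed pattern: strip "FIPS mode initialized" plus an optional newline,
# matching the comment in the wrapper above.
FIPS_NOISE_RE = re.compile(b'FIPS mode initialized\n?')

def returncode_ok(returncode, ok_returncode=0):
    # ok_returncode may be a single int or a sequence of ints: try
    # membership first, fall back to plain equality on TypeError.
    try:
        return returncode in ok_returncode
    except TypeError:
        return returncode == ok_returncode

assert FIPS_NOISE_RE.sub(b'', b'FIPS mode initialized\nreal stderr') == b'real stderr'
assert returncode_ok(0)
assert returncode_ok(2, ok_returncode=(0, 2))
assert not returncode_ok(1)  # ipa-server-upgrade's exit status above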

ipa-server-upgrade output:

Upgrading IPA:. Estimated time: 1 minute 30 seconds
  [1/11]: stopping directory server
  [2/11]: saving configuration
  [3/11]: disabling listeners
  [4/11]: enabling DS global lock
  [5/11]: disabling Schema Compat
  [6/11]: starting directory server
  [7/11]: updating schema
  [8/11]: upgrading server
  [9/11]: stopping directory server
  [10/11]: restoring configuration
  [11/11]: starting directory server
Done.
Update complete
Upgrading the configuration of the IPA services
[Verifying that root certificate is published]
[Migrate CRL publish directory]
CRL tree already moved
IPA server upgrade failed: Inspect /var/log/ipaupgrade.log and run command ipa-server-upgrade manually.
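The error message points at the upgrade log. A minimal sketch for pulling the failing step out of it, assuming the /var/log/ipaupgrade.log path named above and its usual "TIMESTAMP LEVEL message" line format:

# Print only error-level lines from the upgrade log; the path comes from
# the error message above, the level names are an assumption.
with open('/var/log/ipaupgrade.log') as logfile:
    for line in logfile:
        if ' ERROR ' in line or ' CRITICAL ' in line:
            print(line.rstrip())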

This is related to memory pressure.
I see this error more and more often in upstream CI.

From the dirsrv error log:

[08/Apr/2020:23:39:14.451281823 +0000] - INFO - spal_meminfo_get - Found cgroup v1
[08/Apr/2020:23:39:14.452118039 +0000] - CRIT - spal_meminfo_get - Your system is reporting 40963276 bytes available, which is less than the minimum recommended 134217728 bytes
[08/Apr/2020:23:39:14.453174279 +0000] - CRIT - spal_meminfo_get - This indicates heavy memory pressure or incorrect system resource allocation
[08/Apr/2020:23:39:14.454315291 +0000] - CRIT - spal_meminfo_get - Directory Server *may* crash as a result!!!
[08/Apr/2020:23:39:14.455219583 +0000] - WARN - dncache_set_max_size - Cachesize (20481638) may use more than the available physical memory.
[08/Apr/2020:23:39:14.455609550 +0000] - NOTICE - ldbm_back_start - total cache size: 256396410 B; 
[08/Apr/2020:23:39:14.456239236 +0000] - WARN - ldbm_back_start - It is highly likely your memory configuration of all backends will EXCEED your systems memory.
[08/Apr/2020:23:39:14.456708776 +0000] - WARN - ldbm_back_start - In a future release this WILL prevent server start up. You MUST alter your configuration.
[08/Apr/2020:23:39:14.457157223 +0000] - WARN - ldbm_back_start - Total entry cache size: 23690854 B; dbcache size: 4738170 B; available memory size: 47381708 B; 
[08/Apr/2020:23:39:14.457613868 +0000] - WARN - ldbm_back_start - This can be corrected by altering the values of nsslapd-cache-autosize, nsslapd-cache-autosize-split and nsslapd-dncachememsize

[08/Apr/2020:23:39:14.459108059 +0000] - INFO - spal_meminfo_get - Found cgroup v1
[08/Apr/2020:23:39:14.459864402 +0000] - CRIT - spal_meminfo_get - Your system is reporting 40963276 bytes available, which is less than the minimum recommended 134217728 bytes
[08/Apr/2020:23:39:14.460637939 +0000] - CRIT - spal_meminfo_get - This indicates heavy memory pressure or incorrect system resource allocation
[08/Apr/2020:23:39:14.460960229 +0000] - CRIT - spal_meminfo_get - Directory Server *may* crash as a result!!!
[08/Apr/2020:23:39:14.461637099 +0000] - INFO - bdb_start - Resizing db cache size: 5145312 -> 4738170
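Here spal_meminfo_get reads the cgroup v1 memory controller and complains because fewer than 134217728 bytes (128 MiB) are available. A minimal sketch of that check, assuming the usual cgroup v1 mount point; the real spal_meminfo_get in 389-ds does more (the "Found cgroup v1" line shows it detects which cgroup version is in use):

MIN_RECOMMENDED = 134217728  # 128 MiB, the threshold from the log above

def cgroup_v1_available_bytes(base='/sys/fs/cgroup/memory'):
    # Available memory = controller limit minus current usage.
    def read_int(name):
        with open('%s/%s' % (base, name)) as f:
            return int(f.read().strip())
    return (read_int('memory.limit_in_bytes')
            - read_int('memory.usage_in_bytes'))

available = cgroup_v1_available_bytes()
if available < MIN_RECOMMENDED:
    print('%d bytes available, less than the minimum recommended %d'
          % (available, MIN_RECOMMENDED))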

Now I'm measuring the memory consumption of a bare master+ca+kra+dnssec deployment on F31; one way to take such a measurement is sketched below.
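A sketch that sums per-process resident set size from /proc; this is illustrative only, not necessarily how the measurement was taken:

import os

def rss_kib_by_process():
    # Sum VmRSS (in KiB) per process name across all PIDs in /proc.
    totals = {}
    for pid in filter(str.isdigit, os.listdir('/proc')):
        try:
            with open('/proc/%s/comm' % pid) as f:
                name = f.read().strip()
            with open('/proc/%s/status' % pid) as f:
                for line in f:
                    if line.startswith('VmRSS:'):
                        totals[name] = totals.get(name, 0) + int(line.split()[1])
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # process exited mid-scan or is not readable
    return totals

# Top consumers first, printed in MiB.
for name, kib in sorted(rss_kib_by_process().items(),
                        key=lambda item: -item[1])[:10]:
    print('%-24s %8.1f MiB' % (name, kib / 1024.0))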

master:

  • 43ac2d9 ipatests: Cleanup 'collect_logs' decorator
  • 5da309e ipatests: Pretty print multihost config
  • 63747bc ipatests: Collect all logs on all Unix hosts
  • 87408ee Azure: Increase memory limit

Metadata Update from @cheimes (3 years ago):
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

ipa-4-8:

  • 7b39e5f ipatests: enable 389-ds audit log and collect audit file
  • 295d207 ipatests: Cleanup 'collect_logs' decorator
  • 5da643f ipatests: Pretty print multihost config
  • 736b8ba ipatests: Collect all logs on all Unix hosts
  • 97581ec Azure: Increase memory limit
