#50618 support cgroupv2
Closed: fixed 5 months ago by firstyear. Opened 11 months ago by mreynolds.

Ticket was cloned from Red Hat Bugzilla (product Fedora): Bug 1754451

Description of problem:
Fedora 31 switched to cgroups v2 by default
https://fedoraproject.org/wiki/Changes/CGroupsV2
and it seems 389-ds still tries to use v1


Version-Release number of selected component (if applicable):
rpm -q 389-ds-base
389-ds-base-1.4.1.7-1.fc31.x86_64

How reproducible:
Deterministic

Steps to Reproduce:
1. #install and configure 389-ds on default f31 (freeIPA can be used as well)
2. # restart service and check errors reported by 389-ds
3. journalctl --boot -u dirsrv@TESTRELM-TEST.service | grep ERR | grep "/cgroup/"

Actual results:
Sep 23 05:46:10 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:10.139146386 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:10 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:10.141847902 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:10 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:10.143083444 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.166850637 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.168268079 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.169192166 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.171585443 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.172463685 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.173283725 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.174156213 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.174966221 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.175851582 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.176852584 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.177723911 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.178553805 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.179445679 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.180277361 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.181099980 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.181969820 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.182796747 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.183646082 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.184519478 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.185346321 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.186158473 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.192186822 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2

Expected results:
No cgroups related errors.

Additional info:

Metadata Update from @mreynolds:
- Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1754451

11 months ago

We probably need to read the v2 as well, since we don't know if a system is on v1 or v2.
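
One way to tell the two layouts apart is the process's own /proc/<pid>/cgroup file: a cgroup v2 entry has the form "0::<path>", while v1 entries name a controller (for example "4:memory:<path>"). The sketch below is only an illustration of that check, not the code from the eventual patch; the function name cgroup_v2_dir_from_line is hypothetical, and /sys/fs/cgroup is assumed to be the mount point. On hybrid layouts a "0::" entry can appear next to v1 entries, so a real implementation would probably still want to confirm that memory.max exists in the resulting directory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from 389-ds-base): given one line of
 * /proc/<pid>/cgroup, write the matching cgroup v2 directory under
 * /sys/fs/cgroup (assumed mount point) into buf.
 * Returns 0 on success, -1 if the line is not a v2 ("0::<path>")
 * entry or the result does not fit in buf.
 */
static int
cgroup_v2_dir_from_line(const char *line, char *buf, size_t buflen)
{
    static const char prefix[] = "0::";
    static const char root[] = "/sys/fs/cgroup";

    assert(line != NULL && buf != NULL);

    /* v1 entries look like "4:memory:/path"; only v2 uses "0::". */
    if (strncmp(line, prefix, sizeof(prefix) - 1) != 0) {
        return -1;
    }
    const char *path = line + sizeof(prefix) - 1;

    /* Ignore the trailing newline that fgets() would leave in place. */
    size_t pathlen = strcspn(path, "\n");

    /* The root cgroup is reported as "/"; avoid a trailing slash. */
    if (pathlen == 1 && path[0] == '/') {
        pathlen = 0;
    }

    int n = snprintf(buf, buflen, "%s%.*s", root, (int)pathlen, path);
    if (n < 0 || (size_t)n >= buflen) {
        return -1;
    }
    return 0;
}
```

A caller would read /proc/self/cgroup line by line with fgets(), pass each line in, and fall back to the v1 paths if no line is accepted.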

Metadata Update from @firstyear:
- Custom field origin adjusted to None
- Custom field reviewstatus adjusted to None

11 months ago

Metadata Update from @mreynolds:
- Issue set to the milestone: 1.4.2 (was: 0.0 NEEDS_TRIAGE)

11 months ago

Reading that document we probably want to attempt to read memory.high. I should see if suse has swapped to cgroup v2 ...

Okay, so to help out since I wrote the original cgroup code, I've set up a Fedora 31 VM (gasp) and tested this configuration.

There are really 4 major cases:

  • dirsrv on fedora 31 with no limits
  • dirsrv on fedora 31 with systemd limits
  • dirsrv via podman with no limits
  • dirsrv via podman with limits

So, testing these combinations I can see:

- dirsrv on fedora 31 with no limits:


[root@localhost cgroup]# cat /proc/30061/cgroup
0::/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[root@localhost cgroup]# cat /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service/memory.max
max
[root@localhost cgroup]# cat /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service/memory.high
max

- dirsrv on fedora 31 with 512M limit

[root@localhost cgroup]# cat /proc/30061/cgroup
0::/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[root@localhost cgroup]# cat /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service/memory.max
max
[root@localhost cgroup]# cat /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service/memory.high
max

- dirsrv with podman no limit

[root@ee8e269a8360 cgroup]# cat /sys/fs/cgroup/memory.max
max
[root@ee8e269a8360 cgroup]# cat /sys/fs/cgroup/memory.high
max


- dirsrv with podman with 512M limit

[root@07fa546c6d43 cgroup]# cat memory.max
536870912
[root@07fa546c6d43 cgroup]# cat memory.high
max

So I'm already curious about the fact that the systemd memory limits don't really appear to work ....

Anyway, within the container it's easy - just read /sys/fs/cgroup/memory.max and /sys/fs/cgroup/memory.high and see what they say.

I think that from within ns-slapd the same will be true for the system limits, as the cgroup should put the current limits for the container into the root.
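
As the transcripts above show, memory.max and memory.high contain either the literal string "max" (no limit configured) or a byte count such as 536870912. A minimal sketch of interpreting that text follows, assuming the file content has already been read into a string; cgroup_v2_parse_limit is a hypothetical name for illustration and is not taken from slapi_pal.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from 389-ds-base): interpret the text read
 * from a cgroup v2 memory.max or memory.high file.
 * Returns 1 and sets *limit for a numeric limit, 0 when the file says
 * "max" (no limit configured), and -1 when the text cannot be parsed.
 */
static int
cgroup_v2_parse_limit(const char *text, uint64_t *limit)
{
    assert(text != NULL && limit != NULL);

    /* "max" means unlimited; accept it with or without a newline. */
    if (strncmp(text, "max", 3) == 0 && (text[3] == '\n' || text[3] == '\0')) {
        return 0;
    }

    /* strtoull() would accept a sign or leading spaces; require a digit. */
    if (text[0] < '0' || text[0] > '9') {
        return -1;
    }

    char *end = NULL;
    errno = 0;
    unsigned long long value = strtoull(text, &end, 10);
    if (errno != 0 || end == text) {
        return -1;
    }
    /* Anything other than a newline or end of string is unexpected. */
    if (*end != '\n' && *end != '\0') {
        return -1;
    }
    *limit = (uint64_t)value;
    return 1;
}
```

With the values seen in the podman test, "536870912\n" yields a limit of 536870912 bytes and "max\n" is reported as unlimited, in which case the caller would presumably fall back to the system-wide memory figures.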

I might start working on an initial patch now.

https://pagure.io/389-ds-base/pull-request/50885

Starting to work on this here, but it's not ready to review yet.

Metadata Update from @firstyear:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

6 months ago

Besides the alarming messages in the error log, this can prevent DS from starting and, IIRC, can crash DS -> Priority

Metadata Update from @tbordaz:
- Issue priority set to: critical

6 months ago

There's a lot of chatter in the errors log now:

[24/Feb/2020:15:13:58.739943389 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.742666844 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.745476335 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.748285674 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.750973383 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.753678243 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.759827616 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.762507500 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service

@firstyear Any objection to me changing this log level to a debug log level?

Metadata Update from @mreynolds:
- Issue status updated to: Open (was: Closed)

6 months ago

Also need to fix compiler warning:

../389-ds-base/ldap/servers/slapd/slapi_pal.c: In function ‘spal_meminfo_get’:
../389-ds-base/ldap/servers/slapd/slapi_pal.c:349:59: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
349 | slapi_log_err(SLAPI_LOG_CRIT, "spal_meminfo_get", "Your system is reporting %" PRIu64" bytes available, which is less than the minimum recommended %" PRIu64 " bytes\n",
| ^~~~
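
The warning says a PRIu64 conversion (which expands to "lu" on this platform) is being handed an int. One common way to silence it is to make the argument a uint64_t, either with a cast at the call site or by giving the constant a 64-bit type. The sketch below only illustrates that pattern with snprintf; EXAMPLE_MIN_BYTES and format_low_memory_msg are hypothetical names, the constant's value is made up, and the actual fix in slapi_pal.c may look different.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the minimum-memory constant. An int-typed
 * expression like this triggers -Wformat when passed to PRIu64 as-is.
 */
#define EXAMPLE_MIN_BYTES (16 * 1024 * 1024)

/* Format the message with both arguments explicitly typed as uint64_t. */
static int
format_low_memory_msg(char *buf, size_t buflen, uint64_t available)
{
    assert(buf != NULL);
    return snprintf(buf, buflen,
                    "Your system is reporting %" PRIu64 " bytes available, "
                    "which is less than the minimum recommended %" PRIu64 " bytes\n",
                    available, (uint64_t)EXAMPLE_MIN_BYTES);
}
```

The cast keeps the format string and the argument types in agreement on both 32-bit and 64-bit builds, which is the point of using the PRIu64 macro in the first place.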

Nope, no objection :)

I'm doing this now @mreynolds will pr soon.

Metadata Update from @firstyear:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

5 months ago
