#50618 support cgroupv2
Closed: wontfix 2 years ago by firstyear. Opened 2 years ago by mreynolds.

Ticket was cloned from Red Hat Bugzilla (product Fedora): Bug 1754451

Description of problem:
Fedora 31 switched to cgroups v2 by default
https://fedoraproject.org/wiki/Changes/CGroupsV2
and it seems 389-ds still tries to use v1


Version-Release number of selected component (if applicable):
rpm -q 389-ds-base
389-ds-base-1.4.1.7-1.fc31.x86_64

How reproducible:
Deterministic

Steps to Reproduce:
1. Install and configure 389-ds on a default F31 system (FreeIPA can be used as well)
2. Restart the service and check the errors reported by 389-ds
3. journalctl --boot -u dirsrv@TESTRELM-TEST.service | grep ERR | grep "/cgroup/"

Actual results:
Sep 23 05:46:10 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:10.139146386 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:10 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:10.141847902 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:10 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:10.143083444 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.166850637 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.168268079 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.169192166 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.171585443 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.172463685 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.173283725 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.174156213 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.174966221 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.175851582 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.176852584 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.177723911 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.178553805 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.179445679 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.180277361 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.181099980 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.181969820 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.182796747 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.183646082 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.184519478 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.185346321 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.186158473 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]:
[23/Sep/2019:05:46:11.192186822 -0400] - ERR - _spal_get_uint64_t_file - Unable
to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2

Expected results:
No cgroups related errors.

Additional info:

Metadata Update from @mreynolds:
- Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1754451

2 years ago

We probably need to read the v2 as well, since we don't know if a system is on v1 or v2.

Metadata Update from @firstyear:
- Custom field origin adjusted to None
- Custom field reviewstatus adjusted to None

2 years ago

Metadata Update from @mreynolds:
- Issue set to the milestone: 1.4.2 (was: 0.0 NEEDS_TRIAGE)

2 years ago

Reading that document, we probably want to attempt to read memory.high. I should see if SUSE has swapped to cgroup v2 ...

Okay, so to help out, since I wrote the original cgroup code, I've set up a Fedora 31 VM (gasp) and tested this configuration.

There are really 4 major cases:

  • dirsrv on fedora 31 with no limits
  • dirsrv on fedora 31 with systemd limits
  • dirsrv via podman with no limits
  • dirsrv via podman with limits

So, testing these combinations I can see:

- dirsrv on fedora 31 with no limits:


[root@localhost cgroup]# cat /proc/30061/cgroup
0::/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[root@localhost cgroup]# cat /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service/memory.max
max
[root@localhost cgroup]# cat /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service/memory.high
max

- dirsrv on fedora 31 with 512M limit

[root@localhost cgroup]# cat /proc/30061/cgroup
0::/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[root@localhost cgroup]# cat /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service/memory.max
max
[root@localhost cgroup]# cat /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service/memory.high
max

- dirsrv with podman no limit

[root@ee8e269a8360 cgroup]# cat /sys/fs/cgroup/memory.max
max
[root@ee8e269a8360 cgroup]# cat /sys/fs/cgroup/memory.high
max


- dirsrv with podman with 512M limit

[root@07fa546c6d43 cgroup]# cat memory.max
536870912
[root@07fa546c6d43 cgroup]# cat memory.high
max

So I'm already curious about the fact that the systemd memory limits don't really appear to work ....
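The host-side checks above follow a two-step lookup: read the process's entry in /proc/<pid>/cgroup, then append that path to the cgroup mount point. A minimal sketch of the second step is shown here. The function name cgroup_v2_dir_from_line is hypothetical, and the code assumes the v2 hierarchy is mounted at /sys/fs/cgroup, as it is in the transcripts.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn one line of /proc/<pid>/cgroup into a sysfs directory.
 * Only the cgroup v2 entry, which has the form "0::<path>", is accepted.
 * Assumes the v2 hierarchy is mounted at /sys/fs/cgroup.
 * Returns 0 on success, -1 if the line is not a v2 entry or does not fit in out. */
static int
cgroup_v2_dir_from_line(const char *line, char *out, size_t out_len)
{
    const char *path;
    size_t len;
    int n;

    if (strncmp(line, "0::", 3) != 0) {
        return -1;
    }
    path = line + 3;

    /* Stop at the trailing newline, if there is one. */
    len = strcspn(path, "\n");

    /* The root cgroup is reported as "/"; avoid a trailing slash. */
    if (len == 1 && path[0] == '/') {
        len = 0;
    }

    n = snprintf(out, out_len, "/sys/fs/cgroup%.*s", (int)len, path);
    if (n < 0 || (size_t)n >= out_len) {
        return -1;
    }
    return 0;
}
```

Inside a container the entry is usually just "0::/", which resolves to /sys/fs/cgroup itself.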

Anyway, within the container it's easy - just read /sys/fs/cgroup/memory.max and /sys/fs/cgroup/memory.high and see what they say.

I think that from within ns-slapd the same will be true for the system limits, as the cgroup should put the current limits for the container into the root.
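As the transcripts show, memory.max and memory.high hold either the literal string max or a byte count, so a reader has to handle both forms. The following is a sketch of that parsing step only; parse_cgroup_v2_limit is a hypothetical name and this is not the code from the pull request.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse the text of a cgroup v2 memory.max or memory.high file.
 * Returns 0 and stores the byte limit in *out when a number was read.
 * Returns 1 when the file says "max", meaning no limit is set.
 * Returns -1 when the text is malformed. */
static int
parse_cgroup_v2_limit(const char *text, uint64_t *out)
{
    char *end = NULL;
    unsigned long long value;

    if (strncmp(text, "max", 3) == 0 && (text[3] == '\n' || text[3] == '\0')) {
        return 1;
    }

    /* strtoull accepts signs and leading spaces, so require a digit first. */
    if (text[0] < '0' || text[0] > '9') {
        return -1;
    }

    errno = 0;
    value = strtoull(text, &end, 10);
    if (errno != 0 || end == text || (*end != '\n' && *end != '\0')) {
        return -1;
    }

    *out = (uint64_t)value;
    return 0;
}
```

With the podman case above, memory.max would yield 536870912 and memory.high would report no limit, so a caller might reasonably use the smaller of whichever values are set.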

I might start working on an initial patch now.

https://pagure.io/389-ds-base/pull-request/50885

Starting to work on this here, but it's not ready for review yet.

Metadata Update from @firstyear:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

2 years ago

Besides the alarming messages in the error log, it can prevent DS from starting and IIRC can crash DS -> Priority

Metadata Update from @tbordaz:
- Issue priority set to: critical

2 years ago

There's a lot of chatter in the errors log now:

[24/Feb/2020:15:13:58.739943389 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.742666844 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.745476335 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.748285674 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.750973383 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.753678243 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.759827616 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.762507500 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service

@firstyear Any objection to me changing this log level to a debug log level?

Metadata Update from @mreynolds:
- Issue status updated to: Open (was: Closed)

2 years ago

Also need to fix compiler warning:

../389-ds-base/ldap/servers/slapd/slapi_pal.c: In function ‘spal_meminfo_get’:
../389-ds-base/ldap/servers/slapd/slapi_pal.c:349:59: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
349 | slapi_log_err(SLAPI_LOG_CRIT, "spal_meminfo_get", "Your system is reporting %" PRIu64" bytes available, which is less than the minimum recommended %" PRIu64 " bytes\n",
| ^~~~
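This kind of warning usually means a PRIu64 conversion is being handed a plain int, for example an unsuffixed constant. A common fix is to cast the argument to uint64_t or to give the constant that type. The sketch below only illustrates the pattern: MIN_RECOMMENDED_BYTES and its value, and the function format_low_memory_msg, are made up for the example and are not the names used in slapi_pal.c.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical constant with a made-up value. Being an unsuffixed literal,
 * it has type int, which is what triggers -Wformat with PRIu64. */
#define MIN_RECOMMENDED_BYTES 16777216

/* Hypothetical function: formats the low-memory message into buf.
 * The explicit cast makes the fifth argument match its PRIu64 conversion. */
static int
format_low_memory_msg(char *buf, size_t len, uint64_t available)
{
    return snprintf(buf, len,
                    "Your system is reporting %" PRIu64 " bytes available, "
                    "which is less than the minimum recommended %" PRIu64 " bytes\n",
                    available, (uint64_t)MIN_RECOMMENDED_BYTES);
}
```

Defining the constant with UINT64_C(16777216) instead would remove the need for a cast at each call site.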

Nope, no objection :)

I'm doing this now, @mreynolds; will PR soon.

Metadata Update from @firstyear:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

2 years ago

389-ds-base is moving from Pagure to GitHub. This means that new issues and pull requests
will be accepted only in 389-ds-base's GitHub repository.

This issue has been cloned to GitHub and is available here:
- https://github.com/389ds/389-ds-base/issues/3673

If you want to receive further updates on the issue, please navigate to the GitHub issue
and click the subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: fixed)

a year ago
