Ticket was cloned from Red Hat Bugzilla (product Fedora): Bug 1754451
Description of problem:
Fedora 31 switched to cgroups v2 by default (https://fedoraproject.org/wiki/Changes/CGroupsV2) and it seems 389-ds still tries to use v1.

Version-Release number of selected component (if applicable):
rpm -q 389-ds-base
389-ds-base-1.4.1.7-1.fc31.x86_64

How reproducible:
Deterministic

Steps to Reproduce:
1. # install and configure 389-ds on default f31 (freeIPA can be used as well)
2. # restart service and check errors reported by 389-ds
3. journalctl --boot -u dirsrv@TESTRELM-TEST.service | grep ERR | grep "/cgroup/"

Actual results:
Sep 23 05:46:10 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:10.139146386 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:10 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:10.141847902 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:10 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:10.143083444 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.166850637 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.168268079 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.169192166 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.171585443 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.172463685 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.173283725 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.174156213 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.174966221 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.175851582 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.176852584 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.177723911 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.178553805 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.179445679 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.180277361 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.181099980 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.181969820 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.182796747 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.183646082 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.184519478 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.185346321 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.limit_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.186158473 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.usage_in_bytes". errno=2
Sep 23 05:46:11 host.example.com ns-slapd[31454]: [23/Sep/2019:05:46:11.192186822 -0400] - ERR - _spal_get_uint64_t_file - Unable to open file "/sys/fs/cgroup/memory/memory.soft_limit_in_bytes". errno=2

Expected results:
No cgroups related errors.

Additional info:
Metadata Update from @mreynolds: - Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1754451
We probably need to read the v2 as well, since we don't know if a system is on v1 or v2.
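One way to tell the two layouts apart at runtime is sketched below. It relies on the fact that a cgroup v2 (unified) mount exposes a cgroup.controllers file at its root, while a v1 layout has per-controller directories such as memory/. The helper name looks_like_cgroup_v2 is hypothetical and is not taken from the 389-ds source; treat this as a minimal sketch rather than the approach the eventual patch uses.

```c
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not the 389-ds code: returns true when cgroup_root
 * looks like a cgroup v2 (unified) mount. On v2 the root of the hierarchy
 * exposes a cgroup.controllers file; a v1 layout has per-controller
 * directories such as memory/ instead. */
static bool
looks_like_cgroup_v2(const char *cgroup_root)
{
    char path[4096];
    int len = snprintf(path, sizeof(path), "%s/cgroup.controllers", cgroup_root);
    if (len < 0 || (size_t)len >= sizeof(path)) {
        return false; /* path did not fit, treat as "not v2" */
    }
    return access(path, R_OK) == 0;
}
```

A caller would typically pass "/sys/fs/cgroup" and fall back to the existing v1 file names when the function returns false.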
Metadata Update from @firstyear: - Custom field origin adjusted to None - Custom field reviewstatus adjusted to None
Metadata Update from @mreynolds: - Issue set to the milestone: 1.4.2 (was: 0.0 NEEDS_TRIAGE)
Cgroup-v2: https://www.kernel.org/doc/html/v5.5/admin-guide/cgroup-v2.html#memory
Reading that document we probably want to attempt to read memory.high. I should see if suse has swapped to cgroup v2 ...
Okay, so to help out since I wrote the original cgroup code, I've set up a Fedora 31 VM (gasp) and tested this configuration.
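There are really 4 major cases:
- dirsrv on Fedora 31 (systemd) with no memory limit
- dirsrv on Fedora 31 (systemd) with a memory limit
- dirsrv in a podman container with no memory limit
- dirsrv in a podman container with a memory limit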
So, testing these combinations I can see:
- dirsrv on fedora 31 with no limits:
[root@localhost cgroup]# cat /proc/30061/cgroup
0::/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[root@localhost cgroup]# cat /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service/memory.max
max
[root@localhost cgroup]# cat /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service/memory.high
max

- dirsrv on fedora 31 with 512M limit
[root@localhost cgroup]# cat /proc/30061/cgroup
0::/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[root@localhost cgroup]# cat /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service/memory.max
max
[root@localhost cgroup]# cat /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service/memory.high
max

- dirsrv with podman no limit
[root@ee8e269a8360 cgroup]# cat /sys/fs/cgroup/memory.max
max
[root@ee8e269a8360 cgroup]# cat /sys/fs/cgroup/memory.high
max

- dirsrv with podman with 512M limit
[root@07fa546c6d43 cgroup]# cat memory.max
536870912
[root@07fa546c6d43 cgroup]# cat memory.high
max
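The mapping from the /proc/<pid>/cgroup line to the directory that holds memory.max, as seen in the output above, can be sketched as follows. The function name cgroup_v2_dir_from_line is hypothetical, and the /sys/fs/cgroup mount point is an assumption that matches the paths shown here but is not guaranteed on every system.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: given one line from /proc/<pid>/cgroup, build the
 * matching directory under the cgroup v2 mount point (assumed to be
 * /sys/fs/cgroup). Returns 0 on success, -1 if the line is not a v2
 * ("0::") entry or the result does not fit in out. */
static int
cgroup_v2_dir_from_line(const char *line, char *out, size_t out_len)
{
    const char *prefix = "0::";
    size_t prefix_len = strlen(prefix);
    if (strncmp(line, prefix, prefix_len) != 0) {
        return -1; /* v1 entries look like "5:memory:/..." */
    }
    const char *rel = line + prefix_len;
    size_t rel_len = strcspn(rel, "\n"); /* drop the trailing newline */
    int n = snprintf(out, out_len, "/sys/fs/cgroup%.*s", (int)rel_len, rel);
    if (n < 0 || (size_t)n >= out_len) {
        return -1; /* output buffer too small */
    }
    return 0;
}
```

For the systemd case above, the input line "0::/system.slice/system-dirsrv.slice/dirsrv@localhost.service" yields the directory whose memory.max and memory.high were read with cat.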
So I'm already curious about the fact that the systemd memory limits don't really appear to work ....
Anyway, within the container it's easy - just read /sys/fs/cgroup/memory.max and /sys/fs/cgroup/memory.high and see what they say.
I think that from within ns-slapd the same will be true for the system limits, as the cgroup should put the current limits for the container into the root.
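Since memory.max and memory.high hold either the literal string "max" or a byte count, whatever reads them has to handle both forms. The sketch below shows one way to do that; parse_cgroup_v2_limit is a hypothetical name, and this is not the code from the pull request.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: interpret the contents of memory.max or memory.high.
 * Returns true and stores the byte count in *limit when a numeric limit is
 * set; returns false for "max" (no limit) or for unparsable input. */
static bool
parse_cgroup_v2_limit(const char *text, uint64_t *limit)
{
    if (strcmp(text, "max") == 0 || strcmp(text, "max\n") == 0) {
        return false; /* no limit configured */
    }
    /* strtoull accepts a leading '-' and wraps, so insist on a digit. */
    if (text[0] < '0' || text[0] > '9') {
        return false;
    }
    errno = 0;
    char *end = NULL;
    unsigned long long value = strtoull(text, &end, 10);
    if (errno != 0 || end == text) {
        return false;
    }
    if (*end != '\0' && *end != '\n') {
        return false; /* trailing garbage after the number */
    }
    *limit = (uint64_t)value;
    return true;
}
```

With the podman 512M case above, the input "536870912\n" produces a limit of 536870912 bytes, while "max\n" reports that no limit is set.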
I might start working on an initial patch now.
https://pagure.io/389-ds-base/pull-request/50885
Starting to work on this here, but it's not ready to review yet.
Metadata Update from @firstyear: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
Besides the alarming messages in the error log, this can prevent DS from starting and, IIRC, can crash DS -> Priority
Metadata Update from @tbordaz: - Issue priority set to: critical
1671dc0..0e6a04a master
cdd6267..1c3db9c 389-ds-base-1.4.2
There's a lot of chatter in the errors log now:
[24/Feb/2020:15:13:58.739943389 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.742666844 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.745476335 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.748285674 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.750973383 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.753678243 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.759827616 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
[24/Feb/2020:15:13:58.762507500 -0500] - INFO - spal_meminfo_get - Found cgroup v2 -> /sys/fs/cgroup/system.slice/system-dirsrv.slice/dirsrv@localhost.service
@firstyear Any objection to me changing this log level to a debug log level?
Metadata Update from @mreynolds: - Issue status updated to: Open (was: Closed)
Also need to fix compiler warning:
../389-ds-base/ldap/servers/slapd/slapi_pal.c: In function ‘spal_meminfo_get’:
../389-ds-base/ldap/servers/slapd/slapi_pal.c:349:59: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
  349 | slapi_log_err(SLAPI_LOG_CRIT, "spal_meminfo_get", "Your system is reporting %" PRIu64" bytes available, which is less than the minimum recommended %" PRIu64 " bytes\n",
      | ^~~~
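The warning says that the second PRIu64 conversion (argument 5 of the call) receives an int. A minimal sketch of the usual remedy follows: make that argument a uint64_t, here with a cast. The macro EXAMPLE_MIN_BYTES, its value, and the function format_low_memory_msg are hypothetical stand-ins; snprintf is used in place of slapi_log_err so the example stands alone, and the actual fix in the tree may differ.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical threshold for illustration only; not the value 389-ds uses.
 * An expression like this has type int, which is what triggers -Wformat. */
#define EXAMPLE_MIN_BYTES (16 * 1024 * 1024)

static int
format_low_memory_msg(char *buf, size_t buf_len, uint64_t available)
{
    /* Every PRIu64 conversion needs a uint64_t argument, so the int
     * constant is cast explicitly instead of being passed as-is. */
    return snprintf(buf, buf_len,
                    "Your system is reporting %" PRIu64 " bytes available, "
                    "which is less than the minimum recommended %" PRIu64 " bytes\n",
                    available, (uint64_t)EXAMPLE_MIN_BYTES);
}
```

Declaring the threshold as a uint64_t constant in the first place would avoid the cast and keep every call site consistent.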
Nope, no objection :)
I'm doing this now, @mreynolds; will PR soon.
https://pagure.io/389-ds-base/pull-request/50913
Clean ups completed :)
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/3673
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: fixed)