#527 Extending available file descriptors within CBS
Closed: Fixed with Explanation 2 years ago by arrfab. Opened 2 years ago by sbonazzo.

Within the oVirt project, we're in the process of moving the project's main RPM delivery to the CentOS Virt SIG repos.
We hit an issue while building ovirt-engine on CBS:

Building locales requires more than 10240 available file descriptors, currently 1024

The ovirt-engine build compiles GWT permutations for multiple languages and browsers, and ends up opening about 10k files at the same time. The 1024 limit is therefore a blocker for building oVirt engine in CBS. Is it possible to raise the file descriptor limit to 10240?


Hmm, interesting, but this also impacts the infra. Let's see if we can find the easiest way to modify mock/koji. If you already have a pointer, that will help :)

Perhaps

config_opts['files']['path/name/no/leading/slash'] = "put file contents here."

option in site-defaults.cfg can be used for something similar to:

config_opts['files']['etc/security/limits.d/10-nofile.conf'] = """
mock hard nofile 10240
mock soft nofile 10240
"""

I didn't check whether it works. It would probably be easier to ask the copr maintainers how they did it, if you have contacts there.

I had a quick look, and in your own build logs and the configs exposed there I found this:

# https://pagure.io/copr/copr/issue/1211
config_opts['nspawn_args'] += ['--rlimit=RLIMIT_NOFILE=10240']

So copr switched to systemd-nspawn, which is probably also the case for koji on el8s, but koji doesn't let us easily modify the mock config files, as they are generated on the fly. I'll have a look next week and we can possibly try a scratch build.

Metadata Update from @arrfab:
- Issue tagged with: cbs, high-gain, medium-trouble

2 years ago

Quick status update: our builders are using higher limits, but I don't know why 1024 is still reported inside the chroot for your builds.
When I try to verify this myself with a chroot through mock (using the same one as you):

mock -r /etc/mock/virt8s-test.cfg --isolation=simple shell
INFO: mock.py version 2.8 starting (python version = 3.6.8, NVR = mock-2.8-1.el8)...
Start(bootstrap): init plugins
INFO: selinux enabled
Finish(bootstrap): init plugins
Start: init plugins
INFO: selinux enabled
Finish: init plugins
INFO: Signal handler active
Start: run
Start(bootstrap): chroot init
INFO: calling preinit hooks
INFO: enabled HW Info plugin
Finish(bootstrap): chroot init
Start: chroot init
INFO: calling preinit hooks
INFO: enabled HW Info plugin
Finish: chroot init
Start: shell
<mock-chroot> sh-4.4# ulimit -n
10240
<mock-chroot> sh-4.4# su - mockbuild
Last login: Mon Nov 22 13:59:32 UTC 2021
[root@kojid-x86-02 ~]# ulimit -n
10240
[root@kojid-x86-02 ~]# exit
logout
<mock-chroot> sh-4.4# exit
logout
Finish: shell

So I'm trying to see why your own build script reports 1024.

Metadata Update from @arrfab:
- Issue assigned to arrfab

2 years ago

The Makefile just executes ulimit -n to check that number.
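
For anyone who wants to reproduce that check outside the Makefile, here is a minimal Python sketch of the same test. It is not taken from the ovirt-engine sources: the function names are made up for illustration, and only the 10240 threshold comes from the error message quoted at the top of this ticket. It reads the soft limit, which is the value `ulimit -n` prints by default.

```python
import resource

REQUIRED_NOFILE = 10240  # threshold from the build error quoted in this ticket


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE pair for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_enough_descriptors(required=REQUIRED_NOFILE):
    """True when the soft limit (what `ulimit -n` reports) meets the requirement."""
    soft, _hard = nofile_limits()
    # RLIM_INFINITY means "no limit", so it always satisfies the requirement.
    return soft == resource.RLIM_INFINITY or soft >= required


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft={soft} hard={hard} ok={has_enough_descriptors()}")
```

Running this inside the build chroot and on the builder host should show whether the two environments really see different limits.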

Let me close it as it's now pushed and working.

Explanation: our builders were already using higher values, but after the upgrade to 8-stream, and so to a newer mock, the default isolation is now auto, which means systemd-nspawn is used (from the initial site-defaults.cfg config: config_opts['isolation'] = 'auto').
As Koji generates the mock config files on the fly, there is no easy way to override values there, but deploying a system-wide change fixes the issue for nspawn isolation mode under koji.

I pushed the change (with the boolean turned to true for the cbs kojid builders):
https://github.com/CentOS/ansible-role-kojid/commit/af8d35c7ce12193a6f645b37580b519074a0a17e
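
The linked commit is not reproduced here. As a rough sketch only, assuming the system-wide change mirrors the copr setting quoted earlier in this ticket, an override in the system-wide mock config could look like the fragment below. mock evaluates its config files as Python with config_opts already defined; the first assignment is a stand-in so the fragment runs on its own, and it must not be copied into a real config file.

```python
# Stand-in for the dictionary mock provides when it evaluates its config files.
# Illustration only: in a real site-defaults.cfg this line would wipe the defaults.
config_opts = {"nspawn_args": []}

# Same argument as the copr config quoted above: raise the open-files limit
# for everything started inside the systemd-nspawn container.
config_opts["nspawn_args"] += ["--rlimit=RLIMIT_NOFILE=10240"]
```

Because the setting lives in the system-wide defaults rather than in the per-build config that Koji generates, it should apply to every build on that builder.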

It was pushed live, so it's now working, and I submitted a scratch build:
https://cbs.centos.org/koji/taskinfo?taskID=2609104

As you can see, it failed, but for another reason this time (from https://cbs.centos.org/kojifiles/work/tasks/9130/2609130/build.log):

Downloading from central: https://repo1.maven.org/maven2/com/google/gwt/gwt/2.9.0/gwt-2.9.0.pom
Downloading from jboss-public-repository-group: https://repository.jboss.org/nexus/content/groups/public-jboss/com/google/gwt/gwt/2.9.0/gwt-2.9.0.pom
Downloading from atlassian.public.repo: https://maven.atlassian.com/content/groups/public/com/google/gwt/gwt/2.9.0/gwt-2.9.0.pom
[ERROR] [ERROR] Some problems were encountered while processing the POMs:
[ERROR] Non-resolvable import POM: Could not transfer artifact com.google.gwt:gwt:pom:2.9.0 from/to central (https://repo1.maven.org/maven2): transfer failed for https://repo1.maven.org/maven2/com/google/gwt/gwt/2.9.0/gwt-2.9.0.pom @ line 319, column 19
 @ 
[ERROR] The build could not read 1 project -> [Help 1]

That's a valid reason to fail: nothing should download content from the internet, or from anywhere other than within the build environment, so everything must be available in the src.rpm (and so from the SCM source where it's stored).

If it works on COPR, that's because the builders there have internet access and let you use it at build time (a flag you can turn on, but it's a bad idea). So you first have to either package these dependencies as RPMs, or at least create a tarball, push it to the lookaside, and so have it available from within the src.rpm.
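
As an illustration of the tarball route, here is a minimal Python sketch. It assumes the dependencies were already fetched into a local directory on a machine with network access; the function names and the directory name used in the usage note are hypothetical. Pushing the result to the lookaside cache and referencing it from the spec file are not shown.

```python
import tarfile
from pathlib import Path


def bundle_dependencies(repo_dir, tarball_path):
    """Pack a pre-fetched dependency directory into a gzip-compressed tarball.

    Write the tarball outside repo_dir so that it does not end up inside itself.
    """
    repo_dir = Path(repo_dir)
    with tarfile.open(tarball_path, "w:gz") as tar:
        # arcname stores members relative to the directory name, not as absolute paths.
        tar.add(repo_dir, arcname=repo_dir.name)
    return Path(tarball_path)


def list_bundle(tarball_path):
    """Return the sorted member names of the tarball, for a quick sanity check."""
    with tarfile.open(tarball_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

For example, bundle_dependencies("maven-repo", "ovirt-engine-deps.tar.gz") would produce an archive whose members all start with maven-repo/, which the build can then unpack and use offline.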

Metadata Update from @arrfab:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)

2 years ago

Thanks, I'm working on the next issue.
