Within the oVirt project, we're in the process of moving the project's main RPM delivery to the CentOS Virt SIG repos. We hit an issue while building ovirt-engine on CBS:
Building locales requires more than 10240 available file descriptors, currently 1024
The ovirt-engine build compiles GWT permutations for multiple languages and browsers, and it ends up opening about 10k files at the same time. So the 1024 limit blocks us from building oVirt engine in CBS. Is it possible to raise the file descriptor limit to 10240?
For reference: https://cbs.centos.org/kojifiles/work/tasks/6621/2606621/build.log
Also for reference, builds in COPR are passing: https://copr.fedorainfracloud.org/coprs/ovirt/ovirt-master-snapshot/package/ovirt-engine/
Hmm, interesting, but it also impacts the infra. Let's see if we can find the easiest way to modify mock/koji. If you already have a pointer, that will help :)
Perhaps the
config_opts['files']['path/name/no/leading/slash'] = "put file contents here."
option in site-defaults.cfg could be used for something like:
config_opts['files']['etc/security/limits.d/10-nofile.conf'] = """
mock hard nofile 10240
mock soft nofile 10240
"""
I didn't check whether it works. I guess it would be easier to ask the COPR maintainers how they did it, if you have contacts there.
I had a quick look, and in your own build logs and the configs exposed there I found this:
# https://pagure.io/copr/copr/issue/1211
config_opts['nspawn_args'] += ['--rlimit=RLIMIT_NOFILE=10240']
So COPR switched to systemd-nspawn, which is probably also the case for koji and el8s, but koji doesn't easily let us modify the mock config files, as they are generated on the fly. I'll look at it next week, and eventually we can try a scratch build.
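As a rough illustration of what that COPR line does: mock config files are Python, and mock supplies the `config_opts` dictionary when it evaluates them. The sketch below stands in a plain dict for it so the snippet can run on its own; the `add_nofile_rlimit` helper and the empty starting list are assumptions made for the example, not part of mock or koji.

```python
# Stand-in for the dictionary that mock provides to its config files.
# Assumption: the real default list may already contain other arguments.
config_opts = {"nspawn_args": []}

NOFILE_LIMIT = 10240  # value requested in this ticket


def add_nofile_rlimit(opts, limit=NOFILE_LIMIT):
    """Append the systemd-nspawn rlimit argument once (hypothetical helper)."""
    arg = f"--rlimit=RLIMIT_NOFILE={limit}"
    if arg not in opts["nspawn_args"]:
        opts["nspawn_args"].append(arg)
    return opts


add_nofile_rlimit(config_opts)
print(config_opts["nspawn_args"])
```

In a real mock config the single `config_opts['nspawn_args'] += [...]` line quoted above is enough; the helper only guards against adding the argument twice.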
thanks!
Metadata Update from @arrfab: - Issue tagged with: cbs, high-gain, medium-trouble
Quick status update: our builders are using higher limits, but I don't know why 1024 is still reported inside the chroot for your builds. I tried to verify it myself with a chroot through mock (using the same setup as you):
mock -r /etc/mock/virt8s-test.cfg --isolation=simple shell
INFO: mock.py version 2.8 starting (python version = 3.6.8, NVR = mock-2.8-1.el8)...
Start(bootstrap): init plugins
INFO: selinux enabled
Finish(bootstrap): init plugins
Start: init plugins
INFO: selinux enabled
Finish: init plugins
INFO: Signal handler active
Start: run
Start(bootstrap): chroot init
INFO: calling preinit hooks
INFO: enabled HW Info plugin
Finish(bootstrap): chroot init
Start: chroot init
INFO: calling preinit hooks
INFO: enabled HW Info plugin
Finish: chroot init
Start: shell
<mock-chroot> sh-4.4# ulimit -n
10240
<mock-chroot> sh-4.4# su - mockbuild
Last login: Mon Nov 22 13:59:32 UTC 2021
[root@kojid-x86-02 ~]# ulimit -n
10240
[root@kojid-x86-02 ~]# exit
logout
<mock-chroot> sh-4.4# exit
logout
Finish: shell
So I'm trying to see why your own build script reports 1024.
Metadata Update from @arrfab: - Issue assigned to arrfab
Makefile just executes ulimit -n to check that number.
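For anyone who wants to reproduce that check outside the Makefile, here is a minimal Python sketch of the same test using the standard `resource` module. `ulimit -n` prints the soft limit, so that is the value compared; the function names and the `REQUIRED_NOFILE` constant are made up for this example and do not come from the ovirt-engine build.

```python
import resource

REQUIRED_NOFILE = 10240  # threshold taken from the build error message


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE values of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_enough_descriptors(required=REQUIRED_NOFILE):
    """Check the soft limit, which is what `ulimit -n` reports."""
    soft, _hard = nofile_limits()
    # RLIM_INFINITY means "no limit", so it always satisfies the requirement.
    return soft == resource.RLIM_INFINITY or soft >= required


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft={soft} hard={hard} enough={has_enough_descriptors()}")
```

Running it inside the build chroot (for example from a mock shell) should show the same number the build script sees.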
Let me close it as it's now pushed and working.
Explanation: our builders were already using higher values, but after the upgrade to 8-stream, and therefore to a newer mock, the default isolation is now auto, which in practice means systemd-nspawn is used (from the initial site-defaults.cfg config: config_opts['isolation'] = 'auto'). Since Koji generates the mock config files on the fly, there is no easy way to override values there, but deploying a system-wide change fixes the issue for the nspawn isolation mode under Koji.
I pushed the change (with boolean turned to true for cbs kojid builders) : https://github.com/CentOS/ansible-role-kojid/commit/af8d35c7ce12193a6f645b37580b519074a0a17e
It was pushed live so it's now working and I submitted a scratch build : https://cbs.centos.org/koji/taskinfo?taskID=2609104
As you can see, it failed, but for another reason this time (from https://cbs.centos.org/kojifiles/work/tasks/9130/2609130/build.log):
Downloading from central: https://repo1.maven.org/maven2/com/google/gwt/gwt/2.9.0/gwt-2.9.0.pom
Downloading from jboss-public-repository-group: https://repository.jboss.org/nexus/content/groups/public-jboss/com/google/gwt/gwt/2.9.0/gwt-2.9.0.pom
Downloading from atlassian.public.repo: https://maven.atlassian.com/content/groups/public/com/google/gwt/gwt/2.9.0/gwt-2.9.0.pom
[ERROR] [ERROR] Some problems were encountered while processing the POMs:
[ERROR] Non-resolvable import POM: Could not transfer artifact com.google.gwt:gwt:pom:2.9.0 from/to central (https://repo1.maven.org/maven2): transfer failed for https://repo1.maven.org/maven2/com/google/gwt/gwt/2.9.0/gwt-2.9.0.pom @ line 319, column 19
@
[ERROR] The build could not read 1 project -> [Help 1]
That's a valid reason to fail: nothing should download content from the internet, or from anywhere outside the build environment. Everything has to be available in the src.rpm (and therefore from the SCM source where it's stored).
If it works on COPR, that's because the builders there have internet access and let you use it at build time (it's a flag you can turn on, but it's a bad idea). So you first have to either package these dependencies as RPMs, or at least create a tarball and push it to lookaside so that it's available from within the src.rpm.
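The tarball option could look roughly like the sketch below, which packs a directory of pre-fetched dependencies into a gzip archive that can then be pushed to lookaside by whatever tooling you normally use. The `bundle_dependencies` name and the directory layout are hypothetical; how the build is pointed at the unpacked content depends on the package and is not shown here.

```python
import tarfile
from pathlib import Path


def bundle_dependencies(repo_dir, tarball_path):
    """Pack a directory of pre-fetched dependencies into a .tar.gz file.

    Hypothetical helper: repo_dir is any local directory that already
    contains the artifacts the build would otherwise download.
    """
    repo_dir = Path(repo_dir)
    with tarfile.open(tarball_path, "w:gz") as tar:
        # Store paths relative to the directory name, not the absolute path.
        tar.add(repo_dir, arcname=repo_dir.name)
    return Path(tarball_path)
```

The resulting archive would be listed as an extra source in the spec file so that it ends up inside the src.rpm.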
Metadata Update from @arrfab: - Issue close_status updated to: Fixed with Explanation - Issue status updated to: Closed (was: Open)
Thanks, I'm working on the next issue.