#6602 bodhi masher failed to mash f25-updates-testing
Closed: Fixed 7 years ago Opened 7 years ago by parasense.

Bodhi failed the f25-updates-testing push due to errors around ostree. There appears to be an updated ostree in the f25 updates-testing push, which is a suspect but not verified.

I'm attaching the log file for your analysis:

/srv/fedora-atomic/logs/25/x86_64/updates-testing/docker-host/f25-updates-testing-170125.0315
(attached)

Also giving the mock root.log if that helps:
/var/lib/mock/fedora-25-updates-testing-x86_64/result/root.log

root.log


This is the fatal error:

2017-01-25 03:17:08,168 -  INFO - composer.py:236 - Can't create loopback device

Which comes from bubblewrap. Offhand, my best guess is that newer systemd introduced more seccomp filtering, probably denying PF_NETLINK or raw sockets.

What are the versions of mock and systemd involved here?

rpm -q mock systemd

mock-1.3.3-1.fc25.noarch
systemd-231-12.fc25.x86_64

So I ran the systemd-nspawn manually and got the same error:


[root@bodhi-backend01 ~][PROD]# /usr/bin/systemd-nspawn -q -M 0060dfca002248b8ad3ac734bf76fd53 -D /var/lib/mock/fedora-25-updates-testing-x86_64/root --setenv=HOME=/builddir --setenv=TERM=vt100 --setenv=SHELL=/bin/bash --setenv=PS1='<mock-chroot> \s-\v\$' --setenv=LANG=en_US.UTF-8 --setenv=PATH=/usr/bin:/bin:/usr/sbin:/sbin --setenv=PROMPT_COMMAND='printf "\033]0;<mock-chroot>\007"' --setenv=HOSTNAME=mock /bin/sh -c "/usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/fedora-atomic/25/x86_64/docker-host /var/tmp/tmpfRsYiP/fedora-atomic.git/treefile.json"
Can't create loopback device
error: bwrap test failed, see https://github.com/projectatomic/rpm-ostree/pull/429: Executing bwrap(true): Child process exited with code 1

The loop module was not loaded, so loaded that...


[root@bodhi-backend01 ~][PROD]# modprobe loop
[root@bodhi-backend01 ~][PROD]# lsmod
Module Size Used by
loop 28672 0

The systemd-nspawn had the same result, so I ran a manual loop-dev test case:


[root@bodhi-backend01 ~][PROD]# truncate -s 100M /tmp/foo.parasense
[root@bodhi-backend01 ~][PROD]# losetup /tmp/foo.parasense
losetup: /tmp/foo.parasense: failed to use device: No such device

The loop module is for filesystems, not network. It's not related.

OK, with https://github.com/projectatomic/bubblewrap/pull/166 I get:

# /srv/walters/src/github/projectatomic/bubblewrap/bwrap --unshare-net  --bind / / true
loopback: Failed RTM_NEWADDR: No child processes

Hm, not quite right since it looks like the error is from netlink, not a syscall. Still looking.

OK, this is because nspawn is dropping CAP_NET_ADMIN, which we need only to create our own network, independent of the host. The quick fix here is to add:

--capability=CAP_NET_ADMIN to the arguments for systemd-nspawn. Can someone do that patch for fedmsg-atomic-composer?

I ran the --capability=CAP_NET_ADMIN manually, but it gave an error.


[root@bodhi-backend01 ~][PROD]# /usr/bin/systemd-nspawn -q -M 0060dfca002248b8ad3ac734bf76fd53 -D /var/lib/mock/fedora-25-updates-testing-x86_64/root --capability=CAP_NET_ADMIN --setenv=HOME=/builddir --setenv=TERM=vt100 --setenv=SHELL=/bin/bash --setenv=PS1='<mock-chroot> \s-\v\$' --setenv=LANG=en_US.UTF-8 --setenv=PATH=/usr/bin:/bin:/usr/sbin:/sbin --setenv=PROMPT_COMMAND='printf "\033]0;<mock-chroot>\007"' --setenv=HOSTNAME=mock /bin/sh -c "/usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/fedora-atomic/25/x86_64/docker-host /var/tmp/tmpfRsYiP/fedora-atomic.git/treefile.json"
bwrap: ../sysdeps/nptl/fork.c:156: __libc_fork: Assertion `THREAD_GETMEM (self, tid) != ppid' failed.
error: bwrap test failed, see https://github.com/projectatomic/rpm-ostree/pull/429: Executing bwrap(true): Child process exited with code 134

I'll take a stab at patching fedmsg-atomic-composer to see if that works better.

Okay, resumed the push with the following patch, it's running now so fingers crossed.


diff --git a/lib/python2.7/site-packages/fedmsg_atomic_composer/composer.py.orig b/lib/python2.7/site-packages/fedmsg_atomic_composer/composer.py
index dd12fad..396ea9f 100644
--- a/lib/python2.7/site-packages/fedmsg_atomic_composer/composer.py.orig
+++ b/lib/python2.7/site-packages/fedmsg_atomic_composer/composer.py
@@ -121,6 +121,7 @@ class AtomicComposer(object):
fmt = '{mock_cmd}'
if kwargs.get('new_chroot') is True:
fmt +=' --new-chroot'
+ fmt +=' --capability=CAP_NET_ADMIN'
fmt += ' --configdir={mock_dir}'
return self.call(fmt.format(**release).split()
+ list(cmd))

failed again. So I probably got the patch wrong.

Yeah, that patch isn't going to work; it's providing the argument to mock, not systemd-nspawn. From a quick glance at the mock source, it doesn't allow configuring the arguments provided to nspawn at all. Which is understandable, because mock is mostly focused on building RPMs, not on being an arbitrary container tool.
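To make the problem with that patch concrete, here is a minimal sketch of the string building it touches. The helper name build_mock_argv and the sample release values are hypothetical, and the indentation of the added line is an assumption (the pasted diff lost its indentation); only the concatenation order follows the diff.

```python
def build_mock_argv(release, cmd, new_chroot=True):
    """Mimic the patched mock_cmd(): build a format string, then split it."""
    fmt = '{mock_cmd}'
    if new_chroot:
        fmt += ' --new-chroot'
        # The line added by the patch. It lands among mock's own options.
        fmt += ' --capability=CAP_NET_ADMIN'
    fmt += ' --configdir={mock_dir}'
    return fmt.format(**release).split() + list(cmd)


# Hypothetical values, shaped like the command seen in the composer log.
release = {
    'mock_cmd': '/usr/bin/mock -r fedora-25-updates-testing-x86_64',
    'mock_dir': '/var/tmp/example/mock',
}
argv = build_mock_argv(
    release, ['--chroot', '/usr/bin/rpm-ostree compose tree treefile.json'])
print(argv)
```

Everything in that list is parsed by mock itself, so the capability flag is never forwarded to systemd-nspawn.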

I filed https://github.com/projectatomic/rpm-ostree/issues/591 for something we can do in rpm-ostree but it's not going to happen immediately.

The only quick hack I can think of is to replace /usr/bin/systemd-nspawn with a wrapper like:

#!/usr/bin/bash
set -euo pipefail
# Quote "$@" so arguments containing spaces (e.g. the --setenv values) survive intact.
if test -n "${FEDMSG_ATOMIC_COMPOSER:-}"; then
  exec systemd-nspawn.real --capability=CAP_NET_ADMIN "$@"
else
  exec systemd-nspawn.real "$@"
fi

And mv /usr/bin/systemd-nspawn{,.real}.

And file a request in mock to get the ability to influence the nspawn arguments.

(And change the composer to set that environment variable)

Looks like mock might have an option for this...

rpmbuild_networking=True

which ends up passing '--private-network' to systemd-nspawn, which the systemd-nspawn man page describes as:

       --private-network
           Disconnect networking of the container from the host. This makes all network interfaces unavailable in the
           container, with the exception of the loopback device and those specified with --network-interface= and
           configured with --network-veth. If this option is specified, the CAP_NET_ADMIN capability will be added to
           the set of capabilities the container retains. The latter may be disabled by using --drop-capability=.

Will that work?

I tried the above wrapper, but I guess $FEDMSG_ATOMIC_COMPOSER didn't influence the wrapper because I didn't observe the --capability cmdline argument.

f25-updates-testing-170126.1527

So I tried to get mock to run with rpmbuild_networking=True
But that feature either doesn't work, or I'm making a mistake.
After reading the mock sources I see that option only works in the mock cfg, so I updated fedmsg_atomic_compose template mako file, and that propagated down to mock, but mock still doesn't seem to honour the configuration parameter. So I'm back to writing a wrapper for systemd-nspawn so it re-writes itself with --private-network

Here is what happens now:

ERROR: Command failed. See logs for output.
 # /usr/bin/systemd-nspawn -q -M 1165649d038e4cb0a60c1a70932a87c3 -D /var/lib/mock/fedora-25-updates-testing-x86_64/root --setenv=HOME=/builddir --setenv=HOSTNAME=mock --setenv=PATH=/usr/bin:/bin:/usr/sbin:/sbin --setenv=SHELL=/bin/bash --setenv=LANG=en_US.UTF-8 --setenv=TERM=vt100 --setenv=PROMPT_COMMAND=printf "\033]0;<mock-chroot>\007" --setenv=PS1=<mock-chroot> \s-\v\$  /usr/sbin/groupadd -g 135 mockbuild

f25-updates-testing-170126.1950

We don't want nspawn --private-network - then rpm-ostree would be unable to fetch packages via HTTP. Maybe it's simplest to skip the environment variable and just do:

#!/usr/bin/bash
set -euo pipefail
# Quote "$@" so arguments containing spaces are passed through unchanged.
exec systemd-nspawn.real --capability=CAP_NET_ADMIN "$@"

There's a PR for that issue open now, if you want to try using it. You'll need to update the config to use:

config_opts['nspawn_args'] = ['--capability=CAP_NET_ADMIN']
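For orientation, a rough sketch of where such a list would end up on the nspawn command line. This is not mock's actual code: build_nspawn_argv is a hypothetical helper, the machine id and environment are made up, and the ordering (extra arguments right after -D) is an assumption.

```python
def build_nspawn_argv(machine_id, root, nspawn_args, env, command):
    """Assemble a systemd-nspawn command line (hypothetical helper)."""
    argv = ['/usr/bin/systemd-nspawn', '-q', '-M', machine_id, '-D', root]
    argv += list(nspawn_args)  # e.g. the value of config_opts['nspawn_args']
    argv += ['--setenv={}={}'.format(k, v) for k, v in sorted(env.items())]
    argv += ['/bin/sh', '-c', command]
    return argv


config_opts = {'nspawn_args': ['--capability=CAP_NET_ADMIN']}
argv = build_nspawn_argv(
    'example-machine-id',
    '/var/lib/mock/fedora-25-updates-testing-x86_64/root',
    config_opts['nspawn_args'],
    {'HOME': '/builddir', 'HOSTNAME': 'mock'},
    '/usr/bin/rpm-ostree compose tree treefile.json',
)
print(' '.join(argv))
```

The point is only that the extra arguments must sit before the command that runs inside the container, otherwise nspawn would treat them as part of that command.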

Patch to Mock merged. Feel free to use patched mock from Copr https://copr.fedorainfracloud.org/coprs/g/mock/mock/

So I installed the above COPR version of mock, and it succeeded in passing the configured 'nspawn_args' over to systemd-nspawn, but that didn't fix the issue overall. To be clear, I had nspawn_args = ['--private-network'].

I've attached the latest log file showing what went wrong.

f25-updates-testing-170127.1749

Also, I was asked to remove the copr mock build because production infra. But I can say it looked good, and we should expedite that build in Fedora.

With the rpm-ostree-2017.1-3.fc25 in the updates-testing push it fails with:

DEBUG util.py:518: Executing command: ['/usr/bin/systemd-nspawn', '-q', '-M', '899becb0830b438d940478ea9829de97', '-D', '/var/lib/mock/fedora-25-updates-testing-x86_64/root', '--setenv=PROMPT_COMMAND=printf "\033]0;<mock-chroot>\007"', '--setenv=PS1=<mock-chroot> \s-\v\$ ', '--setenv=HOSTNAME=mock', '--setenv=TERM=vt100', '--setenv=SHELL=/bin/bash', '--setenv=PATH=/usr/bin:/bin:/usr/sbin:/sbin', '--setenv=HOME=/builddir', '--setenv=LANG=en_US.UTF-8', '/bin/sh', '-c', '/usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/fedora-atomic/25/x86_64/docker-host /var/tmp/tmpDug1E_/fedora-atomic.git/treefile.json'] with env {'PROMPT_COMMAND': 'printf "\033]0;<mock-chroot>\007"', 'PS1': '<mock-chroot> \s-\v\$ ', 'HOSTNAME': 'mock', 'TERM': 'vt100', 'SHELL': '/bin/bash', 'PATH': '/usr/bin:/bin:/usr/sbin:/sbin', 'HOME': '/builddir', 'LANG': 'en_US.UTF-8'} and shell False
DEBUG util.py:292: Unsharing. Flags: 134217728
DEBUG util.py:435: bwrap: ../sysdeps/nptl/fork.c:156: __libc_fork: Assertion `THREAD_GETMEM (self, tid) != ppid' failed.
DEBUG util.py:435: error: bwrap test failed, see https://github.com/projectatomic/rpm-ostree/pull/429: Executing bwrap(true): Child process exited with code 134
DEBUG util.py:573: Child return code was: 1

Regrettably we had to roll back to rpm-ostree-2016.3-1.fc25, and that allowed the f25-updates-testing push to proceed.

Hm, so that's an assertion inside glibc. The obvious potential culprit here is that bwrap is calling sys_clone behind glibc's back. But, it feels like this must either be a race condition, or a change in glibc, since I'm simply not reproducing it; scenario is

$ rpm -q glibc
glibc-2.24-4.fc25.x86_64
$ for x in $(seq 100); do bwrap --unshare-all --ro-bind / / sleep 1h & done

Gives me 100 containers every time. Does it reliably fail that way? Can you try running that for loop in the root? And can you get the updated root.log? Is there anything in there that's not in current updates-testing?

@walters, I think you should be able to reproduce this issue if you trigger PID reuse. After the clone call, glibc's cached view of the PID does not match the kernel PID. The assertion triggers if the original (pre-clone) PID is reused for the new child process.

The assertion is correct in the sense that glibc does not work properly if the PID cache is out of sync. In rawhide, this is fixed with the removal of the PID cache.
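As a toy illustration of that explanation (a model only, not glibc code): each process carries the PID the kernel assigned plus a cached copy. A raw clone leaves the cached copy stale, and a later fork through the C library trips the check if the new child happens to receive the stale value. All names and PID numbers here are hypothetical.

```python
class ToyProcess:
    """A process with its kernel PID and the C library's cached PID."""

    def __init__(self, kernel_pid, cached_pid):
        self.kernel_pid = kernel_pid
        self.cached_pid = cached_pid


def raw_clone(parent, next_pid):
    """Clone behind the C library's back: the cache is not refreshed."""
    return ToyProcess(kernel_pid=next_pid(), cached_pid=parent.cached_pid)


def libc_fork(parent, next_pid):
    """Fork through the C library, which refreshes the child's cache."""
    ppid = parent.cached_pid  # stale if parent came from raw_clone()
    new_pid = next_pid()
    child = ToyProcess(kernel_pid=new_pid, cached_pid=new_pid)
    # Models the "tid != ppid" check from the assertion message.
    if child.cached_pid == ppid:
        raise AssertionError('child PID equals the stale cached PID')
    return child


def demo(pids):
    """Run clone-then-fork with a scripted sequence of kernel PIDs."""
    alloc = iter(pids).__next__
    original = ToyProcess(100, 100)
    cloned = raw_clone(original, alloc)  # cached PID stays 100
    try:
        libc_fork(cloned, alloc)
        return 'ok'
    except AssertionError:
        return 'assertion failed'


print(demo([500, 501]))  # no PID reuse
print(demo([500, 100]))  # PID 100 is reused for the new child
```

In this model the failure needs the reused PID to land exactly on the stale value, which would fit it being hard to reproduce on an ordinary workstation.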

Okay, so I found the commit you're talking about in glibc. I filed https://github.com/projectatomic/bubblewrap/issues/171

BTW @msuchy We installed the new version of mock in production, and it works great! We were able to pass nspawn arguments from rpm-ostree down through mock to systemd-nspawn. This was a troubleshooting step, and we appreciate the fast help from you. Thanks.

@walters Is there anything else we can do to help solve this one?

How reproducible was the glibc assertion? I added what I believe should be a torture test to reproduce it here: https://github.com/projectatomic/bubblewrap/issues/171#issuecomment-277347685

No luck triggering the assertion for me either: not directly on my host workstation, not in a pet Docker container, and not in a fresh nspawn root.

Is it possible to try running that in a root like fedmsg-atomic-composer is doing it?

Maybe we can try re-adding rpm-ostree over the weekend and retry a push?

Kevin has put in a hotfix for this on the backend machines. Peter is going to run an update today without the new rpm-ostree in order to make sure the hotfix alone doesn't break things. If the update today is successful we'll submit the new rpm-ostree for update tomorrow and test that the new rpm-ostree runs through the update process just fine. If that succeeds we'll get the change back into upstream fedora-atomic-composer script.

And this is now currently waiting for "test that the new rpm-ostree runs through the update process just fine"?

right. the mash is still running

ok so it failed, but I think the failure is due to the changes patrick and I made for ticket 6545. Full log here

Can we get someone to update the owner/permissions on the /srv/fedora-atomic/25/x86_64/docker-host/refs/heads/fedora-atomic/25/x86_64/updates/ directory to match the owner/permissions on the /srv/fedora-atomic/25/x86_64/docker-host/refs/heads/fedora-atomic/25/x86_64/testing directory?

Fixed those permissions.
And now we're back to:

Feb 08 07:48:49 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: 2017-02-08 07:48:49,945 -  INFO - composer.py:168 - Wrote repo configuration to /var/tmp/tmp1J_F_R/fedora-atomic.git/updates-testing.repo
Feb 08 07:48:49 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: 2017-02-08 07:48:49,945 -  INFO - composer.py:231 - Running ['/usr/bin/mock', '--new-chroot', '-r', 'fedora-25-updates-testing-x86_64', '--new-chroot', '--configdir=/var/tmp/tmp1J_F_R/mock', '--chroot', '/usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/fedora-atomic/25/x86_64/docker-host /var/tmp/tmp1J_F_R/fedora-atomic.git/treefile.json']
Feb 08 07:48:50 bodhi-backend01.phx2.fedoraproject.org userhelper[58889]: running '/usr/libexec/mock/mock --new-chroot -r fedora-25-updates-testing-x86_64 --new-chroot --configdir=/var/tmp/tmp1J_F_R/mock --chroot /usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/fedora-atomic/25/x86_64/docker-host /var/tmp/tmp1J_F_R/fedora-atomic.git/treefile.json' with root privileges on behalf of 'apache'
Feb 08 07:48:51 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: 2017-02-08 07:48:51,076 -  INFO - composer.py:236 - Can't create loopback device
Feb 08 07:48:51 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: error: bwrap test failed, see <https://github.com/projectatomic/rpm-ostree/pull/429>: Executing bwrap(true): Child process exited with code 1

That looks like the nspawn arguments change got lost? Are the mock logs available?

config_opts['nspawn_args'] = ['--as-pid2']

needs to be:

config_opts['nspawn_args'] = ['--capability=CAP_NET_ADMIN', '--as-pid2']

ok. I have changed the options to that and @pbrobinson is going to resume. Fingers crossed.

---capability=CAP_NET_ADMIN --as-pid2 is the triple dash on ---capability the cause?

Yes, it should be just two. (I typoed it originally but edited my comment)
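A typo like that is easy to catch before a push. A minimal sketch, assuming the list only ever holds long options of the form --name or --name=value; the function name malformed_flags is hypothetical.

```python
import re

# Exactly two leading dashes, then a lowercase option name, then an
# optional "=value" part.
FLAG_RE = re.compile(r'^--[a-z0-9][a-z0-9-]*(=.*)?$')


def malformed_flags(nspawn_args):
    """Return the entries that do not look like well-formed long options."""
    return [arg for arg in nspawn_args if not FLAG_RE.match(arg)]


print(malformed_flags(['---capability=CAP_NET_ADMIN', '--as-pid2']))
print(malformed_flags(['--capability=CAP_NET_ADMIN', '--as-pid2']))
```

Short options such as -q would also be reported by this check, so it only suits lists that are expected to contain long options.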

ok so it got farther this time, but still failed.

The high-level log is here and the journal output that shows more detail is below:

Feb 08 18:20:20 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: 2017-02-08 18:20:20,495 -  INFO - composer.py:168 - Wrote repo configuration to /var/tmp/tmpFrpXJE/fedora-atomic.git/updates-testing.repo
Feb 08 18:20:20 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: 2017-02-08 18:20:20,495 -  INFO - composer.py:231 - Running ['/usr/bin/mock', '--new-chroot', '-r', 'fedora-25-updates-testing-x86_64', '--new-chroot', '--configdir=/var/tmp/tmpFrpXJE/mock', '--chroot', '/usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/fedora-atomic/25/x86_64/docker-host /var/tmp/tmpFrpXJE/fedora-atomic.git/treefile.json']
Feb 08 18:20:20 bodhi-backend01.phx2.fedoraproject.org userhelper[73766]: running '/usr/libexec/mock/mock --new-chroot -r fedora-25-updates-testing-x86_64 --new-chroot --configdir=/var/tmp/tmpFrpXJE/mock --chroot /usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/fedora-atomic/25/x86_64/docker-host /var/tmp/tmpFrpXJE/fedora-atomic.git/treefile.json' with root privileges on behalf of 'apache'
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: 2017-02-08 18:20:22,371 -  INFO - composer.py:236 - Previous commit: d58ceb16bca578360b5f01fc094005c28fae0b584f439d1be77668c665becdb7
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Downloading metadata: 33%
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Downloading metadata: 66%
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Downloading metadata: 100%
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: warning: umount failed: Device or resource busy
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: error: No package matches 'NetworkManager'
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: 2017-02-08 18:20:22,372 -  ERROR - composer.py:238 - INFO: mock.py version 1.3.3 starting (python version = 3.5.2)...
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Start: init plugins
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: INFO: selinux disabled
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Finish: init plugins
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Start: run
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Start: chroot init
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: INFO: calling preinit hooks
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: INFO: enabled root cache
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: INFO: enabled yum cache
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Start: cleaning yum metadata
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Finish: cleaning yum metadata
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: INFO: enabled HW Info plugin
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Mock Version: 1.3.3
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: INFO: Mock Version: 1.3.3
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Finish: chroot init
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: INFO: Running in chroot: ['/usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/fedora-atomic/25/x86_64/docker-host /var/tmp/tmpFrpXJE/fedora-atomic.git/treefile.json']
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Start: chroot ['/usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/fedora-atomic/25/x86_64/docker-host /var/tmp/tmpFrpXJE/fedora-atomic.git/treefile.json']
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Finish: chroot ['/usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/fedora-atomic/25/x86_64/docker-host /var/tmp/tmpFrpXJE/fedora-atomic.git/treefile.json']
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: ERROR: Command failed. See logs for output.
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]:  # /usr/bin/systemd-nspawn -q -M b3c5bcc626a447e6b937bf4176dd13e2 -D /var/lib/mock/fedora-25-updates-testing-x86_64/root --capability=CAP_NET_ADMIN --as-pid2 --setenv=PROMPT_COMMAND=printf "\033]0;<mock-chroot>\007" --setenv=HOSTNAME=mock --setenv=PS1=<mock-chroot> \s-\v\$  --setenv=TERM=vt100 --setenv=PATH=/usr/bin:/bin:/usr/sbin:/sbin --setenv=LANG=en_US.UTF-8 --setenv=SHELL=/bin/bash --setenv=HOME=/builddir /bin/sh -c /usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/fedora-atomic/25/x86_64/docker-host /var/tmp/tmpFrpXJE/fedora-atomic.git/treefile.json
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: 2017-02-08 18:20:22,372 -  ERROR - composer.py:240 - returncode = 1
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: 2017-02-08 18:20:22,372 -  ERROR - composer.py:58 - Compose failed
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Traceback (most recent call last):
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]:   File "/usr/lib/python2.7/site-packages/fedmsg_atomic_composer/composer.py", line 48, in compose
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]:     ref, commitid = self.ostree_compose(release)
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]:   File "/usr/lib/python2.7/site-packages/fedmsg_atomic_composer/composer.py", line 188, in ostree_compose
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]:     out, err, rcode = self.mock_chroot(release, cmd, new_chroot=True)
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]:   File "/usr/lib/python2.7/site-packages/fedmsg_atomic_composer/composer.py", line 158, in mock_chroot
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]:     return self.mock_cmd(release, '--chroot', cmd, **kwargs)
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]:   File "/usr/lib/python2.7/site-packages/fedmsg_atomic_composer/composer.py", line 126, in mock_cmd
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]:     + list(cmd))
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]:   File "/usr/lib/python2.7/site-packages/fedmsg_atomic_composer/composer.py", line 241, in call
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]:     raise Exception
Feb 08 18:20:22 bodhi-backend01.phx2.fedoraproject.org fedmsg-hub[37375]: Exception

I think something went wrong with the rpm-md repos here. Does file:///pub/fedora/linux/updates/25/x86_64/ exist on this node and look correct?

right, specifically what makes us suspect this is this line from the journal:

fedmsg-hub[37375]: error: No package matches 'NetworkManager'

so i'm trying really hard to recreate this locally but not having much luck:

i'm using the new version of mock that passes along nspawn args and i'm using a mock config like so:

config_opts['root'] = 'fedora-25-updates-testing-x86_64'
config_opts['target_arch'] = 'x86_64'
config_opts['dist'] = 'f25'  # only useful for --resultdir variable subst
config_opts['releasever'] = '25'
config_opts['chroot_setup_cmd'] = 'install yum rpm-ostree'
config_opts['extra_chroot_dirs'] = ['/run/lock']
config_opts['plugin_conf']['bind_mount_enable'] = True

config_opts['nspawn_args'] = ['--capability=CAP_NET_ADMIN', '--as-pid2']
config_opts['plugin_conf']['bind_mount_opts']['dirs'].append(('/srv/local/', '/srv/local/'))
config_opts['plugin_conf']['bind_mount_opts']['dirs'].append(('/srv/foobar/', '/srv/foobar/'))
config_opts['plugin_conf']['bind_mount_opts']['dirs'].append(('/srv/cache/', '/srv/cache/'))
config_opts['plugin_conf']['bind_mount_opts']['dirs'].append(('/srv/ostree/', '/srv/ostree/'))

config_opts['yum.conf'] = """
[main]
cachedir=/var/cache/yum
debuglevel=2
reposdir=/dev/null
logfile=/var/log/yum.log
retries=20
obsoletes=1
gpgcheck=0
assumeyes=1
metadata_expire=0

[local]
name=local
baseurl=file:///srv/local
enabled=1
cost=1

[updates-testing]
name=Fedora 25 updates-testing
baseurl=http://download.fedoraproject.org/pub/fedora/linux/updates/testing/$releasever/$basearch/
enabled=1
cost=5000

[fedora-25-updates]
name=Fedora 25 fedora-25-updates
baseurl=http://download.fedoraproject.org/pub/fedora/linux/updates/25/x86_64/
enabled=1
cost=5000

[fedora-25]
name=Fedora 25 fedora-25
baseurl=http://download.fedoraproject.org/pub/fedora/linux/releases/25/Everything/x86_64/os/
enabled=1
cost=5000
"""
config_opts['yum_common_opts'] = []

and I end up having no issues:

[vagrant@rpm-ostree ~]$ sudo mock --new-chroot -r fedora-25-updates-testing-x86_64 --new-chroot --configdir=./ --chroot '/usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/ostree /srv/foobar/fedora-atomic-docker-host.json'    
WARNING: Could not find required logging config file: ./logging.ini. Using default...
INFO: mock.py version 1.3.3 starting (python version = 3.5.2)...
Start: init plugins
INFO: selinux disabled
Finish: init plugins
Start: run
Start: chroot init
INFO: calling preinit hooks
INFO: enabled root cache
INFO: enabled yum cache
Start: cleaning yum metadata
Finish: cleaning yum metadata
INFO: enabled HW Info plugin
Mock Version: 1.3.3
INFO: Mock Version: 1.3.3
Finish: chroot init
INFO: Running in chroot: ['/usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/ostree /srv/foobar/fedora-atomic-docker-host.json']
Start: chroot ['/usr/bin/rpm-ostree compose tree --workdir-tmpfs --repo=/srv/ostree /srv/foobar/fedora-atomic-docker-host.json']
No previous commit for fedora-atomic/25/x86_64/docker-host
Downloading metadata: 1%
Downloading metadata: 25%
Downloading metadata: 26%
Downloading metadata: 50%
Downloading metadata: 51%
Downloading metadata: 75%
Downloading metadata: 76%
Downloading metadata: 100%
Resolving dependencies... done
Transaction: 444 packages
  GeoIP-1.6.9-2.fc24.x86_64 (fedora-25)
  GeoIP-GeoLite-data-2017.01-1.fc25.noarch (fedora-25-updates)
  NetworkManager-1:1.4.4-3.fc25.x86_64 (fedora-25-updates)

it even ends up getting one rpm from a "locally bind mounted" repo:

  rpm-ostree-2017.1-4.fc25.x86_64 (local)

anyway, looks like we need to get access to the machine in order to really figure out what is going on. maybe we can schedule some time to video conference and investigate.

I still suspect something is wrong with the repodata; I hacked up https://github.com/projectatomic/rpm-ostree/pull/613 which may help a bit.

repodata seems fine, supposedly this is the repo from the mash: https://kojipkgs.fedoraproject.org/mash/updates/f25-updates-testing-170208.0154/f25-updates-testing/x86_64/. using that repo I'm able to get a successful compose with the new updated rpms (new kernel, new rpm-ostree) in it.

we unpushed rpm-ostree and the update went through fine. so this is clearly still an issue that shows itself when trying to use the new rpm-ostree. I can't recreate the problem locally so Kevin is going to work on setting up a machine in the same environment that we can use to try to recreate the problem.

OK, I was able to reproduce this. Believe it or not, this was the issue (using rpm-ostree git master here, which prints some additional details about repos):

Downloading metadata: [==========================================================================================] 100%
rpm-md repository versions:                                     
  updates-testing (not enabled in treefile)                                                                            
  fedora-25-updates (not enabled in treefile)                           
  fedora-25 (not enabled in treefile) 
...

A simple sed -i 's/enabled=0/enabled=1/' updates-testing worked around that. I think this is probably a regression from https://github.com/rpm-software-management/libdnf/pull/231, see https://github.com/rpm-software-management/libdnf/issues/250.
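The same workaround can be sketched in Python, for anyone who would rather do it from the composer than with sed. The function name enable_repos and the sample text are hypothetical; unlike the sed one-liner, this version only rewrites lines that consist of the enabled setting alone.

```python
import re

# Match a whole "enabled=0" line, allowing spaces or tabs around "=".
ENABLED_OFF = re.compile(r'^enabled[ \t]*=[ \t]*0[ \t]*$', re.MULTILINE)


def enable_repos(repo_text):
    """Return the .repo text with every 'enabled=0' switched to 'enabled=1'."""
    return ENABLED_OFF.sub('enabled=1', repo_text)


sample = (
    '[updates-testing]\n'
    'name=Fedora 25 updates-testing\n'
    'enabled=0\n'
)
print(enable_repos(sample))
```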

I posted a patch to fedmsg-atomic-composer we could use to work around this:

https://github.com/fedora-infra/fedmsg-atomic-composer/pull/15

I can look into fixing the libdnf regression tomorrow, unless you beat me to it.

ok - kevin put the hotfix from https://github.com/fedora-infra/fedmsg-atomic-composer/pull/15 into place and the push went through fine. We need to do some follow up on this:

1 - we need to merge https://github.com/fedora-infra/fedmsg-atomic-composer/pull/15
- action item @releng to review

2 - we need a released version of mock that passes along nspawn args
- action item @msuchy to put out new version of mock

3 - we need --as-pid2 and --capability=CAP_NET_ADMIN added to template mock config in https://github.com/fedora-infra/fedmsg-atomic-composer/blob/develop/fedmsg_atomic_composer/templates/mock.mako. @walters can you please give me a short description for why each of these changes is needed?
- action item @walters to give description of why changes are needed
- action item @dustymabe to open PR with changes and meaningful commit message

We need --as-pid2 to work around a glibc assertion. We need CAP_NET_ADMIN so rpm-ostree can turn off networking when running RPM scripts/dracut etc.

FYI I plan to release new mock around 2017-02-28 (F26 branching), hopefully --as-pid2 will be resolved by then too.

Thanks! I'll wait to open my PR until we know whether mock will have the --as-pid2 fix in it or not.

This can be closed, correct?

Not yet. @msuchy just put out a new version of mock, and @bowlofeggs merged that one PR. We need one final PR to add --capability=CAP_NET_ADMIN to https://github.com/fedora-infra/fedmsg-atomic-composer/blob/develop/fedmsg_atomic_composer/templates/mock.mako . I'll open that soon.

PR was merged. I'm going to close this issue now.

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)
