We keep seeing Kerberos failures when running our nightly pungi runs in f26. It doesn't happen every time, but it has happened a couple of times in f26 so far. The log from the 03/20 ostree tree compose shows the error:
COMMAND: koji --profile=compose_koji runroot --new-chroot --use-shell --task-id --channel-override=compose --package=pungi --package=ostree --package=rpm-ostree --mount=/mnt/koji/compose/branched/Fedora-26-20170320.n.0 --mount=/mnt/koji/compose/ostree/26/ f26-build x86_64 'rm -f /var/lib/rpm/__db*; rm -rf /var/cache/yum/*; set -x; pungi-make-ostree tree --repo=/mnt/koji/compose/ostree/26/ --log-dir=/mnt/koji/compose/branched/Fedora-26-20170320.n.0/logs/x86_64/ostree/ostree-3 --treefile=/mnt/koji/compose/branched/Fedora-26-20170320.n.0/work/ostree-3/config_repo/fedora-ostree-workstation.json --extra-config=/mnt/koji/compose/branched/Fedora-26-20170320.n.0/work/ostree-3/extra_config.json'
--------------------------------------------------------------------------------
Kerberos authentication failed: Internal credentials cache error (-1765328188)
@parasense suggested that it's possible the pungi compose took longer than the krb5 ticket lifespan. This run started at 2017-03-20 07:17:28 and the failure happened around 2017-03-20 12:25:57.
@parasense thinks "we might want keytabs" for this?
cc @puiterwijk, @kevin
The compose_koji profile is using a keytab.
happens often for atomic workstation as well:
https://kojipkgs.fedoraproject.org/compose/branched/Fedora-26-20170329.n.0/logs/x86_64/ostree/ostree-3/runroot.log
@lsedlar is it possible that pungi is not calling with the correct profile?
@dustymabe: that profile (1) does use keytabs, and (2) gets a ticket that is valid for 24 hours, so there definitely would not be a ticket timeout within those 5 hours. Besides, the koji client takes care of renewing the ticket if it has the keytab.
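To make the timing argument concrete, here is a small sketch that compares the elapsed time between the two timestamps quoted in the report with the 24-hour ticket lifetime mentioned above. The variable names are only illustrative.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Timestamps quoted in the original report.
started = datetime.strptime("2017-03-20 07:17:28", FMT)
failed = datetime.strptime("2017-03-20 12:25:57", FMT)

# Ticket lifetime as stated in this comment.
ticket_lifetime = timedelta(hours=24)

elapsed = failed - started
print("elapsed:", elapsed)
print("ticket could have expired:", elapsed >= ticket_lifetime)
```

The elapsed time is a little over five hours, well inside the ticket lifetime, which is why an expired ticket does not explain the failure.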
I'm reasonably sure that this might be the bug where pungi does not always use the correct profile for some operations, or something else where it doesn't pass everything needed to the koji client.
I'm not sure what the cause is, was just trying to report the issue. Please assign this bug to the appropriate party that should investigate.
While it's definitely possible there is some bug in Pungi, in this case the log shows that the command did use --profile=compose_koji, which looks correct to me.
@puiterwijk Could it be a race condition when multiple koji commands are invoked in parallel?
another one: https://kojipkgs.fedoraproject.org/compose/branched/Fedora-26-20170423.n.0/logs/x86_64/Atomic/ostree-2/runroot.log
I think it is indeed a race condition. I managed to replicate it with for x in $(seq 1 100) ; do sudo koji -p compose_koji hello & done >/dev/null. Every now and then some of the commands fail and print the error.
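For anyone who wants to count the failures rather than eyeball the output, the same experiment can be sketched in Python. The helper name `count_parallel_failures` is made up for this example, and it assumes the command exits with a non-zero status when authentication fails.

```python
import subprocess
import sys


def count_parallel_failures(cmd, n=100):
    """Start n copies of cmd at once and return how many exit non-zero."""
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for _ in range(n)
    ]
    # Popen returns immediately, so all copies run concurrently;
    # wait() then collects each exit status.
    return sum(1 for proc in procs if proc.wait() != 0)


if __name__ == "__main__":
    # Harmless stand-in command so the sketch runs anywhere.
    print(count_parallel_failures([sys.executable, "-c", "pass"], n=5))
```

On a composer the call would be something like `count_parallel_failures(["sudo", "koji", "-p", "compose_koji", "hello"])`; a result above zero on some runs would match the intermittent failures described here.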
This should be fixable on the Pungi side by setting the KRB5CCNAME env var to a fresh directory (but only if a keytab is used for authentication).
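A minimal sketch of that idea, not the actual Pungi change: each command gets its own temporary directory as credential cache, so parallel invocations cannot touch each other's credentials. The helper name `run_with_private_ccache` is hypothetical, and the `DIR:` cache type prefix is my assumption about how the directory would be passed to Kerberos.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def run_with_private_ccache(cmd):
    """Run cmd with KRB5CCNAME pointing at a fresh, private directory."""
    ccache_dir = tempfile.mkdtemp(prefix="krb5cc_")
    # Assumption: use the "DIR:" credential cache type for the directory.
    env = dict(os.environ, KRB5CCNAME="DIR:" + ccache_dir)
    try:
        return subprocess.run(
            cmd, env=env, stdout=subprocess.PIPE, universal_newlines=True
        )
    finally:
        # The cache is only needed while the command runs.
        shutil.rmtree(ccache_dir, ignore_errors=True)


if __name__ == "__main__":
    # Stand-in child process that just reports the variable it received.
    child = [sys.executable, "-c", "import os; print(os.environ['KRB5CCNAME'])"]
    print(run_with_private_ccache(child).stdout.strip())
```

In a real caller this would only be applied when a keytab is configured, as noted above; otherwise the command should keep using the existing cache.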
Right. After looking further, this is because when you provide a keytab, you bypass the GSSAPI code paths (that path doesn't support keytabs).
The krbv codepath does a new init_creds_keytab every single time, which gets a new credential every time, regardless of whether one is already in the credential cache. As a result, when two koji instances perform krb_login with keytabs at the same time, one will erase the credentials the other has just obtained while the other is trying to use them to log in.
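The interleaving can be illustrated with a toy model. This is not the krbv or koji code: `Cache`, `wipe`, `store`, `read` and `interleaved_login` are invented names, and the cache is reduced to a single slot that re-initialisation clears before writing a new ticket.

```python
class Cache:
    """Toy credential cache holding at most one ticket."""

    def __init__(self):
        self.ticket = None


def wipe(cache):
    # Re-initialising discards whatever is currently stored.
    cache.ticket = None


def store(cache, owner):
    cache.ticket = "ticket obtained by " + owner


def read(cache):
    if cache.ticket is None:
        raise LookupError("credentials cache error")
    return cache.ticket


def interleaved_login(cache_a, cache_b):
    """A logs in, B starts its own login, then A tries to use its ticket.

    Returns True if A still finds a usable ticket.
    """
    wipe(cache_a)
    store(cache_a, "A")
    wipe(cache_b)  # B begins re-initialising its cache
    try:
        read(cache_a)
        a_ok = True
    except LookupError:
        a_ok = False
    store(cache_b, "B")
    read(cache_b)
    return a_ok


shared = Cache()
print("shared cache, A succeeds:", interleaved_login(shared, shared))
print("separate caches, A succeeds:", interleaved_login(Cache(), Cache()))
```

With one shared cache, A's ticket is gone by the time A reads it; with a cache per process the same interleaving is harmless, which is the reasoning behind giving each command its own cache.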
This should fix it from the Pungi side: https://pagure.io/pungi/pull-request/607 The reason why we only ever see this in ostree tasks is that all other phases that start koji commands in parallel already have a protection (of sorts) against this: there are sleeps, so the commands don't actually run at the same time.
This keeps happening until we get the new version of pungi deployed. Error from last night's run:
https://kojipkgs.fedoraproject.org/compose/branched/Fedora-26-20170513.n.0/logs/x86_64/Atomic/ostree-2/runroot.log
I deployed the new pungi on branched-composer last night.
It still failed, but it's a new error now, so we can probably close this issue.
...
DEBUG util.py:439: ERROR running command: rpm-ostree compose tree --repo=/mnt/koji/compose/atomic/26/ --write-commitid-to=/mnt/koji/compose/branched/Fedora-26-20170514.n.0/logs/x86_64/Atomic/ostree-2/commitid.log /mnt/koji/compose/branched/Fedora-26-20170514.n.0/work/ostree-2/config_repo/fedora-atomic-docker-host.json
DEBUG util.py:439: COMMAND: rpm-ostree compose tree --repo=/mnt/koji/compose/atomic/26/ --write-commitid-to=/mnt/koji/compose/branched/Fedora-26-20170514.n.0/logs/x86_64/Atomic/ostree-2/commitid.log /mnt/koji/compose/branched/Fedora-26-20170514.n.0/work/ostree-2/config_repo/fedora-atomic-docker-host.json
DEBUG util.py:439: --------------------------------------------------------------------------------
DEBUG util.py:439: Previous commit: 347a653bc1b2d81cd807e0d956ac3a43411e773715064a6ef93a2184099a04c6
DEBUG util.py:439: error: cannot update repo 'source_repo_from-20170514181412': Cannot prepare internal mirrorlist: Cannot resolve path for: "None"
DEBUG util.py:439: Traceback (most recent call last):
DEBUG util.py:439:   File "/usr/bin/pungi-make-ostree", line 15, in <module>
DEBUG util.py:439:     ostree.main()
DEBUG util.py:439:   File "/usr/lib/python2.7/site-packages/pungi/ostree/__init__.py", line 89, in main
DEBUG util.py:439:     func()
DEBUG util.py:439:   File "/usr/lib/python2.7/site-packages/pungi/ostree/tree.py", line 101, in run
DEBUG util.py:439:     self._make_tree()
DEBUG util.py:439:   File "/usr/lib/python2.7/site-packages/pungi/ostree/tree.py", line 46, in _make_tree
DEBUG util.py:439:     shortcuts.run(cmd, show_cmd=True, stdout=True, logfile=log_file)
DEBUG util.py:439:   File "/usr/lib/python2.7/site-packages/kobo/shortcuts.py", line 335, in run
DEBUG util.py:439:     raise RuntimeError(err_msg)
DEBUG util.py:439: RuntimeError: ERROR running command: rpm-ostree compose tree --repo=/mnt/koji/compose/atomic/26/ --write-commitid-to=/mnt/koji/compose/branched/Fedora-26-20170514.n.0/logs/x86_64/Atomic/ostree-2/commitid.log /mnt/koji/compose/branched/Fedora-26-20170514.n.0/work/ostree-2/config_repo/fedora-atomic-docker-host.json
...
I believe that is caused by a mismatch between the pungi version on the composer (4.1.15) and the one in the buildroot (4.1.13).
yep. kevin and I figured that out last night and he submitted a buildroot override for the new version of pungi. We'll see if it works this time.
Also, we should probably go through our pungi configuration and update all the repo_from/source_repo_from to just repo now that those are deprecated.
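As a rough sketch of what that cleanup could look like for a single config section: the helper below folds the two deprecated keys into repo. The name `migrate_repo_keys` and the example section are hypothetical, and it assumes that repo accepts a list of values; that should be checked against the documentation for the deployed pungi version before changing any real config.

```python
DEPRECATED_KEYS = ("repo_from", "source_repo_from")


def _as_list(value):
    """Normalise None, a single string, or a list into a new list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def migrate_repo_keys(section):
    """Return a copy of section with the deprecated keys merged into 'repo'."""
    migrated = {k: v for k, v in section.items() if k not in DEPRECATED_KEYS}
    repos = _as_list(section.get("repo"))
    for key in DEPRECATED_KEYS:
        repos.extend(_as_list(section.get(key)))
    if repos:
        migrated["repo"] = repos
    return migrated


# Hypothetical section, only to show the shape of the result.
old = {"treefile": "fedora-atomic-docker-host.json", "source_repo_from": "Everything"}
print(migrate_repo_keys(old))
```

The function returns a new dict and leaves its input untouched, so the old and new settings can be compared before anything is committed.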
Metadata Update from @dustymabe: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)