Learn more about these different git repos.
Other Git URLs
We have been hitting issues unpacking src.rpm's of late in the Fedora koji instance.
This happens on our s390x builders, possibly when they are under load or the remote link to them is slow/busy.
See:
https://pagure.io/fedora-infrastructure/issue/7753 https://pagure.io/fedora-infrastructure/issue/7581 https://pagure.io/fedora-infrastructure/issue/7729
For example, from https://koji.fedoraproject.org/koji/taskinfo?taskID=34567975 https://kojipkgs.fedoraproject.org//work/tasks/8008/34568008/root.log
DEBUG util.py:677: Executing command: ['/bin/rpm', '-Uvh', '--nodeps', '/builddir/build/originals/eclipse-4.11-7.module_f28+4131+bc1a8827.src.rpm'] with env {'TERM': 'vt100', 'SHELL': '/bin/bash', 'HOME': '/builddir', 'HOSTNAME': 'mock', 'PATH': '/usr/bin:/bin:/usr/sbin:/sbin', 'PROMPT_COMMAND': 'printf "\033]0;<mock-chroot>\007"', 'PS1': '<mock-chroot> \s-\v\$ ', 'LANG': 'C.UTF-8'} and shell False DEBUG util.py:418: Unsharing. Flags: 134217728 DEBUG util.py:585: Updating / installing... DEBUG util.py:585: eclipse-1:4.11-7.module_f28+4131+bc1a8######################################## DEBUG util.py:583: BUILDSTDERR: error: unpacking of archive failed on file /builddir/build/SOURCES/eclipse-platform-sources-4.11RC2.tar.xz;5cca4e57: cpio: read failed - Inappropriate ioctl for device DEBUG util.py:583: BUILDSTDERR: error: /builddir/build/originals/eclipse-4.11-7.module_f28+4131+bc1a8827.src.rpm cannot be installed
I am not sure what could be done here, but it seems like the src.rpm it has is somehow corrupt as downloaded? The src.rpm in koji unpacks fine.
We are using a http proxy for packages, but does this src.rpm download happen via xmlrpc? Could perhaps the hub do a checksum of it and have a client check it after download and retry if it didn't match?
Happy to gather more info.
EDITED TO ADD: Looks like it also happens on armv7 builders: https://koji.fedoraproject.org/koji/taskinfo?taskID=34573650 https://pagure.io/fedora-infrastructure/issue/7581
It's interesting to me that all of these reports are for failures to install the source package. I believe koji (or maybe it's mock at this point) could easily verify the package after downloading it and retry at that point before attempting to install it. It's not obvious how to do this, but
rpm2cpio foo.src.rpm | cpio -i --only-verify-crc
will at least fail on truncated archives (with cpio: premature end of file and a return code of 2). Of course, just looking at the file size might work just as well.
cpio: premature end of file
does this src.rpm download happen via xmlrpc?
No, just a regular httpd GET.
Metadata Update from @yulwang: - Issue assigned to dgregor - Issue tagged with: discussion, feature, groomed
@kevin @tibbs how confident are we that a retry will work? If the corruption is due to server or network load, retrying may make it worse. Is this something that has happened in the past as well or just recently?
Is retrying this one download more load that retrying the whole build? Because at the moment all that happens is I am forced to just resubmit my whole build.
Certainly if a retry doesn't work then we would have learned something useful. Currently I'm not sure that we have any reasonable way to know if the problem is in the download or if the SRPM simply wasn't written correctly at some earlier stage of the process, though I think evidence suggests the former since the SRPMs are rarely built in s390x machines while the problem is only seen there. If necessary, the check I posted earlier could be added at other places in the process to verify that the SRPM is OK at each step.
In the examples given above, the unpacking of the SRPM fails on s390x, but succeeds on all other arches. The fact that some arches succeed tells you that the SRPM itself is ok, no?
Another occurrance seen today:
https://koji.fedoraproject.org/koji/taskinfo?taskID=35592274
DEBUG util.py:587: Updating / installing... DEBUG util.py:587: protobuf-3.6.1-6.module_f30+4730+ebb8f######################################## DEBUG util.py:585: BUILDSTDERR: error: unpacking of archive failed on file /builddir/build/SOURCES/googletest-1.8.1.tar.gz;5d0773c9: cpio: read failed - Inappropriate ioctl for device DEBUG util.py:585: BUILDSTDERR: error: /builddir/build/originals/protobuf-3.6.1-6.module_f30+4730+ebb8f206.src.rpm cannot be installed DEBUG util.py:734: Child return code was: 1
This still seems to be happening. Do we know if it's only Z/VM s390x builder (01-14) or kvm ones (15-24)?
@sharkcz any ideas on how we might track this down?
Could this be some odd rpm or cpio bug on s390x?
Metadata Update from @kevin: - Custom field Size adjusted to None
FWIW, I saw two more occurances recently:
https://koji.fedoraproject.org/koji/taskinfo?taskID=37763977 https://koji.fedoraproject.org/koji/taskinfo?taskID=37763055
The relevant part of the logs is from mock_output.log:
Start: build phase for eclipse-4.13-2.module_f31+6519+12cd0b27.src.rpm Start: build setup for eclipse-4.13-2.module_f31+6519+12cd0b27.src.rpm Finish: build setup for eclipse-4.13-2.module_f31+6519+12cd0b27.src.rpm Finish: build phase for eclipse-4.13-2.module_f31+6519+12cd0b27.src.rpm ERROR: Exception(/var/tmp/koji/tasks/3055/37763055/local/work/tasks/2985/37762985/eclipse-4.13-2.module_f31+6519+12cd0b27.src.rpm) Config(module-eclipse-latest-3120190920120950-d36a98d4-build-17552893-1266828) 0 minutes 0 seconds INFO: Results and/or logs in: /var/lib/mock/module-eclipse-latest-3120190920120950-d36a98d4-build-17552893-1266828/result ERROR: Command failed: # /bin/rpm -Uvh --nodeps /builddir/build/originals/eclipse-4.13-2.module_f31+6519+12cd0b27.src.rpm
So from @mbooth's last report and my earlier observation it happens on both kvm and z/vm guests ...
@kevin, I don't recall seeing such problem when doing local builds, but can try again. The only known fact is that it affects large srpms ...
[sharkcz@devel3 firefox]$ truncate -s 340M firefox-69.0.1-3.fc32.src.rpm [sharkcz@devel3 firefox]$ rpm -ivh firefox-69.0.1-3.fc32.src.rpm Updating / installing... 1:firefox-69.0.1-3.fc32 ################################# [100%] error: unpacking of archive failed on file /home/sharkcz/rpmbuild/SOURCES/firefox-langpacks-69.0.1-20190918.tar.xz;5d88c18b: cpio: read failed - No such file or directory error: firefox-69.0.1-3.fc32.src.rpm cannot be installed
symptom is very similar, but the error from cpio is different
cpio
But when running mock rebuild (using nspawn chroot) against that shortened srpm, I get the same error
DEBUG util.py:593: warning: group sharkcz does not exist - using root DEBUG util.py:595: ##### DEBUG util.py:593: error: unpacking of archive failed on file /builddir/build/SOURCES/firefox-langpacks-69.0.1-20190918.tar.xz;5d88c45f: cpio: read failed - Inappropriate ioctl for device DEBUG util.py:593: error: /builddir/build/originals/firefox-69.0.1-3.fc32.src.rpm cannot be installed DEBUG util.py:741: Child return code was: 1 DEBUG util.py:327: kill orphans
So I still suspect some infra issue causing incomplete download of the srpm.
Possibly, but why is it only src.rpm downloads? Are those done via xmlrpc?
IMHO the best (maybe only) thing we can is to add/enable some debugging/verification in the koji builder code, to capture more context before the unpacking failure (before even starting mock).
According to Mike's comment earlier in the thread, the downloads are just a regular httpd GET. Would it be possible to look at the web server logs to see if the byte size sent matches what's expected? Also, this is only happening on s390x, correct?
should be on armv7 too, but s390x is the most frequent one, but it's also the most distant (if there is a correllation)
So, the s390x builders all point to a local varnish cache, which in turn pulls from kojipkgs01/02 (the same servers used by other builders).
There's nothing I can see off in the kojipkgs logs.
So, perhaps the issue is in varnish...
I would start with something like this
diff --git a/builder/kojid b/builder/kojid index d96a8ae5..7cba764b 100755 --- a/builder/kojid +++ b/builder/kojid @@ -1335,6 +1335,8 @@ class BuildArchTask(BaseBuildTask): fn = self.localPath("work/%s" % pkg) if not os.path.exists(fn): raise koji.BuildError("SRPM file missing: %s" % fn) + if not rpm_digest_ok(fn): + raise koji.BuildError("SRPM file corrupted: %s" % fn) # peel E:N-V-R from package h = koji.get_rpm_header(fn) name = koji.get_header_field(h, 'name')
and it could serve as a generally useful check, not specific for debugging this particular issue.
If truncating a file causes this, then I think there's also a defect in the error handling in cpio, because "Inappropriate ioctl for device" does not say "corrupted file" to me.
Inappropriate ioctl for device is when mock with nspawn is in between, No such file or directory is an error in "normal" case
Inappropriate ioctl for device
No such file or directory
I've observed another error in the same vein:
https://koji.fedoraproject.org/koji/taskinfo?taskID=37848374
It has the slightly different error of:
Traceback (most recent call last): File "/usr/lib/python3.7/site-packages/koji/daemon.py", line 1295, in runTask response = (handler.run(),) File "/usr/lib/python3.7/site-packages/koji/tasks.py", line 311, in run return koji.util.call_with_argcheck(self.handler, self.params, self.opts) File "/usr/lib/python3.7/site-packages/koji/util.py", line 263, in call_with_argcheck return func(*args, **kwargs) File "/usr/sbin/kojid", line 1291, in handler h = koji.get_rpm_header(fn) File "/usr/lib/python3.7/site-packages/koji/__init__.py", line 887, in get_rpm_header hdr = ts.hdrFromFdno(fo.fileno()) File "/usr/lib64/python3.7/site-packages/rpm/transaction.py", line 186, in hdrFromFdno raise rpm.error("error reading package header") _rpm.error: error reading package header
Actually I'm not sure if this one is related, but it happened on a s390x host, so...
https://koji.fedoraproject.org/koji/taskinfo?taskID=37858831
Traceback (most recent call last): File "/usr/lib/python3.7/site-packages/koji/daemon.py", line 1295, in runTask response = (handler.run(),) File "/usr/lib/python3.7/site-packages/koji/tasks.py", line 311, in run return koji.util.call_with_argcheck(self.handler, self.params, self.opts) File "/usr/lib/python3.7/site-packages/koji/util.py", line 263, in call_with_argcheck return func(*args, **kwargs) File "/usr/sbin/kojid", line 971, in handler h = self.readSRPMHeader(srpm) File "/usr/sbin/kojid", line 1056, in readSRPMHeader with koji.openRemoteFile(relpath, **opts) as fo: File "/usr/lib/python3.7/site-packages/koji/__init__.py", line 1613, in openRemoteFile src = six.moves.urllib.request.urlopen(url) File "/usr/lib64/python3.7/urllib/request.py", line 222, in urlopen return opener.open(url, data, timeout) File "/usr/lib64/python3.7/urllib/request.py", line 531, in open response = meth(req, response) File "/usr/lib64/python3.7/urllib/request.py", line 641, in http_response 'http', request, response, code, msg, hdrs) File "/usr/lib64/python3.7/urllib/request.py", line 569, in error return self._call_chain(*args) File "/usr/lib64/python3.7/urllib/request.py", line 503, in _call_chain result = func(*args) File "/usr/lib64/python3.7/urllib/request.py", line 649, in http_error_default raise HTTPError(req.full_url, code, msg, hdrs, fp) urllib.error.HTTPError: HTTP Error 503: Backend fetch failed
Task failed with
error: unpacking of archive failed on file /builddir/build/SOURCES/linphone-3.6.1.tar.gz;5d923bc7: cpio: read failed - Inappropriate ioctl for device
https://koji.fedoraproject.org/koji/taskinfo?taskID=37969762
@kevin Can you see something in varnish logs?
Another occurance:
https://koji.fedoraproject.org/koji/taskinfo?taskID=37981909
DEBUG util.py:595: Updating / installing... DEBUG util.py:595: eclipse-cdt-2:9.9.0-1.module_f31+6676+######################################## DEBUG util.py:593: error: unpacking of archive failed on file /builddir/build/SOURCES/org.eclipse.cdt-7e4f3a84db67f87399a41a35ba11ac591ab9117e.tar.xz;5d934a47: cpio: read failed - Inappropriate ioctl for device DEBUG util.py:593: error: /builddir/build/originals/eclipse-cdt-9.9.0-1.module_f31+6676+830996ad.src.rpm cannot be installed
And another:
https://koji.fedoraproject.org/koji/taskinfo?taskID=39324111
DEBUG util.py:598: Updating / installing... DEBUG util.py:598: protobuf-3.6.1-6.module_f32+7102+8b113######################################## DEBUG util.py:596: error: unpacking of archive failed on file /builddir/build/SOURCES/protobuf-3.6.1-all.tar.gz;5ddbe3b6: cpio: read failed - Inappropriate ioctl for device DEBUG util.py:596: error: /builddir/build/originals/protobuf-3.6.1-6.module_f32+7102+8b1132d5.src.rpm cannot be installed DEBUG util.py:744: Child return code was: 1 DEBUG util.py:329: kill orphans DEBUG util.py:763: child environment: None
Is anyone looking at this? I'm happy to keep reporting occurrences when they happen.
On our side, we've added #290 to 1.20 release roadmap. It should at least find some problems, but it is probably only a subset of this issue. For fixing this, I would work more with fedora infra (@kevin ?)t, as it seems not to be related to koji itself, but to some infra setup.
@tkopecek, how about https://pagure.io/koji/issue/1418#comment-599893 ? It should be pretty straightforward.
Metadata Update from @tkopecek: - Issue set to the milestone: 1.20
Metadata Update from @julian8628: - Issue marked as depending on: #290
Metadata Update from @dgregor: - Issue assigned to tkopecek (was: dgregor)
This should be fixed with #290
Metadata Update from @tkopecek: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)
Metadata Update from @jcupova: - Issue tagged with: testing-done
Hey all, I saw this again today, but I don't have permission to reopen the ticket so I filed a new one at #1974
Login to comment on this ticket.