#2222 cli: fix sigmd5 checking
Closed 3 years ago by tkopecek. Opened 3 years ago by tkopecek.
tkopecek/koji issue2221  into  master

file modified
+7 -2
@@ -580,8 +580,13 @@ 

      # payload hash

      sigmd5 = koji.get_header_fields(path, ['sigmd5'])['sigmd5']

      if rpm['payloadhash'] != koji.hex_string(sigmd5):

-         os.unlink(path)

-         error("Downloaded rpm %s doesn't match db, deleting" % path)

+         # older rpms, see add_rpm_sig in hub

+         sighdr = koji.rip_rpm_sighdr(path)

+         rawhdr = koji.RawHeader(sighdr)

+         sigmd5 = rawhdr.get(koji.RPM_SIGTAG_MD5)

+         if rpm['payloadhash'] != koji.hex_string(sigmd5):

+             os.unlink(path)

+             error("Downloaded rpm %s doesn't match db SIGMD5, deleting" % path)

  

  

  def download_archive(build, archive, topurl, quiet=False, noprogress=False):

Metadata Update from @tkopecek:
- Pull-request tagged with: testing-ready

3 years ago

This looks reasonable, but I don't offhand know of a replicator for #2221 to verify that the problem is fixed.

I'm puzzled.

The new fallback code says it is for older rpms and references the add_rpm_sig() function on the hub. However, what the cli was already doing is effectively the "more expensive check" that the hub falls back to for older rpms.

Basically the hub tries to verify things withing invoking rpmlib first (it is at that point working with a detached signature header, not a full rpm), but there are cases with old rpms where Koji's simplistic rpm header parsing give the wrong md5. So when the first check fails it falls back to using rpmlib. In order to do that, it has to piece together a temporary rpm file to feed to rpmlib.

In this code, we have a full rpm file. The initial check is using rpmlib, which is basically the "most expensive" fallback from the hub function. This patch adds a fallback to that to try using with koji's RawHeader instead.

It would be very surprising for rpmlib to report the wrong md5sum for an rpm, and it would be even more surprising for the simple check with koji's RawHeader to do a better job in such cases.

After looking further, I'm really not sure about this and I'd like to see some example rpms that manifest #2221 and are fixed by this change.

After getting the problem to replicate, I don't think this is the right solution.

The problem here is that we use _decode_item() in get_header_field. This comes from the original PY3 work for the hub in #921. It's unclear to me if that is really needed (rpmlib in py3 seems to return strings for the string fields already), but in any case it should not be done for sigmd5 which is binary data. rpmlib seems to understand this as sigmd5 is about the only bytes value in the few rpms I've looked at.

@julian8628 do you recall what problems you were seeing back then that prompted you to write

I found koji.get_rpm_header() will return different result between py2 and py3, because of the differences of the string/byte types.
like the case in PR #1068, should we make the result to only contain strings?

Anyway, I've got a simple change here (#2268) that seems to solve the problem

@julian8628 do you recall what problems you were seeing back then that prompted you to write

it is about #1069
with python2, rpmlib returns values in headers as string, but with python3 it is bytes

Thanks, I wonder if rpmlib's py3 behavior has changed since then. I don't see bytes values there now.

Ah yes. If I try on, say, Fedora 27, the header object from rpmlib reports all bytes values. E.g.

>>> hdr[rpm.RPMTAG_SUMMARY]
b'The common Python RPM macros'

This is no longer the case in Fedora 31. I haven't dug through to see exactly when it changed. I guess the change I have in #2268 is the safest then, in case someone is running with py3 and an older rpm. I'll add a comment in the code about this.

Fixed in #2268, closing this one.

Pull-Request has been closed by tkopecek

3 years ago

Metadata Update from @tkopecek:
- Pull-request untagged with: testing-ready

3 years ago