Since createrepo_c 1.x landed in F39, the bodhi tests have started to fail. I'm not sure whether this is related to the test configuration or whether it will be a real problem when we switch the bodhi composer base image to F39.
The issue is that the sanity_check_repodata method cannot handle a compressed comps.xml. As I understand it, with previous createrepo_c versions this file was never compressed, even if we chose to compress repodata.
I tried to fix this in https://github.com/fedora-infra/bodhi/pull/5455 but I'm getting some segfaults which are being investigated with upstream.
I'd like someone with knowledge of the compose creation process to have a look at the code. While fixing this, should I also consider other possible compression formats? Is there anything else to adjust for safe createrepo_c v1.x compatibility?
Metadata Update from @humaton: - Issue assigned to humaton
I don't see anything bad at first glance here. Is this still an issue?
Yes, see the bodhi unit tests failing on f39/Rawhide.
Basically, bodhi composer runs a function to verify that the produced repodata is sane. Within this function, the comps.xml file content is verified by just loading the file with libcomps. But with createrepo_c 1.x the file is compressed as well as the other repodata, so the method is currently unable to parse the file because it is xz compressed.
I don't know if this is specific to some settings we use in the bodhi tests. My concern is that if this happens when we migrate the base image to f39, the composes will fail. I tried to address compressed comps.xml parsing in the PR I linked in the first comment, but so far I have hit a few segfaults from libcomps. In the PR I also addressed only xz compression; I don't know whether we use some other compression method for Fedora/EPEL repositories.
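To illustrate the kind of handling discussed above, here is a minimal sketch (not the code from the linked PR) that detects the compression of a comps file from its leading magic bytes and returns the plain XML bytes. The function name `read_maybe_compressed` is hypothetical, and zstd is left out because it is not assumed to be available in the Python standard library in use.

```python
import bz2
import gzip
import lzma

# Leading magic bytes for the formats handled in this sketch.
# zstd is intentionally not covered here.
_MAGIC = (
    (b"\xfd7zXZ\x00", lzma.decompress),  # xz
    (b"BZh", bz2.decompress),            # bzip2
    (b"\x1f\x8b", gzip.decompress),      # gzip
)


def read_maybe_compressed(raw: bytes) -> bytes:
    """Return decompressed bytes, or the input unchanged if no magic matches.

    Hypothetical helper: the name and behaviour are assumptions made
    for illustration, not part of Bodhi.
    """
    for magic, decompress in _MAGIC:
        if raw.startswith(magic):
            return decompress(raw)
    return raw
```

The returned bytes could then be handed to whatever parser the sanity check uses, so the check itself does not need to know which compression was chosen for the repodata.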
Metadata Update from @phsmoura: - Issue tagged with: medium-gain, medium-trouble, ops
createrepo_c requires bzip2 and xz. Looking at the source, there is a function to choose the compression method.
Metadata created by Bodhi currently defaults to XZ compression, except for all releases with a prefix of FEDORA-EPEL which are switched to BZ2: https://github.com/fedora-infra/bodhi/blob/73268d40b4ddea7196f1165ef008c07c4bccf612/bodhi-server/bodhi/server/metadata.py#L140-L156
So, I suppose I need to add support for bz2 in the sanity_check as well. BTW, it may be worth considering switching EPEL>=8 to XZ (I'm not sure about the impact of changing compression for a release that is already deployed).
Another thing I've noticed reading https://www.fedoraproject.org/wiki/Changes/createrepo_c_1.0.0 is: "When adding groups.xml to repodata createrepo_c currently adds two variants to repomd.xml. The specified file as is, uncompressed, with the type "group" and also a compressed variant with type "group_XX", where XX is compression suffix. This is atypical and unexpected. We propose to include just one variant of groups.xml using specified compression and repomd.xml type "group". This is not compatible with yum in RHEL 7. If required users will still be able to create repositories with the old layout using modifyrepo_c." As I understand that, the current way Bodhi creates repositories for EPEL7 will not be compatible.
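As a rough way to see which layout a given repository uses, the sketch below lists the "group" entries found in a repomd.xml. It is an illustration only: the sample document and the function name `group_entries` are made up for this example, and only the standard repomd namespace is assumed.

```python
import xml.etree.ElementTree as ET

# Namespace used by repomd.xml documents.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}

# Made-up sample resembling the new layout: a single, compressed "group" entry.
SAMPLE_REPOMD = """\
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary"><location href="repodata/primary.xml.gz"/></data>
  <data type="group"><location href="repodata/comps.xml.xz"/></data>
</repomd>
"""


def group_entries(repomd_xml):
    """Map each 'group' or 'group_XX' data type to its location href."""
    root = ET.fromstring(repomd_xml)
    found = {}
    for data in root.findall("repo:data", REPO_NS):
        data_type = data.get("type", "")
        if data_type == "group" or data_type.startswith("group_"):
            location = data.find("repo:location", REPO_NS)
            found[data_type] = location.get("href") if location is not None else None
    return found


print(group_entries(SAMPLE_REPOMD))
```

With the old layout described in the quoted change proposal, the result would contain both an uncompressed "group" entry and a "group_XX" entry; with the new layout there is only a "group" entry pointing at the compressed file.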
If you're going to change the compression type for anything >= EL8, why not switch everything to zstd? It's supported since EL 8.2, and is the new default for createrepo_c 1.0 in any case.
Otherwise if EL7 is the limiting factor, just stick with an older version of createrepo_c for the next few months.
With the update to libcomps 0.1.20 and the change in https://github.com/fedora-infra/bodhi/pull/5455, which will land in Bodhi 8.0, bodhi-composer should be able to handle all compression formats supported by createrepo_c 1.0 without errors.
As the compression method choice is hardcoded in bodhi's metadata.py, let me know if we want to change the setting for EL and/or Fedora. Or if we want to make the setting configurable per Release; in that case I'd like to ask for guidance about the default values to use, as I will have to input settings for releases already in the db. Per the current state, defaults would be:
All FEDORA (ELN included) = XZ compression + zchunk enabled
All FEDORA-EPEL = BZ2 compression + zchunk enabled
...or just keep things as they are now with hardcoded settings not changed.
Almost forgot, we still have to verify that RHEL 7 repositories created with bodhi + createrepo_c 1.0 will not be broken... otherwise we will have to stick with bodhi run on Fedora < 39 until RHEL 7 goes EOL.
At least for Fedora, zstd would be a safe choice.
I think it would be nice to have a global/default set and then able to override per release?
Then we could default to whatever, and override epel branches with whatever they need/are using. I'd not want to change any of the epel branches without the epel steering committee signing off.
It's worth noting that there's a long-standing epel7 bug about disabling zchunk there ( https://bugzilla.redhat.com/show_bug.cgi?id=1721359 )
I don't recall if zstd would be better than xz for fedora, but if it was configurable we could adjust...
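The "global default plus per-release override" idea could look roughly like the sketch below. Everything here is hypothetical (the `RepodataSettings` class, the `OVERRIDES` mapping and the `settings_for` function are not Bodhi code); the values simply mirror the hardcoded behaviour described earlier in this ticket.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepodataSettings:
    """Hypothetical container for per-release repodata options."""
    compress_type: str = "xz"
    zchunk: bool = True


# Global default, used when a release has no explicit override.
DEFAULT = RepodataSettings()

# Hypothetical overrides keyed by release id prefix; the values mirror the
# hardcoded behaviour described in this ticket, not an agreed policy.
OVERRIDES = {
    "FEDORA-EPEL": RepodataSettings(compress_type="bz2", zchunk=True),
}


def settings_for(id_prefix: str) -> RepodataSettings:
    """Return the override for a release prefix, or the global default."""
    return OVERRIDES.get(id_prefix, DEFAULT)


print(settings_for("FEDORA"))
print(settings_for("FEDORA-EPEL"))
```

Keeping the default and the overrides in separate places means a policy change for one branch only touches that branch's entry, which fits the requirement that EPEL changes be signed off separately.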
AFAIK the epel7 bug should be fixed, all EPEL composes are now built without zchunk.
I suppose I'll look into adding those settings per release in Bodhi, defaulting to XZ+zchunk=on (I'd avoid setting ZSTD as the default because it's not supported by createrepo_c < 1.0). Then I'll populate the current releases in the database to match the current hardcoded values. And I'll drop a note to update the SOPs when a new release is created.
After that, Releng and the EPEL committee can decide whether to update those settings as they wish, so that the next compose will use the new settings.
But what I need to understand is: will EPEL7 composes be stopped after June 2024 (RHEL7 EOL)? If not, we will need a workaround / separate code to build those composes, as we will have to move to F39 sooner or later and repositories created using createrepo_c 1.0 are not compatible with RHEL7.
I just tested it. zstd is very slightly worse than the currently used formats at compressing updateinfo metadata: 1.9 MB vs 1.8 MB for Fedora 38 updates (xz), and 1.4 MB vs 1.1 MB for EPEL 8 (bz2).
Zstd is much faster to compress and decompress, but given the only metadata Bodhi is responsible for is the updateinfo (which is tiny, relatively speaking), the difference either way would be minuscule.
The only thing that will make a real difference is how primary.xml, other.xml, and filelists.xml are compressed, and right now that's gzip regardless of Fedora or EPEL. I'd like to see that switch over to zstd but it's out of scope for Bodhi.
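A comparison like the one above can be reproduced in a rough way with the Python standard library. The sketch below is an assumption-laden illustration: the function name `compressed_sizes` and the synthetic payload are made up, and zstd is not included because it would need a module outside the standard library assumed here.

```python
import bz2
import gzip
import lzma


def compressed_sizes(payload: bytes) -> dict:
    """Return the size in bytes of the payload under each compressor."""
    return {
        "raw": len(payload),
        "gzip": len(gzip.compress(payload)),
        "bz2": len(bz2.compress(payload)),
        "xz": len(lzma.compress(payload)),
    }


# Synthetic stand-in for an updateinfo.xml; real metadata will behave differently.
payload = b"<update><id>EXAMPLE-0001</id><title>example</title></update>\n" * 2000
for name, size in compressed_sizes(payload).items():
    print(f"{name}: {size} bytes")
```

To compare against real data, the payload would be read from an actual updateinfo.xml instead of the synthetic string; the relative ranking of the formats depends heavily on the input.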
Nice. Can we close that bug then?
I suppose I'll look into adding those settings per release in Bodhi, defaulting to XZ+zchunk=on (I'd avoid setting ZSTD as the default because it's not supported by createrepo_c < 1.0). Then I'll populate the current releases in the database to match the current hardcoded values. And I'll drop a note to update the SOPs when a new release is created. After that, Releng and the EPEL committee can decide whether to update those settings as they wish, so that the next compose will use the new settings.
Sounds good.
Yes. epel7 will end at rhel7 eol. So, I guess we need to keep f38 on bodhi-backend01 until then... or start doing composes in containers or something else.
I've drafted a Bodhi PR which will allow using custom createrepo_c settings. I made it work via an external config file, which can be maintained in our ansible repo, rather than injecting settings into the database, so there will be no change in the current commands used to create a new release. Comments/thoughts?
From the createrepo_c CLI help I see there is this option:
--general-compress-type=COMPRESSION_TYPE  Which compression type to use (even for primary, filelists and other xml).
So maybe this can be enabled.
I'm not terribly familiar with the codebase but I only saw code for generating the UpdateInfo metadata, nothing which suggested that Bodhi touched the rest of the metadata, which I assume must come from somewhere else. So it would have to be that other place which tweaked the --general-compress-type.
This is set in the pungi config file, which is generated from a template by bodhi-composer. For example, there's a conditional in the template that adds the extra argument --xz to the createrepo_c command for EPEL8, so everything gets compressed for that release (if I'm not mistaken).
So, the compression can be configured in the pungi template, but I will look into using the createrepo_c config in the composer script, so that there's a single source of truth for both metadata generation and pungi.
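A single source of truth could, for instance, be a small helper that turns a release's compression setting into createrepo_c arguments, which both the metadata code and the pungi template rendering would then reuse. This is only a sketch: the function `build_createrepo_command` is hypothetical, and the only option it uses is the --general-compress-type flag quoted from the CLI help earlier in this ticket.

```python
def build_createrepo_command(repo_path, general_compress_type=None):
    """Build a createrepo_c argument list from one compression setting.

    Hypothetical helper for illustration; it is not Bodhi or pungi code.
    """
    cmd = ["createrepo_c"]
    if general_compress_type:
        # Applies the chosen compression to primary, filelists and other XML too.
        cmd.append(f"--general-compress-type={general_compress_type}")
    cmd.append(repo_path)
    return cmd


print(build_createrepo_command("/tmp/example-repo", "xz"))
```

Returning a list rather than a shell string keeps the result safe to pass to a process runner without quoting issues.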
I think we can close this ticket.
The issues with the repository sanity check have been resolved by the libcomps update already in the repositories and by a change in Bodhi that will land in v8.0.
Bodhi v8.0 will also be able to define per-release compression options, so maybe a change proposal can be filed to compress all data for a future Fedora release, or EPEL folks may want to switch the compression method for some release.
We just have to remember that, while deploying Bodhi v8.0 should be safe at any time, we must not upgrade the base image (or, at least, the bodhi-composer base) to F39 until we stop composing EPEL7.
ok. I added a note in ansible pointing to this and reminding not to update until after epel7 goes eol.
Thanks!
Metadata Update from @kevin: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)