#7445 [RFE][PATCH] make $releasever return "rawhide" on Rawhide
Closed: Fixed 3 years ago by kparal. Opened 6 years ago by kparal.

Background

If we want to have a more reliable and stable Rawhide, we need to make it easier to test and automate. That means eliminating the differences between Rawhide and stable releases and reducing the necessary manual maintenance steps as much as possible. You can read more about related issues in https://pagure.io/releng/issue/7398 and https://pagure.io/copr/copr/issue/267.

Problem

Currently, the dnf variable $releasever returns a number (29) on Rawhide, but all the repos are stored in the rawhide/ directory, not 29/ (as with stable releases). There are good reasons for this, but it has consequences. It forces the official Fedora repos to be split between fedora-repos and fedora-repos-rawhide (because you can't rely on a variable and have to hardcode "rawhide" in the repo path) and breaks copr and any other third-party repos. Basically, for every repo you need to have two separate versions, rawhide and non-rawhide, and always correctly detect and install the right one. I'd like to propose improvements in this area and discuss them with you in this ticket.
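
To make the mismatch concrete, here is a minimal sketch of how a $releasever placeholder in a repo URL expands. The host, the path layout and the expand() helper are assumptions made up for illustration, and Python's string.Template stands in for DNF's real substitution code; only the variable name and the values 29 and rawhide come from the description above.

```python
from string import Template

# Hypothetical baseurl template; host and directory layout are illustrative only.
BASEURL = "https://repo.example.org/fedora/$releasever/$basearch/os/"


def expand(url: str, releasever: str, basearch: str = "x86_64") -> str:
    """Substitute repo variables in a URL (simplified stand-in for DNF's logic)."""
    return Template(url).substitute(releasever=releasever, basearch=basearch)


# What a Rawhide system currently asks for vs. where the content actually lives.
current = expand(BASEURL, "29")
wanted = expand(BASEURL, "rawhide")

print(current)
print(wanted)
```

With a numeric $releasever the first URL points at a 29/ directory that does not exist for Rawhide, which is why repo files currently have to hardcode rawhide instead of using the variable.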

Proposed solution 1

Here's a trivial patch for fedora-release:

diff --git a/fedora-release.spec b/fedora-release.spec
index ecca47f..b4b66f2 100644
--- a/fedora-release.spec
+++ b/fedora-release.spec
@@ -1,6 +1,7 @@
 %define release_name Rawhide
 %define dist_version 29
 %define bug_version rawhide
+%define releasever rawhide

 # All changes need to be submitted as pull requests in pagure
 # The package can only be built by a very small number of people
@@ -19,6 +20,7 @@ Obsoletes:      redhat-release
 Provides:       redhat-release
 Provides:       system-release
 Provides:       system-release(%{version})
+Provides:       system-release(releasever) = %{releasever}

 # Kill off the fedora-release-nonproduct package
 Provides:       fedora-release-nonproduct = %{version}

This adds the Provides entry system-release(releasever) = rawhide to the master branch of fedora-release, so it will only be present in the Rawhide version of that package. DNF's detect_releasever() logic then populates $releasever with the string rawhide (taken from the new Provides) instead of 29 (the version of the package). (Note: this is currently broken in DNF due to a bug, but it will be fixed in the next DNF release.)
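
The lookup order this relies on can be modelled roughly as follows. This is a simplified sketch of the behaviour described above, not DNF's actual code: the Package tuple, the constants and the function name model_detect_releasever are hypothetical, and the real detect_releasever() queries the RPM database instead of a Python list.

```python
from typing import NamedTuple, Optional


class Package(NamedTuple):
    """Hypothetical stand-in for an installed RPM package."""
    name: str
    version: str
    provides: dict  # provide name -> provided version (None if unversioned)


RELEASEVER_PROVIDE = "system-release(releasever)"
FALLBACK_PROVIDE = "system-release"


def model_detect_releasever(installed: list) -> Optional[str]:
    """Prefer an explicit releasever provide, else fall back to the package version."""
    for pkg in installed:
        value = pkg.provides.get(RELEASEVER_PROVIDE)
        if value is not None:
            return value
    for pkg in installed:
        if FALLBACK_PROVIDE in pkg.provides:
            return pkg.version
    return None


# fedora-release without the new provide (stable/Branched).
stable = Package("fedora-release", "28", {"system-release": None})
# fedora-release built from master with the patch applied (Rawhide).
rawhide = Package(
    "fedora-release",
    "29",
    {"system-release": None, "system-release(releasever)": "rawhide"},
)

print(model_detect_releasever([stable]))
print(model_detect_releasever([rawhide]))
```

Under this model the package version stays numeric (29), so anything that reads the version directly is unaffected; only the value handed to $releasever changes.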

The outcome is that all repos can now use $releasever in URLs, because it will get replaced by rawhide and therefore reach the correct destination. That means you can use the same repo file as in a stable release for COPR or other third-party repo and it will work fine.

If the user wants to switch to Branched after branching has happened, they'd run e.g.:

sudo dnf distrosync fedora-release\* --releasever=28

Proposed solution 2

This is a similar approach to the first solution, but creates a fedora-release-rawhide subpackage:

diff --git a/fedora-release.spec b/fedora-release.spec
index ecca47f..74637f1 100644
--- a/fedora-release.spec
+++ b/fedora-release.spec
@@ -1,6 +1,7 @@
 %define release_name Rawhide
 %define dist_version 29
 %define bug_version rawhide
+%define releasever rawhide

 # All changes need to be submitted as pull requests in pagure
 # The package can only be built by a very small number of people
@@ -33,6 +34,15 @@ BuildArch:      noarch
 %description
 Fedora release files such as various /etc/ files that define the release.

+%package rawhide
+Summary:        Fedora release files for Rawhide
+Provides:       system-release(releasever) = %{releasever}
+Requires:       fedora-release = %{version}-%{release}
+
+%description rawhide
+This identifies the system as Rawhide for the package manager, causing Rawhide
+repositories to be used.
+
 %package atomichost
 Summary:        Base package for Fedora Atomic-specific default configurations
 Provides:       system-release-atomichost
@@ -315,6 +325,9 @@ glib-compile-schemas %{_datadir}/glib-2.0/schemas &> /dev/null || :
 %{_prefix}/lib/systemd/system-preset/99-default-disable.preset


+%files rawhide
+
+
 %files atomichost
 %{!?_licensedir:%global license %%doc}
 %license LICENSE

The difference here is that you can install/uninstall fedora-release-rawhide any time at will, which marks/unmarks your system to be following the Rawhide stream. The benefit is that you can switch your system from Rawhide to Branched before the branching actually happens, and your system automatically picks up the right repos after branching (which is awesome, especially for our automation needs). The downside is that both rawhide/ and 29/ repo URLs/paths need to be present and working during the whole life cycle of Rawhide, so that you can switch any time. And this doesn't apply just to official Fedora repos, but ideally also to COPR and other third-party repos. COPR devs wanted to avoid duplicated content or maintaining symlinks, but I guess they could be convinced. But other third-party repos might not follow this approach and the whole concept might be confusing for them (however, a good question is how many of those repos actually work on Rawhide already).

So to summarize, this is how you'd switch your system to Rawhide:

# use dnf system-upgrade to upgrade to Rawhide
sudo dnf install fedora-release-rawhide

And switching from Rawhide to Branched:

sudo dnf remove fedora-release-rawhide
sudo dnf distrosync  # if branching already happened

Fresh Rawhide installation would receive fedora-release-rawhide by default, of course.

Overall, this adds more user control at the expense of more infra work. Not sure if this is worth it or not.

Proposed solution 3

For the sake of completeness, I'll mention another way to achieve the goal without using new RPM provides. The $releasever value can be overridden by a file like this:

$ cat /etc/dnf/vars/releasever
rawhide

If this file was owned by fedora-release-rawhide, it would be very similar to solution 2 - you can switch the Branched/Rawhide stream any time. The implementation detail here is whether to mark this file as a config file or not, so that it doesn't e.g. stay around even after you remove fedora-release-rawhide, or that it doesn't e.g. conflict with an already existing file at that location (if the user wanted to override $releasever already for any reason).
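
As a rough illustration of the override, the sketch below reads variables from a directory of one-value files and lets a releasever file win over the detected value. The helper names are hypothetical and a temporary directory stands in for /etc/dnf/vars; this is not DNF's implementation, only a model of the behaviour described above.

```python
import tempfile
from pathlib import Path


def load_vars(vars_dir: Path) -> dict:
    """Read each file in vars_dir as a variable: file name -> first line of content."""
    values = {}
    for path in sorted(vars_dir.iterdir()):
        if path.is_file():
            lines = path.read_text().splitlines()
            if lines:
                values[path.name] = lines[0].strip()
    return values


def effective_releasever(detected: str, vars_dir: Path) -> str:
    """A releasever file in the vars directory overrides the detected value."""
    return load_vars(vars_dir).get("releasever", detected)


with tempfile.TemporaryDirectory() as tmp:
    vars_dir = Path(tmp)  # stand-in for /etc/dnf/vars
    before = effective_releasever("29", vars_dir)
    (vars_dir / "releasever").write_text("rawhide\n")
    after = effective_releasever("29", vars_dir)

print(before, after)
```

The corner cases mentioned above show up directly in this model: if the file is left behind after the owning package is removed, or already exists for another reason, the override silently stays in effect.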

Solution 2 seems a bit cleaner here because you don't need to bother with corner cases involving config file management.

Possible future steps for Fedora Releng

Once $releasever returns rawhide on Rawhide, you can (if you wish) drop fedora-repos-rawhide and use the same fedora-repos package everywhere (ideally also create empty updates/rawhide and updates/testing/rawhide repos as requested in https://pagure.io/releng/issue/7398). Some repo properties will still probably have different values (like metadata_expire), but that can be easily adjusted in the spec file and you can have the same source tarball for all releases, if you wish. This would make the environment even more consistent for users (the repo names would be named the same in all releases).

Known pitfalls

There's one known problem with any of the approaches suggested above, called PackageKit. PackageKit doesn't use DNF to figure out $releasever, nor does it use the same logic. Instead, it parses VERSION_ID from /etc/os-release (source1, source2). So any changes described here will not apply to PackageKit and it will still return a number (e.g. 29) as $releasever. That is something that of course needs to get resolved as well, but before investing time into fixing it, I first wanted to know whether this whole idea gets approved or not.
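
For comparison, deriving the value from os-release looks roughly like the sketch below. It is written in Python for illustration only (PackageKit itself is not written in Python), the parse_os_release() helper is hypothetical, and the sample file content is made up to resemble a Rawhide system.

```python
import shlex


def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines in os-release format (values may be shell-quoted)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)
        fields[key] = parts[0] if parts else ""
    return fields


# Illustrative content only; a real /etc/os-release has more fields.
SAMPLE = '''NAME=Fedora
VERSION="29 (Rawhide)"
ID=fedora
VERSION_ID=29
'''

fields = parse_os_release(SAMPLE)
releasever_from_os_release = fields["VERSION_ID"]
print(releasever_from_os_release)
```

Because VERSION_ID remains numeric, a tool that takes this route keeps producing 29 no matter which Provides fedora-release carries.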

There are several ways to fix this in PackageKit: retrieving $releasever from libdnf (when on Fedora), implementing the same detection logic as in DNF, or perhaps adding VERSION_CODENAME=Rawhide to /etc/os-release and then special-casing this in PackageKit (however, this would break if solution 2 or 3 is used and the user can switch between streams arbitrarily). However, I'd like to first talk about the concept itself, and only after that start hammering out the implementation details with PackageKit developers.

Discussion

Please tell me what you think about the proposed changes. Do they make sense? Have I overlooked something important? Are there better ways to solve the aforementioned issues?

Thank you.


FYI: the DNF team is working on a C library, libdnf, and the long-term plan is to move PackageKit to this library so that the basic logic (and caches) are shared.

Metadata Update from @mohanboddu:
- Issue tagged with: meeting

6 years ago

This will be discussed in our next RelEng meeting on May 10th 2018 at 17:00 UTC

So, what's the advantage here of trying to get 'rawhide' the word to work?
Can't we just make sure the number works and use that for everything via mirrormanager?
The comment in fedora-repos could note that if you are trying to use baseurl with rawhide you should use 'rawhide' in the url. Or perhaps we should bite the bullet when we branch f29 and just move 'rawhide' to '30'.

Or are there cases you want to know you are on rawhide?
Or are there cases you have to use the baseurl ?

So, what's the advantage here of trying to get 'rawhide' the word to work?
Can't we just make sure the number works and use that for everything via mirrormanager?

That would resolve many problems, but create some new ones, I think. The benefit of having $releasever=rawhide is that you can have a machine that is always rawhide. If we use $releasever=NN (and use $releasever in repo URLs), it means that in order to keep following rawhide on your machine, you'll need to make manual changes twice a year (during Branching points), exactly at the right time. It also means that third-party repos (including copr) stop working right after Branching, until the maintainers add the new repo directory.

You can argue that these are not new problems, it just shifts them around, because you already have the same issue with Branched (if you want to follow Branched instead of Rawhide, you need to do a manual change, and also third-party repos don't work for Branched until the maintainers add the relevant repo directory). The question is which use case is more important, I don't know.

Please also note that COPR can already use numbered repos for Rawhide (i.e. having a 29/ dir instead of rawhide/, and then using $releasever in their universal repo file). Instead they opted for the same approach Fedora infra uses (i.e. having a specific rawhide repo file with the rawhide/ dir hardcoded). I don't know whether they simply copied the Fedora infra approach or whether it was a conscious decision that made their implementation easier. My goal is to make all this simpler and consistent, and that's why I suggested changing $releasever to rawhide, instead of keeping it as a number and convincing everyone to support it in their repo dir structure. We can definitely consider the latter approach as well, though.

The comment in fedora-repos could note that if you are trying to use baseurl with rawhide you should use 'rawhide' in the url. Or perhaps we should bite the bullet when we branch f29 and just move 'rawhide' to '30'.

I'm not sure I understand the difference between this and your second sentence ("make sure the number works and use that for everything").

Or are there cases you want to know you are on rawhide?

If all the repos work out of the box when having $releasever in their URLs, I think I don't need to know whether I'm on Rawhide or not.

Or are there cases you have to use the baseurl ?

For all repos not using MM, i.e. copr and other third-party repos. Also, there was some time when mirrorlist support was broken in kickstart, but I guess that is resolved now.

Please also note that COPR can already use numbered repos for Rawhide (i.e. having a 29/ dir instead of rawhide/, and then using $releasever in their universal repo file). Instead they opted for the same approach Fedora infra uses (i.e. having a specific rawhide repo file with the rawhide/ dir hardcoded). I don't know whether they simply copied the Fedora infra approach or whether it was a conscious decision that made their implementation easier. My goal is to make all this simpler and consistent, and that's why I suggested changing $releasever to rawhide, instead of keeping it as a number and convincing everyone to support it in their repo dir structure. We can definitely consider the latter approach as well, though.

I asked COPR devs at https://pagure.io/copr/copr/issue/267#comment-510456 whether they would consider using this approach (a numbered dir even for rawhide), and so far they seem to be on board with it, if it is the same approach Fedora Releng uses. So if we can make sure MirrorManager can handle repo=NN requests even for Rawhide (which was already implemented in https://pagure.io/releng/issue/7398#comment-502442 , but just temporarily for this particular release; we'd need to make sure it works in the future as well), and COPR implements this on their side, maybe we don't need any changes to fedora-release as proposed in this ticket at all.

If we use $releasever=NN (and use $releasever in repo URLs), it means that in order to keep following rawhide on your machine, you'll need to make manual changes twice a year (during Branching points), exactly at the right time.

It occurred to me that this might have a pretty elegant solution. Let's say that by default fedora-repos are installed, which use $releasever in repo URLs. Those machines would automatically get converted from Rawhide to Branched after branching. However, if you disabled those repos and installed fedora-repos-rawhide (which have repo=rawhide hardcoded in metalink URLs and baseurls), then such a machine would always stay on Rawhide. So you would actually have a way to avoid manual changes during branching, you could choose in advance which path to follow. This approach looks like the best of both worlds, honestly.

//Edit: Some concerns were raised by COPR developers since I wrote this ticket, so it's not that clear cut. See the ticket for more details.

Can we sum up what the problem with Proposed solution 1 is? I love that one... it seems to be the clear winner because (a) nothing has to be changed from the rel-eng POV and (b) the only cost is some additional care for the difference in fedora-release.spec between the branched and master branches, but that's trivial.

-1 for other solutions; as that needs additional work on rel-eng side (taking care of branched directory which represents not-yet-really-branched rawhide).

So, I re-read all this in prep for posting to the devel list about it, but I think we need to narrow down things a bit more before we do that.

I was thinking we could just do away with 'rawhide' and switch always to numbers, but I agree the downside there is that you no longer really 'know' you are on rawhide and you would have to make choices around the branching point. Or I suppose we could still have a fedora-release-rawhide subpackage to denote that, but switch everything to using the numbers otherwise.

It sounds to me like Proposed solution 2 might be the best way forward. It's unclear to me from reading the copr ticket if they were ok with that solution or not?

Also, I'll ping @kwizart here in case he would like to chime in from the perspective of another 3rd party repo.

I was thinking we could just do away with 'rawhide' and switch always to numbers, but I agree the downside there is that you no longer really 'know' you are on rawhide and you would have to make choices around the branching point. Or I suppose we could still have a fedora-release-rawhide subpackage to denote that, but switch everything to using the numbers otherwise.

I think some people will definitely want to have an always-Rawhide machine, without doing manual changes every 6 months. So the fedora-release-rawhide subpackage would definitely be useful.

There are two ways to do that, I think. The special Provides can set releasever=rawhide, as I suggested. That means that MirrorManager would still need to support releasever=rawhide in URLs and redirect to the correct number. Baseurls would probably be partially broken, because you can't use symlinks on mirrors (any hope in improving that in future?). The advantage is that just a simple change on Fedora servers (pointing rawhide elsewhere) would immediately serve the right content to all such machines.

The second option is to set releasever=NN, and after the branching point, issue an update that sets releasever=NN+1. The next dnf update would then pull from NN+1 repos, updating fedora-release, and thus updating distribution version. The upside is that Fedora Releng doesn't need to bother with rawhide redirects (even though, some people could still find them useful to exist). The downside is that Releng needs to bother with updating fedora-release* in stable releases, and the system would be upgraded to the new Rawhide one update later than in the first option (the first dnf update to install updated fedora-release-rawhide, the second dnf update to upgrade to the new Rawhide). Probably not a big deal, and I can't think of any further drawbacks now.

It sounds to me like Proposed solution 2 might be the best way forward. It's unclear to me from reading the copr ticket if they were ok with that solution or not?

@clime didn't like maintaining symlinks or keeping the repo content duplicated in several directories. The symlink or duplication is required, though, in cases where a release can be addressed in multiple ways (releasever=rawhide vs releasever=NN). The argument also was that they're doing it the same way Fedora Releng does (except for having a MirrorManager). If Fedora Releng changed their ways, perhaps they would reconsider (can't speak for them, @clime, can you comment)?

It occurs to me, though, that if you opted for the second proposed option of always setting releasever=NN (numerical only), they would never receive releasever=rawhide requests, and they wouldn't need to bother with the rawhide/ symlink. It would be simpler for them, and if the user had Follow Fedora branching enabled in COPR, they would just create a repo definition + new directory for the new Rawhide (and either copy contents from Rawhide-1 or not, I don't know their current practice, but only once, not regularly). Perhaps they could accept this behavior. The only downside is compatibility, their current Rawhide repos have rawhide/ hardcoded in them, so users would need to update their repo files.

So, forgive me this but I see that "only numbers" vs. rawhide is like choosing between the following two variants: numbers_vs_rawhide.png

In the first one, at each branching point, what's being branched off is not what will soon be released but instead it is...what will be released later. It contradicts what branching means within DistGit context.

Also, it seems to me that if we ever want to have a continuous delivery distro, then we should keep rawhide and make it "stronger" instead of weaker/non-existent.

Speaking now as a Copr maintainer, I would like to keep copr-backend layout simple. Symlinking or data duplication seems to me a little bit like working around something that should be (to more benefit) fixed somewhere else.

Thank you @clime for a great picture, I really appreciate the effort :-) The right side doesn't show the fact, though, that Rawhide still has a number that's getting bumped at every branch point. And that number is actually used everywhere except repo requests (and that's just because the rawhide string is hardcoded there). The reason is that most specs and tools are based on and mandate using numbers. It's true for /etc/os-release, it's true for spec files (if %{fedora} > 20). And that's the root of my issues when doing automation work: the system claims to be a number in many areas, but network repos are only accessible under a string label, and I need to maintain the mapping in all our tooling and make sure I update it everywhere at precisely the right time.

In Proposed solution 1, I tried to fix this by making sure the special knowledge for network requests isn't needed (at least when initiated from the system). Instead of sending releasever=NN it would send releasever=rawhide. It should be the most compatible change for COPR devs - you can keep rawhide/ dirs on server, you don't need to maintain rawhide -> NN/ symlinks, it's backward compatible, and you can simplify your repo files (you no longer need to have a special repo file for Rawhide). @clime, am I missing some disadvantages for COPR?

Thank you @clime for a great picture, I really appreciate the effort :-) The right side doesn't show the fact, though, that Rawhide still has a number that's getting bumped at every branch point. And that number is actually used everywhere except repo requests (and that's just because the rawhide string is hardcoded there). The reason is that most specs and tools are based on and mandate using numbers. It's true for /etc/os-release, it's true for spec files (if %{fedora} > 20). And that's the root of my issues when doing automation work: the system claims to be a number in many areas, but network repos are only accessible under a string label, and I need to maintain the mapping in all our tooling and make sure I update it everywhere at precisely the right time.
In Proposed solution 1, I tried to fix this by making sure the special knowledge for network requests isn't needed (at least when initiated from the system). Instead of sending releasever=NN it would send releasever=rawhide. It should be the most compatible change for COPR devs - you can keep rawhide/ dirs on server, you don't need to maintain rawhide -> NN/ symlinks, it's backward compatible, and you can simplify your repo files (you no longer need to have a special repo file for Rawhide). @clime, am I missing some disadvantages for COPR?

No, I think that's a very good description. As you say, solution no. 1 looks best to us from a compatibility point of view.

The downside I see with solution 1 is that all users follow rawhide. In the past we wanted users to follow branched by default because that's where we want more testing. I suppose this is pretty minor however...

Honestly my preference would be that a Rawhide install follows Rawhide by default. The user wanted Rawhide and she's got it. If she wanted Branched, she would've installed Branched (or made a manual switch from Rawhide). It would be nice if you could configure it in advance before the branching point (perhaps even in the installer), but that would require solution 2 which puts a bit more pressure on COPR and other third-party repos. It seems we can't have everything :cake:.

Keep in mind that for any change mentioned here except for pure numbers-only releases, PackageKit needs to get fixed as well. If we agree on a solution, I'll need to follow up and start bugging PK developers.

Honestly my preference would be that a Rawhide install follows Rawhide by default. The user wanted Rawhide and she's got it. If she wanted Branched, she would've installed Branched (or made a manual switch from Rawhide). It would be nice if you could configure it in advance before the branching point (perhaps even in the installer), but that would require solution 2 which puts a bit more pressure on COPR and other third-party repos. It seems we can't have everything 🍰.

Sure. I think it is an improvement over what we have now.

Keep in mind that for any change mentioned here except for pure numbers-only releases, PackageKit needs to get fixed as well. If we agree on a solution, I'll need to follow up and start bugging PK developers.

Also: dnfdragora developers and ostree/rpm-ostree might be affected.

OK, I am sold... we can discuss this at Thursday's releng meeting to make sure everyone else is on board too. I'd like to make sure this is visible, so I think we need a devel list thread/announcement on it? Do you want to do that or shall I (after Thursday)?

OK, I am sold... we can discuss this at Thursday's releng meeting to make sure everyone else is on board too. I'd like to make sure this is visible, so I think we need a devel list thread/announcement on it? Do you want to do that or shall I (after Thursday)?

If the change is approved at the meeting, we can announce the intention on the devel list (I don't care who does that, whatever you prefer) but I think we need to be clear that it's not going to be implemented very soon. First I need to talk to PackageKit developers and find out when they can fix it. People would probably be angry if we broke PackageKit in Rawhide. And as you say, there might be other parties affected. So the announcement would mainly serve the purpose of identifying tools that could be affected and need to be checked.

This got approved in the latest meeting:
https://meetbot.fedoraproject.org/teams/releng/releng.2018-07-26-17.00.log.html
with this action item assigned to me:

17:28:03 <mboddu> #info kparal is going to create the tracking bug and then email to devel list about the proposal (option #1) on https://pagure.io/releng/issue/7445 adn then talk to PK devs. Once all the necessary changes are in place, we will update fedora-release package.

Maybe we could present "how it works" by updating and building the corresponding packages in copr projects (fedora-repos, etc.)?

Is there a TL;DR on how that proposal went?

I replied to everyone, there were not that many concerns, I hope I covered the ones that were voiced. I haven't received any reports about additional software that could get broken by this change. I need to talk to PackageKit developers next, because that's the remaining major roadblock, I believe.

Maybe we could present "how it works" by updating and building the corresponding packages in copr projects (fedora-repos, etc.)?

That's a great idea. I'll create a copr for easy testing.

Ugh, I have to admit I'm quite low on available time :disappointed: Would anybody want to take this and push this forward? ("Proposed solution 1" was the approved implementation). The thing missing is to run this through PackageKit developers (Richard and Kalev), described in "Known pitfalls".

I did a completely untested patch for packagekit to look at system-release(releasever) provides: https://github.com/hughsie/PackageKit/pull/310

Big kudos to @kalev, he made PackageKit aware of DNF's way to determine $releasever, and it should be included in PackageKit > 1.1.12. Once that happens, this ticket can be pushed further. It can also be tested with the following COPR:
https://copr.fedorainfracloud.org/coprs/kparal/rawhide-releasever/packages/
I made changes to fedora-release and fedora-repos to implement the "Proposed solution 1".

Metadata Update from @syeghiay:
- Issue assigned to mohanboddu

5 years ago

@mohanboddu will bring this up in the next Releng meeting.

A new PackageKit with the included fixes hasn't been released yet, unfortunately.

A new PackageKit with the included fixes hasn't been released yet, unfortunately.

It's basically dead; if there's a patch for it, we could likely apply it to the Fedora package.

https://blogs.gnome.org/hughsie/2019/02/14/packagekit-is-dead-long-live-well-something-else/

I went ahead and backported the patch. Should be fixed in PackageKit-1.1.12-8.fc31

I have updated my copr [1] to only include patched fedora-release and fedora-repos and aligned them with F31. Unfortunately, dnf logic in detecting releasever got broken once again, probably due to python-rpm changing return values from bytes to string (yay for API stability), and it currently doesn't work properly :unamused: It is already fixed, just not released yet - should be part of dnf >= 4.2.7, once it is built in Fedora.

[1] https://copr.fedorainfracloud.org/coprs/kparal/rawhide-releasever/

dnf-4.2.7-2.fc31 is now part of Rawhide, and this functionality can be tested.

Original behavior:

$ python3 -c 'import dnf; print(dnf.rpm.detect_releasever("/"))'
31
$ dnf repolist --repo fedora -v | grep metalink
Repo-metalink: https://mirrors.fedoraproject.org/metalink?repo=fedora-31&arch=x86_64

New behavior after upgrading packages from the COPR repo linked above:

$ python3 -c 'import dnf; print(dnf.rpm.detect_releasever("/"))'
rawhide
$ dnf repolist --repo fedora -v | grep metalink
Repo-metalink: https://mirrors.fedoraproject.org/metalink?repo=fedora-rawhide&arch=x86_64

Is there anything else I can do to move this forward?

Well, are we ready to just do this for f32 rawhide?

I've updated the COPR repo to work again with Rawhide now that it's F32, so it can be easily tested.
Here are the current diffs for fedora-release and fedora-repos:
https://gist.github.com/kparal/b9a35d2b66e5401914f4cd67973e0864

I can submit PRs, just tell me.

Is there a $ missing here?

https://gist.github.com/kparal/b9a35d2b66e5401914f4cd67973e0864#file-fedora-release-diff-L24

What is this change on lines 19--20?
https://gist.github.com/kparal/b9a35d2b66e5401914f4cd67973e0864#file-fedora-repos-diff-L19

I'd say leave the 32 key there as a link to rawhide for now?

Finally, should we put in a f32 change for this? It might be good to get more visibility before making the change...

Have you done any testing with it and do things generally look ok?

Is there a $ missing here?
https://gist.github.com/kparal/b9a35d2b66e5401914f4cd67973e0864#file-fedora-release-diff-L24

Ah, that's not going to be in the final PR :-) That's just a way to ensure that the experimental package from my COPR always beats the package from Rawhide.

What is this change on lines 19--20?
https://gist.github.com/kparal/b9a35d2b66e5401914f4cd67973e0864#file-fedora-repos-diff-L19

That's a trailing space fix, inconsequential. That got added automatically when I edited the file, doesn't need to be part of the final PR.

I'd say leave the 32 key there as a link to rawhide for now?

Do you mean this?

rename from RPM-GPG-KEY-fedora-32-primary
rename to RPM-GPG-KEY-fedora-rawhide-primary

In my experience, the GPG key gets looked up according to $releasever, so it has to be named fedora-rawhide-primary now. Of course we can create a symlink fedora-32-primary -> fedora-rawhide-primary. But because this change is intended just for Rawhide (and not older releases), I assumed this wasn't necessary. But thinking more about it (e.g. regarding upgrades) it might be safer to have both. I'll add the symlink (and also keep both entries in archmap).
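
The effect of keeping both key names can be sketched like this. The lookup pattern in key_path() is a simplification built from the file names quoted above, the directory is a temporary stand-in rather than the real key location, and the key content is a dummy; treat all of it as an assumption for illustration.

```python
import os
import tempfile
from pathlib import Path


def key_path(key_dir: Path, releasever: str) -> Path:
    """Build the key file name for a given $releasever (simplified pattern)."""
    return key_dir / f"RPM-GPG-KEY-fedora-{releasever}-primary"


with tempfile.TemporaryDirectory() as tmp:
    key_dir = Path(tmp)  # stand-in for the directory holding the GPG keys
    # The real key file is named after "rawhide" ...
    key_path(key_dir, "rawhide").write_text("dummy key material\n")
    # ... and the numbered name is a relative symlink pointing at it.
    os.symlink("RPM-GPG-KEY-fedora-rawhide-primary", key_path(key_dir, "32"))

    # Which $releasever values can find a key?
    found = {rv: key_path(key_dir, rv).exists() for rv in ("rawhide", "32", "31")}
    same_file = key_path(key_dir, "32").resolve() == key_path(key_dir, "rawhide").resolve()

print(found, same_file)
```

With the symlink in place, a system that still resolves $releasever to 32 (for example in the middle of an upgrade) finds the same key as one that resolves it to rawhide.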

Finally, should we put in a f32 change for this? It might be good to get more visibility before making the change...

There was a devel discussion about it, but not many people participated. I guess it might be a good idea to have a Change proposal for this. Does anyone from Releng/Infra want to create it, or should I? In the latter case, who'd want to co-own the Change with me?

Have you done any testing with it and do things generally look ok?

I don't run a Rawhide system long-term anywhere, so I performed testing only in a VM. It was nothing extensive, just making sure the DNF and repos work as expected. Now that F31 is properly branched, I can perform some more testing (switching from Rawhide to Branched and back).

What would be the best time to make this live in Rawhide? Very soon, so that people can experiment with Branched interactions as well, or once F31 is stable, so that we don't introduce Rawhide-related issues during F31 stabilization? Even after reading the Change policy, I'm unclear whether this needs to wait for FESCo approval before putting it into Rawhide, whether I can already propose changes for F32, and whether it gets approved soon (or just after F31 is out).

I've updated fedora-repos to include both fedora-rawhide and fedora-32 GPG key files. I've updated the diff here:
https://gist.github.com/kparal/b9a35d2b66e5401914f4cd67973e0864

I'll try to perform some additional testing today.

I tested a Rawhide system, and standard dnf operations on Fedora and third-party repositories (COPR and fully third-party) work fine. GNOME Software is broken as always, but I managed to install some packages, so the plumbing is fine (the problems are not related to this change).

I identified a problem in rpmfusion subrepositories that get shipped officially in our fedora-workstation-repositories package (steam, nvidia). Their mirrormanager still doesn't support 31/32/rawhide as $releasever, so those repos are not accessible. The full rpmfusion repos support these versions, so this is somewhat related, but not a fault of the proposed change. I've filed a bug [1].

I realized that the command to switch from Rawhide to Branched is a bit more complicated than originally claimed in Proposed solution 1. It is:

sudo dnf distrosync --refresh fedora-release\* --releasever=31 --repo=fedora --repo=updates
sudo dnf distrosync --refresh

The reason is that fedora-rawhide.repo uses rawhide hardcoded instead of $releasever. I can change that as part of my patch, but that would cover just the base fedora repo and you'd still have to enable updates manually. I think repo definitions can get a proper overhaul once the proposed change is in place (as mentioned in Possible future steps for Fedora Releng in comment 0).
Those commands above work fine in general, but on my particular system some package downgrade seriously messed up the whole system (shouldn't be relevant to this proposal).
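To illustrate why a hardcoded rawhide in the repo definition makes --releasever ineffective for that repo, here is a minimal sketch of the variable substitution. The helper name expand_repo_vars and the example.invalid URLs are made up for illustration; they are not the actual contents of the Fedora repo files, and this is not how DNF implements substitution internally.

```shell
#!/bin/sh
# Hypothetical helper: replace plain $releasever and $basearch placeholders
# in a URL template. Values containing "|" or "&" are not handled here.
expand_repo_vars() {
    template=$1
    releasever=$2
    basearch=$3
    printf '%s\n' "$template" |
        sed -e 's|\$releasever|'"$releasever"'|g' \
            -e 's|\$basearch|'"$basearch"'|g'
}

# A template that uses the variable follows --releasever=31 ...
expand_repo_vars 'https://example.invalid/?repo=fedora-$releasever&arch=$basearch' 31 x86_64
# ... while a template with "rawhide" hardcoded ignores it.
expand_repo_vars 'https://example.invalid/?repo=rawhide&arch=$basearch' 31 x86_64
```

The second call still yields a rawhide URL regardless of the release passed in, which is why the repos have to be selected explicitly with --repo in the commands above.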

When going from Branched to Rawhide, I had to use this command:

sudo dnf distrosync --refresh fedora-release\* --releasever=rawhide --repo=fedora
sudo dnf distrosync --refresh

(and optionally --repo=copr-kparal-rawhide-releasever to have my patches applied immediately). Ideally that --repo=fedora shouldn't be needed, but I discovered an issue in Fedora's mirrormanager: it knows the rawhide == 32 alias for standard repos, but doesn't recognize it for modular repos. That's something we need to fix, and I've filed a separate infra ticket about it [2]. Apart from that, it again seemed to work fine in general.

[1] https://pagure.io/fedora-workstation/issue/102
[2] https://pagure.io/fedora-infrastructure/issue/8134

I've updated fedora-repos to include both fedora-rawhide and fedora-32 GPG key files.

This might need some additional thought from the Releng team. I'm not sure I fully understand how GPG keys are maintained and used. The major difference with this patch is that the Rawhide key is no longer dynamically named (RPM-GPG-KEY-fedora-$releasever-primary), but it is static (RPM-GPG-KEY-fedora-rawhide-primary). Of course symlinks can be used, the file can be regenerated periodically, etc. Or the Rawhide key can be the same key forever. So somebody knowledgeable needs to say whether I should adjust the PR somehow or not. Also, some keys are distributed in distribution-gpg-keys, and I don't know whether to adjust it in some way as well.

I tested an upgrade from Branched to Rawhide using dnf system-upgrade --releasever=rawhide, it worked fine, with the exception of modularity repos not working (as reported above) and missing RPM-GPG-KEY-fedora-rawhide-* keys (those need to be present in all releases, not just Rawhide).

I don't think we want to keep the same rawhide key forever, I think it's still good to move to a new one at the branching point.

Question: Does rpm/dnf handle the case where the key file is named the same, but changes content? I.e., say we land this soon, and when we branch f32 off we switch rawhide to a new key with the same filename. Does dnf reimport it? Or does it say the key is already imported and fail?

I tested it and it works fine. If a package is signed with a key that's not imported, dnf checks the file specified in gpgkey=. If the key in that file is already imported (judged by the file content, not the filename), it fails the transaction. If the key is not already imported, it asks you for confirmation to import it. This works even when the gpg file content changed a second ago (the file is read every time some key is missing).
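The behaviour described above can be summarized as a small decision function. This is only a model of what was observed in testing, not DNF's actual code; the function name key_action, its arguments, and the result strings are all hypothetical.

```shell
#!/bin/sh
# Hypothetical model of the observed behaviour.
# $1 = ID of the key that signed the package
# $2 = space-separated IDs of keys already imported
# $3 = space-separated IDs of keys found in the gpgkey= file(s)
key_action() {
    signer=$1
    case " $2 " in
        *" $signer "*) echo "verify"; return ;;   # signing key is imported
    esac
    # Signing key is missing, so the gpgkey= files are consulted.
    for id in $3; do
        case " $2 " in
            *" $id "*) ;;                          # this key is already imported
            *) echo "ask-to-import"; return ;;    # new key: ask for confirmation
        esac
    done
    echo "fail"   # the files offer nothing that is not already imported
}

key_action NEWKEY "OLDKEY" "OLDKEY"          # key file unchanged
key_action NEWKEY "OLDKEY" "OLDKEY NEWKEY"   # key file now carries a new key
```

The model stops at the import prompt; whatever happens after the user confirms (re-checking the signature against the newly imported key) is left out.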

But I'm not sure how you handle the transition period itself. Will you sign fedora-gpg-keys with the old key (so that it can be installed on existing systems) and the rest of the packages with the new key? That would break fresh installations. What other choice is there?

Yeah, I am not sure either. ;( I suppose yeah, we would need to sign everything with the new key except fedora-repos, then wait a while (a week?) and sign it with the new key. But then people would need to update just that one package first, then the rest. I sure wish you could sign rpms with multiple keys, but that's not the case. ;(

So how is Rawhide signing transition handled currently? Is it significantly different to the problems listed above? If we need to find a good solution to this before moving this ticket forward, it might be a good idea to look at how other rolling rpm-based distributions do this.

The last few times we have branched off the new release, then switched rawhide compose to use the new rawhide key and made sure everything was signed with that key.

So we could:

  • Just sign one package with the new key at branch point: fedora-release. This would allow them to update fedora-repos and import the new key.
  • Wait some time (a day? a week?)
  • switch everything over to the new key.

Or I suppose we could just set a dedicated rawhide key and never change it. I really don't like that because if we do have to change it, we have no good way to. ;(

The last few times we have branched off the new release, then switched rawhide compose to use the new rawhide key and made sure everything was signed with that key.

That means that Rawhide users were not able to update the system without using --nogpgcheck, because the new packages were suddenly signed with a new key that they didn't have on their system. Is that correct?

So we could:

  • Just sign one package with the new key at branch point: fedora-release. This would allow them to update fedora-repos and import the new key.
  • Wait some time (a day? a week?)
  • switch everything over to the new key.

Alternatively, there could be instructions on the wiki on how to download and import the new key, which would work even outside of that ~1 week timeslot. But none of these are easily automatable, and that's ugly.

I might have found a solution :wink: DNF specifies that the gpgkey= variable in a .repo file is a list (values are separated by commas or spaces). So you can introduce the key for Rawhide+1 well in advance and give users plenty of time to receive new fedora-gpg-keys and fedora-repos. Ideally, when you branch e.g. F31 and Rawhide becomes F32, at the very same time you can update it to include the key for F33. That means that if users update their Rawhide systems at least once per 6 months, they will never be cut off from updates because of a new GPG key. Also, they can update the whole system in one transaction, without thinking about "I need to update fedora-release/repos/keys first".

The repo files would change from this:

gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch

to something like this:

gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-33-$basearch

This can be applied just for Rawhide, or even for stable systems (because the two keys can translate to the same file, there's no problem in it). The line loses some elegance, but I think the advantages seem quite convincing.
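As a rough illustration of how such a multi-valued gpgkey= line resolves, the sketch below splits the value on commas and whitespace and substitutes the two variables. The helper name list_gpgkeys is hypothetical, and the splitting is a simplification of whatever DNF really does when parsing list options.

```shell
#!/bin/sh
# Hypothetical helper: print one key location per line for a gpgkey= value.
# $1 = raw gpgkey= value, $2 = releasever, $3 = basearch
list_gpgkeys() {
    value=$1
    releasever=$2
    basearch=$3
    printf '%s\n' "$value" |
        tr ',\t ' '\n\n\n' |
        sed -e '/^$/d' \
            -e 's|\$releasever|'"$releasever"'|g' \
            -e 's|\$basearch|'"$basearch"'|g'
}

line='file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-33-$basearch'

# On Rawhide the two entries resolve to two different files.
list_gpgkeys "$line" rawhide x86_64
# On a release where $releasever is 33 both entries resolve to the same file.
list_gpgkeys "$line" 33 x86_64
```

The second call shows the case mentioned above where both entries translate to the same file, which is harmless.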

During Branch event (e.g. F31), the existing F32 key would get copied/symlinked to the Rawhide key:

ln -sf RPM-GPG-KEY-fedora-32-primary RPM-GPG-KEY-fedora-rawhide-primary

and a new RPM-GPG-KEY-fedora-33-primary key would get generated. The repo files would get updated as shown above.
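The branch-time steps could be scripted roughly as follows. This is a sketch under assumptions: the function name rotate_rawhide_key is hypothetical, it is meant to be run against a scratch copy of the key directory, and the file it creates for Rawhide+1 is only an empty placeholder, since generating the real key is a Releng task outside the scope of this example.

```shell
#!/bin/sh
# Hypothetical sketch of the key shuffle at branch time.
# $1 = directory holding the key files, $2 = version Rawhide has just become
rotate_rawhide_key() {
    keydir=$1
    rawhide_ver=$2
    next_ver=$((rawhide_ver + 1))
    # Point the static Rawhide name at the key of the current Rawhide version.
    ln -sf "RPM-GPG-KEY-fedora-${rawhide_ver}-primary" \
        "$keydir/RPM-GPG-KEY-fedora-rawhide-primary"
    # Placeholder only: the real Rawhide+1 key must be generated separately.
    [ -e "$keydir/RPM-GPG-KEY-fedora-${next_ver}-primary" ] ||
        : > "$keydir/RPM-GPG-KEY-fedora-${next_ver}-primary"
}

# Example run in a temporary directory standing in for the key directory.
demo=$(mktemp -d)
: > "$demo/RPM-GPG-KEY-fedora-32-primary"
rotate_rawhide_key "$demo" 32
ls -l "$demo"
rm -rf "$demo"
```

Using a relative symlink target keeps the link valid wherever the directory ends up being installed.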

I'm pretty happy about this solution and haven't found any logical flaw. WDYT?

Metadata Update from @kevin:
- Issue tagged with: backlog

5 years ago

Sorry for the delay here.

I don't see any obvious problems. :)

Great, I'll update the pull request diff with the new approach of specifying gpg keys and let you know.

Here's the updated diff:
https://gist.github.com/kparal/b9a35d2b66e5401914f4cd67973e0864
And here's the updated COPR repo to play with:
https://copr.fedorainfracloud.org/coprs/kparal/rawhide-releasever/

I have created a fake RPM-GPG-KEY-fedora-33-primary key file (needs to be filled in with actual content) and instructed rawhide repos to check it. If we do it like this well in advance (ideally during the previous Branch point), there should be no problems in having a fully functional signed Rawhide upgrade at any point of time.

Does it look reasonable? What should be our next steps?

Metadata Update from @cverna:
- Assignee reset

4 years ago

Ping, could we move this further? We would like to use this solution in Anaconda to import keys after the installation:

https://bugzilla.redhat.com/show_bug.cgi?id=1882712

OK, so what's left to do here? Just the fedora-release / fedora-repos changes?

@kparal would you be willing to submit PRs for those? We can just get this over the line and done...

Sorry for the long delay/dropping off radar here. We should get this done. :)

I have to say I have trouble remembering all the intricate details after a year or two :smiley: I'm working on rebasing the patches and testing the changes once again. In the meantime, it would be helpful to get this infra issue resolved, thanks:
https://pagure.io/fedora-infrastructure/issue/9621

Yeah, I don't remember all the details either... but it would sure be nice to get this done. :)

I updated (and somewhat simplified) the diff:
https://gist.github.com/kparal/b9a35d2b66e5401914f4cd67973e0864
RPM-GPG-KEY-fedora-35-primary is just a copied F34 key, it'll need a proper key created.

Here's the updated COPR repo to play with:
https://copr.fedorainfracloud.org/coprs/kparal/rawhide-releasever/

I got stuck on some PackageKit issues during testing, turned out to be an unfortunate caching implementation. I still need to test a few edge-cases (mostly involving PK, dnf works just fine), but I should be hopefully able to submit the PRs on Monday.

We might be a bit tight on time... Tuesday is branching. Do we want to try and land this the day before? Or just wait until after?

At this point I think it's better to wait after branching.

(Pagure removed all tags for some reason when I submitted this comment, I added them back).

Metadata Update from @kparal:
- Issue untagged with: backlog, meeting

3 years ago

Metadata Update from @kparal:
- Issue tagged with: backlog, meeting

3 years ago

Cool. I'm off today, but hopefully Monday we can merge those and also push out updates with the new f36 key. Thanks @kparal

PRs are merged. Once we ship the RPMs in rawhide this issue can be closed.

The RPMs are built and shipped in Rawhide, and it doesn't seem to be on fire :grinning: I also wanted to update the documentation on how to switch between Rawhide and Branched, now that things have changed a little. But I found out that the new GPG keys and symlinks need to be updated in stable releases as well, if the instructions are supposed to be simple. So I created PRs [1] [2] [3]. Then I realized I should update the Releng SOP to make sure the instructions talk about updating GPG keys in stable releases as well [4]. But seeing how ugly the SOP diff was, I went overboard and added automation to the fedora-repos spec, so that repo files don't need to be managed by hand anymore [5]. So, ugh, sorry for spamming you with so many changes.

Once that all settles and propagates to stable (at least the keys, if not my patches), I'll work on updating the user documentation (Rawhide<->Branched switching) related to this change.

[1] https://src.fedoraproject.org/rpms/fedora-repos/pull-request/101
[2] https://src.fedoraproject.org/rpms/fedora-repos/pull-request/102
[3] https://src.fedoraproject.org/rpms/fedora-repos/pull-request/103
[4] https://pagure.io/releng/pull-request/10037
[5] https://src.fedoraproject.org/rpms/fedora-repos/pull-request/104

Metadata Update from @kparal:
- Issue untagged with: backlog, meeting

3 years ago

(Wth is wrong with Pagure and tags).

Metadata Update from @kparal:
- Issue tagged with: backlog, meeting

3 years ago

Progress report:

I believe that concludes this ticket. Thanks everyone for helping out. Rawhide is now closer to all other releases (just rolling), less special-handling. :beer: :wine_glass: :cocktail: :tropical_drink: :tea:

In the future, it would be nice to look at unifying the dnf repo definitions across releases, so that fedora-rawhide*.repo exists no more. That would be a different ticket, though :-)

Metadata Update from @kparal:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago

Awesome. Thanks for driving this forward @kparal ! Many thanks!
