#2582 F35 Change: rpmautospec - removing release and changelog fields from spec files
Closed: Accepted 3 years ago by zbyszek. Opened 3 years ago by bcotton.

The goal of this change is to deploy in production the rpmautospec project.

With it, the content of the Release and %changelog fields in spec files can be auto-generated, either locally or in the builder using the information present in the git repo (in the form of git tags).

Note: This proposal is about changing the way the Release and %changelog sections of the spec files are filled, not about removing them from the SRPM or binary RPM.


After a week, there are no votes. Will wait for votes.

I left another comment on the devel list with an issue that came to my mind. I'll vote after things have been clarified.

This is a major improvement over the status quo so you can have my +1.

Personally, I'd also rather see git history as the only source of truth, but well... It was @pingou and @nphilipp who decided to invest their resources into this and not me and I'd rather have this than nothing.

I really like how this is looking now. +1

I'll leave a -1 vote here so this does not get autoapproved.

Technically, a single +1 was enough to approve it since a week had passed. But this far out from the deadline, it makes sense to not be aggressive with this. Tagging for meeting due to the -1.

Metadata Update from @bcotton:
- Issue tagged with: meeting

3 years ago

I'm not sure if I will make it for the meeting today, as I'll be in two concurrent video meetings during the meeting slot, so I'll try to summarize my arguments here in case I don't make it:

  • I appreciate the work that went into the current implementation, and in general, I like the idea
  • I do, however, dislike the proposed approach, because the implementation sounds brittle (synchronizing git tags / koji build history / local git checkouts) instead of using the (immutable!) commit graph as single source of truth everywhere, which would always lead to consistent results (I had a POC of this, and shared the algorithm I used on the devel list)
  • I dislike the idea of having multiple rebuilds per commit (are those distinguished by git tags? where would changelog information for those subsequent rebuilds come from?) and consider them an anti-feature
  • if we would vote on these kinds of changes based on the idea and not the implementation, we should have accepted the rpkg template / macro based approach as well - so implementation details obviously do (and should) matter, especially if they will determine how robust and future-proof the system will be

TL;DR: I like the idea, but dislike the implementation approach strongly enough that I will leave my vote as -1, at least for now.

FWIW I agree with @decathorpe's arguments, they are just not that important to me to actively vote against this. But I would love to hear from the change owners if this is negotiable or not.

I'm not sure if I will make it for the meeting today, as I'll be in two concurrent video meetings during the meeting slot, so I'll try to summarize my arguments here in case I don't make it:

  • I appreciate the work that went into the current implementation, and in general, I like the idea
  • I do, however, dislike the proposed approach, because the implementation sounds brittle (synchronizing git tags / koji build history / local git checkouts) instead of using the (immutable!) commit graph as single source of truth everywhere, which would always lead to consistent results (I had a POC of this, and shared the algorithm I used on the devel list)

Aside from the problems in counting from the commit graph, we also have another issue with using that: the Git data is not available during package build. Koji's architecture for package builds makes it impossible to convey the commit graph when the package build actually happens, because the derived number would be available when the SRPM is generated and then reset to zero when we build from the SRPM to produce the file RPMs. The reason for the Koji synchronization is because the Koji server itself can pass the value back to any stage at any time.

I don't actually like the way we do this either. I wanted the Release: field just completely rewritten at SRPM generation time, rather than relying on a macro to be expanded at each SRPM-based RPM build run. That makes it so SRPM generation is stable without requiring a service to consistently give you data. Unfortunately, people didn't like that idea.

  • I dislike the idea of having multiple rebuilds per commit (are those distinguished by git tags? where would changelog information for those subsequent rebuilds come from?) and consider them an anti-feature

This is basically the whole point of this feature. Without being able to do this, there is no point in doing this. The goal is to make it possible for dependency rebuilds to not require human intervention, because not everyone is equipped to be able to do them (you need to be a provenpackager to do that).

The problem we have right now is that auto-rebuilds are spammy in Fedora infrastructure because they are considered special events. We need to make it so they aren't.

  • if we would vote on these kinds of changes based on the idea and not the implementation, we should have accepted the rpkg template / macro based approach as well - so implementation details obviously do (and should) matter, especially if they will determine how robust and future-proof the system will be

The rpkg template approach takes this concept to its logical conclusion by making unusable RPM spec files that require another layer of preprocessing.

TL;DR: I like the idea, but dislike the implementation approach strongly enough that I will leave my vote as -1, at least for now.

To be blunt, I don't like autogenerating the %changelog from Git at all. The only thing I want is the ability to have Koschei do ordered rebuilds when dependency drift occurs and submit it rather than me having to do it by hand.

I don't actually like the way we do this either. I wanted the Release: field just completely rewritten at SRPM generation time, rather than relying on a macro to be expanded at each SRPM-based RPM build run. That makes it so SRPM generation is stable without requiring a service to consistently give you data. Unfortunately, people didn't like that idea.

But that's exactly what the proposed rpmautospec koji plugin does?
https://docs.pagure.org/fedora-infra.rpmautospec/principle.html

Ah, you're right. I missed that bit. :sweat_smile:

  • I dislike the idea of having multiple rebuilds per commit (are those distinguished by git tags? where would changelog information for those subsequent rebuilds come from?) and consider them an anti-feature

This is basically the whole point of this feature. Without being able to do this, there is no point in doing this. The goal is to make it possible for dependency rebuilds to not require human intervention, because not everyone is equipped to be able to do them (you need to be a provenpackager to do that).

The problem we have right now is that auto-rebuilds are spammy in Fedora infrastructure because they are considered special events. We need to make it so they aren't.

So we have a proposal and two ideas that cannot be reconciled. Either we require
a new commit for every release or we allow building multiple releases from a
single commit.
We have people who do not like the first approach and people who do not like the
second approach, so it looks like we're not going to find a solution that suits
everyone.

What do we want to do then?

Re: @decathorpe's comments:

  • I do, however, dislike the proposed approach, because the implementation sounds brittle (synchronizing git tags / koji build history / local git checkouts) instead of using the (immutable!) commit graph as single source of truth everywhere, which would always lead to consistent results (I had a POC of this, and shared the algorithm I used on the devel list)

It's less brittle than it might seem at first: when building the SRPM, Koji uses its build history to tag latest builds per Fedora release in the locally checked out dist-git repository and then runs the script to bump the release and fill in the changelog (which uses the git tags to do its thing). The Koji plugins will attempt to push the git tags it created to the dist-git repo, but won't fail on these steps.
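The tag-based bump described above can be sketched roughly as follows. This is only an illustration, not the rpmautospec code: the `build/<name>-<version>-<release>.<dist>` tag format and the `next_release` helper are assumptions made for the example.

```python
import re

# Hypothetical tag format, assumed for this sketch only:
#   build/<name>-<version>-<release>.<dist>
TAG_RE = re.compile(
    r"^build/(?P<name>.+)-(?P<version>[^-]+)-(?P<release>\d+)\.(?P<dist>[^-]+)$"
)


def next_release(tags, name, version, dist):
    """Return the release number to use for the next build.

    Looks at the git tags that record earlier builds of the same
    name/version/dist and returns the highest release plus one,
    or 1 if there was no earlier build.
    """
    releases = []
    for tag in tags:
        m = TAG_RE.match(tag)
        if m and (m["name"], m["version"], m["dist"]) == (name, version, dist):
            releases.append(int(m["release"]))
    return max(releases, default=0) + 1


tags = [
    "build/foo-1.2-1.fc35",
    "build/foo-1.2-2.fc35",
    "build/foo-1.1-7.fc35",
    "build/foo-1.2-5.fc34",
]
print(next_release(tags, "foo", "1.2", "fc35"))
```

Because the result depends on which tags are present, a build whose tag never reached dist-git would not be counted by a later local run, which is the divergence discussed further down the thread.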

We went with an attempt to mimic what (from our observations) most maintainers did manually because:

  • It allows to reliably opt into it at any time (changes in numbering schemes may or may not require to wait for a version bump, which hasn't happened in years in some packages).
  • It can cater for snapshot & pre-releases and, if possible, ensures a clean upgrade path within the confines of current versioning policy.
  • Minor point really, but produced NVRs look familiar.
  • All of which: barrier of entry is really low, both re: technical and "psychological" reasons.
  • I dislike the idea of having multiple rebuilds per commit (are those distinguished by git tags? where would changelog information for those subsequent rebuilds come from?) and consider them an anti-feature

As @ngompa mentioned, one of the objectives we had last year was to allow exactly that, to make mass rebuilds easier. Anyway, this is more a policy discussion and kinda out of scope here: any technical scheme that can support multiple builds from the same commit can support single ones.

  • if we would vote on these kinds of changes based on the idea and not the implementation, we should have accepted the rpkg template / macro based approach as well - so implementation details obviously do (and should) matter, especially if they will determine how robust and future-proof the system will be

Do you mean rpkg-utils instead of rpkg? My main reason against this approach is because it layers one macro/templating language on top of another. This forces people to keep both in their heads when dealing with spec files, which is why I'm against it.

Re: @ngompa's comments:

Aside from the problems in counting from the commit graph, we also have another issue with using that: the Git data is not available during package build. Koji's architecture for package builds makes it impossible to convey the commit graph when the package build actually happens, because the derived number would be available when the SRPM is generated and then reset to zero when we build from the SRPM to produce the file RPMs. The reason for the Koji synchronization is because the Koji server itself can pass the value back to any stage at any time.

Yeah, we had to jump through some hoops to pass all necessary info from Koji into the build root. :wink:

I don't actually like the way we do this either. I wanted the Release: field just completely rewritten at SRPM generation time, rather than relying on a macro to be expanded at each SRPM-based RPM build run. That makes it so SRPM generation is stable without requiring a service to consistently give you data. Unfortunately, people didn't like that idea.

The release fields in the SRPMs produced by Koji are stable and can be built like any other, i.e. without access to koji, pagure and yield the same NEVR. The release numbers for the different use cases are determined before the SRPM is built in Koji and then hardcoded. It's not exactly implemented how you describe it, but should have the same outcome. [NB: only now noticed that @decathorpe mentioned this already, thanks!]
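To illustrate what "determined before the SRPM is built and then hardcoded" could look like, here is a minimal sketch. The `hardcode_release` helper and the regular expression are hypothetical; the actual Koji plugin may work differently.

```python
import re

# Matches a Release: line that starts with %autorel or %autorelease,
# with or without braces. Anything after the macro (e.g. %{?dist}) is kept.
RELEASE_RE = re.compile(
    r"^(Release:\s*)%\{?autorel(?:ease)?\}?", re.MULTILINE
)


def hardcode_release(spec_text, release):
    """Replace the automatic release macro with a fixed number."""
    return RELEASE_RE.sub(
        lambda m: f"{m.group(1)}{release}", spec_text, count=1
    )


spec = "Name: foo\nVersion: 1.2\nRelease: %autorel%{?dist}\n"
print(hardcode_release(spec, 3))
```

A spec file rewritten this way no longer needs Koji or Pagure at build time, so rebuilding the SRPM elsewhere yields the same NEVR.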

  • I dislike the idea of having multiple rebuilds per commit (are those distinguished by git tags? where would changelog information for those subsequent rebuilds come from?) and consider them an anti-feature

This is basically the whole point of this feature. Without being able to do this, there is no point in doing this. The goal is to make it possible for dependency rebuilds to not require human intervention, because not everyone is equipped to be able to do them (you need to be a provenpackager to do that).

The problem we have right now is that auto-rebuilds are spammy in Fedora infrastructure because they are considered special events. We need to make it so they aren't.

Another reason for the feature is to reduce conflicts in these fields between concurrent PRs, but I agree, making mass rebuilds easier is a huge benefit.

To be blunt, I don't like autogenerating the %changelog from Git at all. The only thing I want is the ability to have Koschei do ordered rebuilds when dependency drift occurs and submit it rather than me having to do it by hand.

I wonder why you object to auto-generating the %changelog as an optional feature, in many cases there's substantial overlap between what goes into the git commit log and what goes in the RPM %changelog. As with the release field, opting into automated changelog generation is optional and what was generated is hardcoded (and for this, actually replaced in the spec file).

@pingou @nphilipp This will be on the agenda of the meeting that happens today at 14:00 UTC

The Koji plugins will attempt to push the git tags it created to the dist-git repo, but won't fail on these steps.

This is exactly what I am worried about. It should fail the build if this fails, otherwise there will be divergence between koji build history and git tags.

  • It allows to reliably opt into it at any time (changes in numbering schemes may or may not require to wait for a version bump, which hasn't happened in years in some packages).
  • It can cater for snapshot & pre-releases and, if possible, ensures a clean upgrade path within the confines of current versioning policy.
  • Minor point really, but produced NVRs look familiar.
  • All of which: barrier of entry is really low, both re: technical and "psychological" reasons.

All the more reason for me not to like the current approach. It tries to do too much. Why try to accommodate all versioning styles that are in use in Fedora, instead of using a simple algorithm that just spits out an incrementing integer that can be plugged into the Release tag just like every other macro?

  • It allows to reliably opt into it at any time (changes in numbering schemes may or may not require to wait for a version bump, which hasn't happened in years in some packages).
  • It can cater for snapshot & pre-releases and, if possible, ensures a clean upgrade path within the confines of current versioning policy.
  • Minor point really, but produced NVRs look familiar.
  • All of which: barrier of entry is really low, both re: technical and "psychological" reasons.

All the more reason for me not to like the current approach. It tries to do too much. Why try to accommodate all versioning styles that are in use in Fedora, instead of using a simple algorithm that just spits out an incrementing integer that can be plugged into the Release tag just like every other macro?

Because that would only work if we were going to just do a rebuild counter. If we were just going to do only that, then we could just modify %dist to do incrementing build counters. Everyone wants to change to not managing Release and %changelog entirely, which necessitates quite a bit more.

We talked about this at the meeting today.


I'd like FESCo members to state their preference for one of the following three options. Please, don't be creative, pick exactly one option.

Option 1

The implementation proposed in the change proposal that allows building multiple builds from one commit but requires Koji to push/read git tags is preferred over the implementation from Option 2.

Option 2

The calculation of release number explained by @decathorpe that only reads the previous release/version and git commits history, does not require any tags and does not allow building multiple builds from one commit is preferred over the implementation from Option 1.

Option 3

No strong preference for either option 1 or 2.


At the meeting next week (or once all FESCo members state their preference) we'll vote on approving/rejecting the most popular option from 1/2. If they are equally popular, we vote on option 1 (because that's what the change owners prefer).

  • It allows to reliably opt into it at any time (changes in numbering schemes may or may not require to wait for a version bump, which hasn't happened in years in some packages).
  • It can cater for snapshot & pre-releases and, if possible, ensures a clean upgrade path within the confines of current versioning policy.
  • Minor point really, but produced NVRs look familiar.
  • All of which: barrier of entry is really low, both re: technical and "psychological" reasons.

All the more reason for me not to like the current approach. It tries to do too much. Why try to accommodate all versioning styles that are in use in Fedora, instead of using a simple algorithm that just spits out an incrementing integer that can be plugged into the Release tag just like every other macro?

Because that would only work if we were going to just do a rebuild counter. If we were just going to do only that, then we could just modify %dist to do incrementing build counters.

On the top of that, there is the aspect of opting-in and opting-out. With the
simple approach, opting-out after opting-in is harder.

For completeness' sake there is a fourth option:

Option 4

Reject this change proposal

I've deliberately not included this option. If a FESCo member wants to reject the proposal, they can still have a preference for option 1 or 2 in case it is actually approved. If they want to reject the proposal and don't have a preference, they select option 3. Thanks.

I choose option 2 (kind of obvious, but let's make it black on white).

I'm still leery of this change, but of the two, I guess I choose Option 2.

Option 1: +4
Option 2: +4

By my count, that leaves @zbyszek as the tie-breaking vote. If he abstains, it becomes Option 1.

That's what one gets for going on vacation ;) I wanted to catch up on this earlier this week, but I only started reading up on this today.

I'm strongly leaning towards option 2, but before voting, I'd like to have a deeper understanding of the opposite opinion. Basically, it seems that we have differing views on what is desirable, more on this below, which determines the final preference on the options. Those views are mostly independent from the implementation, and it seems that we have slightly different goals.

Here are my initial comments (mostly questions at this point):

  • "multiple rebuilds per commit" — why would we want to do this? It seems like an anti-feature to me. In particular

    The problem we have right now is that auto-rebuilds are spammy in Fedora infrastructure because they are considered special events. We need to make it so they aren't.

I consider the fact that I have an entry in the changelog that says "Rebuilt for F34" to be a good thing. It tells me that the package got the new compiler settings and dependencies and whatnot. I certainly wouldn't want this to happen silently. What is currently bad is the fact that this automatic rebuild causes conflicts. But if we have %autorel[ease] and %autochangelog, this conflict will go away.

  • the changelog is generated from git history, but the release tag is not. This seems inconsistent — if we can do the former, why not also do the latter?

I also think that the potential desynchronization between koji and dist-git is a significant problem. The infra is unreliable, and will always be, and even if this were to happen in one of every 10k builds, with enough builds it's going to become an issue. Having instead a "single source of truth" is a very desirable property.

About the implementation:

  • I think the approach of using changelog file is a genius idea, because it allows people to correct typos and adjust the changelog when appropriate post factum in a very natural way. But the change page has three different descriptions of how the changelog is generated. "git history of the spec file", "all commits made to the repo after the last change of changelog file", "all commits involving *.spec, *.patch", and none of the three seems appropriate. For example, in my systemd repo, I have a triggers.systemd file which is quite important. If I modify this file and rebuild, I want to see the appropriate entry in the changelog. With the proposed filtering, I'd have to either make a fake change to .spec, or modify some filters. There is also a bunch of .conf, .py, .attr, and .prov files which are also important. Overall, this filtering seems brittle and unnecessary. (If this is motivated by source-git type repos, then I think we should handle those in a custom way instead.)
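The filtering concern can be made concrete with a small sketch. The `changelog_commits` helper and the sample commit data below are hypothetical and only mimic the "*.spec, *.patch" rule quoted from the change page.

```python
from fnmatch import fnmatch


def changelog_commits(commits, patterns=("*.spec", "*.patch")):
    """Return subjects of commits that touch a file matching any pattern.

    `commits` is a list of (subject, [changed files]) pairs, oldest first.
    """
    return [
        subject
        for subject, files in commits
        if any(fnmatch(f, p) for f in files for p in patterns)
    ]


commits = [
    ("Update to 246", ["systemd.spec"]),
    ("Fix trigger ordering", ["triggers.systemd"]),
    ("Backport fix", ["0001-fix.patch"]),
]
# With the default filter, the triggers.systemd change gets no entry.
print(changelog_commits(commits))
# Without filtering, every commit produces an entry.
print(changelog_commits(commits, patterns=("*",)))
```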

It can cater for snapshot & pre-releases and, if possible, ensures a clean upgrade path within the confines of current versioning policy.

I don't consider this an important feature. "Current" versioning policy is something that needs to be deprecated, and we should switch to tilde-caret-versioning [https://pagure.io/packaging-committee/issue/904]. When this happens, the release field is finally free of version shenanigans and stuff like %autorel[ease] becomes easier. Since the %autorel[ease]/%autochangelog stuff is opt-in, it would be totally OK to only allow this for packages which don't use the deprecated versioning.

  • Release: %autorel vs. %changelog %autochangelog

One is abbreviated, the other not. This inconsistency is annoying. Even when talking about the two fields above, %autorelease is much more natural. Please make it Release: %autorelease.

  • Is stg-koji a thing? No package in Fedora seems to provide such a binary.

I consider the fact that I have an entry in the changelog that says "Rebuilt for F34" to be a good thing. It tells me that the package got the new compiler settings and dependencies and whatnot. I certainly wouldn't want this to happen silently. What is currently bad is the fact that this automatic rebuild causes conflicts. But if we have %autorel[ease] and %autochangelog, this conflict will go away.

But this in itself is a problem, because it implies that in order to do a rebuild, you need write access to the repository. This is a major inhibitor for packagers who maintain library packages and such, because they cannot update them without going in and bumping each package. Since we don't grant packagers write access to all repositories by default, that means that this continues to be a problem.

Now, let's say we don't care about this problem. Fine, then what about the fact that it's absolutely utterly tedious to rebuild everything when a machine could do it for you? Updating a library and submitting it to Bodhi should trigger the automatic creation of a side-tag and trigger rebuilds for everything. If everything succeeds, then it's a pass and they all merge together. If there are failures, then it should be "stuck" until they're all fixed. But we can't do that as long as we have the restriction that we need people to have write access to repositories to do that.

This problem is a subtle reinforcement of the concept that packages are "owned" instead of "stewarded" by a maintainer. We have been trying for years to kill that off, and this is one of the major things left that promotes that mindset.

I don't consider this an important feature. "Current" versioning policy is something that needs to be deprecated, and we should switch to tilde-caret-versioning [https://pagure.io/packaging-committee/issue/904]. When this happens, the release field is finally free of version shenanigans and stuff like %autorel[ease] becomes easier. Since the %autorel[ease]/%autochangelog stuff is opt-in, it would be totally OK to only allow this for packages which don't use the deprecated versioning.

I am working on a PR to update our versioning guidelines to use snapshot data in Version field. This will not be an ongoing concern.

On IRC, @ngompa clarified that he meant rebuilds of packages after an incompatible change, for example an so-version bump. I see the point: with option 1, it becomes fairly easy to allow a bunch of packages to be rebuilt in a side tag, even when the packager does not have permission to write to the repo. But this doesn't convince me. In fact, I want every rebuild to have a commit message that tells me why the rebuild happened. And if we want to allow this, the scheme could not be opt-in, since all dependent packages would need to be ready to allow a rebuild. Overall, this would be a significant departure from how we currently do things, and I don't think it is warranted.

OK, so Option 2 it is.

I'm generally not happy with the way changelog management works in rpmautospec. I would rather see a model where "final" generated changelogs are written back as annotations on tags/commits so that they can be edited later, rather than dealing with a changelog file that must be completely regenerated in order to edit older entries.

%autorelease is much more natural

I agree completely. I think things like this can be handled post approval as well.

I'm generally not happy with the way changelog management works in rpmautospec. I would rather see a model where "final" generated changelogs are written back as annotations on tags/commits so that they can be edited later, rather than dealing with a changelog file that must be completely regenerated in order to edit older entries.

I agree but not enough to block this.


Given that option 2 "won" slightly, let's vote on this:

Proposal: The change is approved with a modification described above as option 2.

Proposal: The change is approved with a modification described above as option 2.

+1

Proposal: The change is approved with a modification described above as option 2.

+1

Metadata Update from @churchyard:
- Issue untagged with: meeting

3 years ago

Proposal: The change is approved with a modification described above as option 2.

+1

"Just give me a macro that evaluates to an ever-auto-incrementing integer that I can stick wherever in the Release tag I think fits best" is both the easiest and the most flexible solution here, I think ... and also makes it very easy for people to opt-in, even if they use strange custom versioning schemes. It could even be integrated with the %baserelease support that's already there in some places.
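A minimal sketch of such a counter, assuming a linear history and assuming the release restarts whenever the Version field changes. The `release_for_head` helper is hypothetical, and the algorithm shared on the devel list may differ in its details.

```python
def release_for_head(versions):
    """Compute the release number for the newest commit.

    `versions` holds the Version field of the spec file at each commit,
    oldest first. The release is the number of consecutive commits,
    counted back from the newest one, that carry the same Version.
    """
    if not versions:
        raise ValueError("history must contain at least one commit")
    head = versions[-1]
    count = 0
    for version in reversed(versions):
        if version != head:
            break
        count += 1
    return count


history = ["1.0", "1.0", "1.1", "1.1", "1.1"]
print(release_for_head(history))
```

In this model the result depends only on the commit history, so every checkout computes the same number, and a new release requires a new commit.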

After a week with the updated proposal, I count the vote as (+4,1,-0). Processing the change as approved with the following modification:

I would rather see a model where "final" generated changelogs are written back as annotations on tags/commits so that they can be edited later, rather than dealing with a changelog file that must be completely regenerated in order to edit older entries.

Metadata Update from @bcotton:
- Issue tagged with: pending announcement

3 years ago

I don't believe we have approved that modification.

Announced: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/SGAC6RWQXIQ62HFHB4SIG5HPJ2P3RPTP/

APPROVED (+4, 1, 0) with the following modification:
Provide a release number as a macro that evaluates to an
ever-auto-incrementing integer that can be inserted somewhere in the
Release tag. Only read the previous release/version and git
commit history, do not require any tags, and do not allow building
multiple builds from one commit.

I hope that text reflects our discussion accurately. I spliced the relevant quotes from @decathorpe and @churchyard and changed the grammar and sentence structure to make it legible without context.

Metadata Update from @zbyszek:
- Issue close_status updated to: Accepted
- Issue status updated to: Closed (was: Open)

3 years ago

Metadata Update from @zbyszek:
- Issue untagged with: pending announcement

3 years ago
