#189 Longer changelog entries
Closed: Fixed 2 years ago by nphilipp. Opened 3 years ago by nphilipp.

Story Time

As a package maintainer using rpmautospec,

I want to be able to specify longer changelog entries than just the git commit log subject,

because they are limited in length by convention.

Acceptance Criteria

  • Git commit logs with the log subject line ending in and the following paragraph beginning with an ellipsis(*) should be merged for the RPM changelog entry.

Background

Git commit log messages usually adhere to a certain structure (see here for details): the "subject" which should be up to 50 characters long separated from the optional "body" (containing detailed information) by a blank line.

RPM changelog entries aren't limited in length and people often write changelog entries longer than 50 characters (for instance, to mention a Bugzilla ticket or to attribute someone else who contributed to the change).

(*) Ellipsis: Three dots indicating omission of something. For the purpose of this, I'd make it react to three (or more) literal dots (with or without separating white space), or the Unicode character … (U+2026).
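
A minimal sketch of how the ellipsis definition above could be matched. The pattern and the helper names are illustrative assumptions, not taken from rpmautospec itself:

```python
import re

# Hypothetical pattern: three or more literal dots, with or without
# separating white space, or the single Unicode ellipsis (U+2026).
ELLIPSIS_RE = re.compile(r"(?:\.\s*){3,}|\u2026")


def starts_with_ellipsis(text: str) -> bool:
    """True if the text begins with an ellipsis (leading space ignored)."""
    return ELLIPSIS_RE.match(text.lstrip()) is not None


def ends_with_ellipsis(text: str) -> bool:
    """True if the text ends with an ellipsis (trailing space ignored)."""
    return re.search(rf"(?:{ELLIPSIS_RE.pattern})\s*$", text) is not None
```

Two dots, or dots separated by other characters, are deliberately not treated as an ellipsis here.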


Metadata Update from @nphilipp:
- Issue tagged with: F35 Change, Needs Deployment, Nice to Have

3 years ago

This would be really useful. Right now to ensure bugs are attached in Bodhi I have to use commits like https://src.fedoraproject.org/rpms/python-TestSlide/c/10ab1db9296f081a26a17d57115d782e6749d4a2?branch=rawhide which work but are a bit ugly. I'd love to be able to put some extra lines or bullets in the body of the commit, and have rpmautospec automatically merge them in. So, for example:

Backport PR#308 for a typechecking issue

Fixes: RHBZ#1981718

would render as:

- Backport PR#308 for a typechecking issue
- Fixes: RHBZ#1981718

I'd like to avoid guessing whether or not parts of the commit log body should be added to the changelog entry, that's why I thought about using the ellipsis as kind of a continuation character. What do you think about that?

I am not sure if this is a separate issue or another facet of this one:
I think everything that is written into Git log entry
should be carried over to the rpm changelog entry.

In my opinion, the usefulness of rpmautospec
is in that the need to think about two things,
1) Git changelog and 2) rpm changelog
is replaced with only one thing, Git changelog.
If the conversion is not really, really close to verbatim copy,
rpmautospec user still needs to think about
1) Git changelog and 2) conversion rules from Git changelog to rpm changelog.
Since it is completely normal to write multiline rpm changelog entries,
not supporting them in the conversion is a major deviation.

For me, this weakness is serious enough
that I have stopped converting my packages to rpmautospec
until some resolution is found.

I'd like to avoid guessing whether or not parts of the commit log body should be added to the changelog entry, that's why I thought about using the ellipsis as kind of a continuation character. What do you think about that?

All of the commit log body should be added to the changelog entry, isn't this the whole point of rpmautospec?!

Nah, the level of detail in git commit log is different from what you want in an RPM changelog.

Git commit logs often have different sections with different purposes, the subject (what was changed?), the body (why and how?), signed-off-by lines – see https://chris.beams.io/posts/git-commit/ for more info.

Today, we only use the "commit log subject" for the RPM changelog, i.e. what is changed, but we want to give people a way to exceed the limits put on subject lines by convention, i.e. only the first line of a mere 50 characters, so more detail (like fixed BZ tickets or similar) can be included.

I agree very much with @oturpe and @eclipseo above… I'd go even further and say that current behaviour is borked from both sides: both too much and too little is included in the generated changelog.

As mentioned, trimming everything after the first line doesn't make any sense. The first line is not special. No special syntax is necessary to specify that the body should be included: just include the whole thing please until I say otherwise.

Example from my package (with manual changelog):

* Mon Mar 22 2021 Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> - 248~rc4-3
- Fix hang when processing timers during DST switch in Europe/Dublin timezone (#1941335)
- Fix returning combined IPv4/IPv6 responses from systemd-resolved cache (#1940715)
  (But note that the disablement of caching added previously is
  retained until we can do more testing.)
- Minor fix to interface naming by udev
- Fix for systemd-repart --size

I expect that I can commit this as

Version 248~rc4-3

- Fix hang when processing timers during DST switch in Europe/Dublin timezone (#1941335)
- Fix returning combined IPv4/IPv6 responses from systemd-resolved cache (#1940715)
  (But note that the disablement of caching added previously is
  retained until we can do more testing.)
- Minor fix to interface naming by udev
- Fix for systemd-repart --size

and get an autogenerated changelog entry that looks pretty close to the manual one.

Git commit logs often have different sections with different purposes, the subject (what was changed?), the body (why and how?), signed-off-by lines

That's partially true, but there is no distinction that the body is not relevant for users. Sometimes it might not be, but as the example above shows, there is no reason to assume that.

Signed-off-by lines are completely unnecessary in Fedora, because all contributions require agreement when the FAS account is created, so all contributions are implicitly licensed. I know that there are people who like to insert those, but it serves no purpose. But anyway, please just filter those lines out of the autogenerated changelog. (I'd also filter out any lines matching /\(cherry picked from [a-f0-9]+\)/.)
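
The filtering suggested above could be sketched roughly as follows. The function name is hypothetical; the second pattern extends the regular expression from the comment with an optional "commit " because `git cherry-pick -x` writes "(cherry picked from commit <hash>)":

```python
import re

# Lines matching any of these patterns are dropped (illustrative only).
FILTERED_LINE_RES = (
    re.compile(r"^Signed-off-by:\s", re.IGNORECASE),
    re.compile(r"\(cherry picked from (?:commit )?[a-f0-9]+\)"),
)


def drop_filtered_lines(message: str) -> str:
    """Remove Signed-off-by and cherry-pick lines from a commit message."""
    kept = [
        line
        for line in message.splitlines()
        if not any(rx.search(line) for rx in FILTERED_LINE_RES)
    ]
    # Trailing blank lines left behind by removed trailers are trimmed.
    return "\n".join(kept).rstrip()
```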

OTOH, we need commits that are NOT included in the changelog. (For example, if I commit 'Cleanup whitespace in the spec file' I really really don't want users to see this). So I think it should be possible to annotate commits or parts of commits for filtering out of the changelog.

For example:
(no changelog) in the commit body to completely filter out the commit from the autogenerated changelog, and no changelog: to filter out the part starting at that line. This would allow the "why and how" parts to be included if appropriate.
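
As a rough sketch of the two markers proposed above (the function name and the exact matching rules are assumptions for illustration, not an existing rpmautospec feature):

```python
from typing import Optional


def changelog_text(message: str) -> Optional[str]:
    """Return the text to put in the changelog, or None to skip the commit."""
    lines = message.splitlines()
    # "(no changelog)" on a line of its own hides the whole commit.
    if any(line.strip() == "(no changelog)" for line in lines):
        return None
    kept = []
    for line in lines:
        # "no changelog:" hides everything from that line onwards.
        if line.strip().lower().startswith("no changelog:"):
            break
        kept.append(line)
    return "\n".join(kept).rstrip()
```

With this shape, everything is included by default and the maintainer opts out explicitly.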

My concerns would be resolved by @zbyszek's proposal
of including everything by default,
but having keywords like no changelog:
to exclude parts of the commit message.
That way, I could still write normal multi-line changelog entries
and could not accidentally create a truncated changelog entry,
which were my concerns.

I often include longer messages in commits, especially when backporting from upstream. I would prefer not to always include the whole commit message. I do not think the version should be included inside the commit; it should be computed from the last build (tag?).

I would propose that the first paragraph, delimited by an empty line, be included. I think empty lines should not be part of the changelog anyway. There would be an option to add more paragraphs with a special symbol at the end.

Those both would be accepted:

- Backport PR#308 for a typechecking issue
- Fixes: RHBZ#1981718
- Fix hang when processing timers during DST switch in Europe/Dublin timezone (#1941335)
- Fix returning combined IPv4/IPv6 responses from systemd-resolved cache (#1940715)
  (But note that the disablement of caching added previously is
  retained until we can do more testing.)
- Minor fix to interface naming by udev
- Fix for systemd-repart --size

With manual request to include more paragraphs. Commit

Very important change (#1234) *2

Also important note to changelog

Last note to changelog

Details for dist-git log only...

*2 would mean 2 next paragraphs should be included. ** might mean whole commit. Changelog would render as:

- Very important change (#1234)
- Also important note to changelog
- Last note to changelog

Maybe parameter to %autochangelog might change default number of paragraphs included, so maintainers do not have to always type special characters and commit according to their wishes.
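
The `*N` / `**` suffix idea could be sketched like this. All names are hypothetical, and `default_extra` stands in for the suggested %autochangelog parameter:

```python
import re
from typing import List

# Hypothetical suffix on the subject: "*2" = two more paragraphs, "**" = all.
SUFFIX_RE = re.compile(r"\s*\*(\d+|\*)$")


def select_paragraphs(message: str, default_extra: int = 0) -> List[str]:
    """Return the paragraphs that would become changelog items."""
    paragraphs = [
        " ".join(p.split())
        for p in re.split(r"\n\s*\n", message.strip())
        if p.strip()
    ]
    if not paragraphs:
        return []
    subject, rest = paragraphs[0], paragraphs[1:]
    extra = default_extra
    match = SUFFIX_RE.search(subject)
    if match:
        subject = subject[: match.start()]
        extra = len(rest) if match.group(1) == "*" else int(match.group(1))
    return [subject] + rest[:extra]
```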

The summary line in git commits is "prime real estate". It is shown in abbreviated views like gitk, gitg, git log --pretty=oneline, and we should not make the developer experience worse by inserting some unrelated information into this line. (The fact that there is some paragraph in the output that should be processed in some way is irrelevant when we are showing summaries.)
Thus, I very much dislike any proposal which wants to add annotations in the summary (ellipsis as suggested by @nphilipp or */2 as suggested by @pemensik), sorry.

I also think this would lead to a bad user experience: people would edit the text below, and then forget to adjust the summary line. Remember that there is no "preview" for this…

I would propose that the first paragraph, delimited by an empty line, be included. I think empty lines should not be part of the changelog anyway.

That is a neat idea. Nevertheless, it doesn't seem to work well in practice: there will often be commits where the first paragraph should not be included in the changelog. (Stuff like "Update to new version.\n\nAlso change tabs to spaces in the spec file." or "Update to new version.\n\nNote to self: this version should not be backported to F34.".) So maintainers would need to learn and use the syntax to omit the paragraph from the changelog. Also, note that sometimes the whole commit should be omitted from the changelog as discussed above, which becomes awkward with this syntax. And finally, the assumption that it's just the first paragraph that matters is unwarranted. There certainly are changelogs with longer descriptions. We format them as one paragraph with a list of items after "-", but that's just our idiosyncratic formatting in the %changelog. (Vide rpm -q --changelog gcc or rpm -q --changelog systemd for examples of multi-para entries). In the commit message it would be natural to write them as paragraphs, and only reformat into the list for the %changelog.

Also, the format should be dead simple and self-explanatory. Anything with special rules and magical formatting symbols that work at a distance is (IMO) the wrong approach.

The lack of a "preview" should be easy to fix in the tool. Displaying only the unpushed commits, and the changelog they would generate once pushed and built, should not be a hard task. I often use gitk before pushing. I would certainly be interested in the ability to view the generated output before pushing. I am not sure whether that is already possible.

My syntax would allow omitting the whole commit by appending *0, for example. I guess it is not that offending.

Anyway, the more I think about it, it seems details should not be included in changelog. Just first line would be usually okay, except related bug number might be attached to it from commit body. I think it would be quite cool, if built package would contain URL to build specific history with any details. I expect most people consult changelog usually to find some tracker number for more details anyway. As long as details link exist, changelog lines do not have to be too detailed.

For example, we have public changelogs on dist-git available to public, but it is not easy to track which build is related to which commit given build links. It would be nice, if build process could insert build source link somewhere into rpm package.

Example with systemd-249.1-1.fc35:
https://koji.fedoraproject.org/koji/buildinfo?buildID=1782213
https://src.fedoraproject.org/rpms/systemd/c/c61b9c5d29e906f346ac080c2a03fde3f84d40b0?branch=rawhide
https://bodhi.fedoraproject.org/updates/FEDORA-2021-fe1baabd93

We have no simple place to create details link from changelog. Git can hold any details possible. We have clickable interface for it. We have multiple places for details, but none of them is available from dnf info systemd or rpm -qi systemd. I think single line would be usually enough, IF there was simple way to click single URL, leading to more hyperlinks. Some form of standardized bugzilla numbers in commits would certainly help. Best with URL built into the package. Is not long plain-text only somehow obsolete these days?

Would be the first line/paragraph enough, if only supported metadata like bug references were extracted from following message? That is what @dcavalca mentioned anyway.

What I have in mind is something like BIND9 Release notes has. Short summary with links to details. Can we make our package history similar to that?

As a practical compromise, can we:

I do not like Resolves: or Fixes: in changelog. Could we support also just (#bugnumber) format, as used by some developers including glibc? Could be checked against bugzilla as bug changed less than 2 months ago or something similar to rule out random unrelated numbers.

* Fri Jun 18 2021 Carlos O'Donell <carlos@redhat.com> - 2.28-161
- Improve POWER10 performance with POWER9 fallbacks (#1956357)

I would not be against the commit message generating this being:

Improve POWER10 performance with POWER9 fallbacks

Bla bla bla about why was it done this way.

Fixes: rhbz#1956357

I think explicit keywords fixes or resolves in changelog makes it more ugly.
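
A small sketch of that idea: pull the bug number out of a `Fixes:` or `Resolves:` line in the body and append it to the subject in the `(#bugnumber)` form. The names and the accepted trailer spellings are assumptions:

```python
import re

# Hypothetical trailer format: "Fixes: rhbz#1956357" or "Resolves: #1956357".
BUG_RE = re.compile(
    r"^(?:Fixes|Resolves):[ \t]*(?:rhbz)?#(\d+)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def subject_with_bugs(message: str) -> str:
    """Return the subject line with referenced bug numbers appended."""
    lines = message.strip().splitlines()
    subject = lines[0].strip() if lines else ""
    bugs = BUG_RE.findall(message)
    if bugs:
        subject += " (" + ", ".join(f"#{bug}" for bug in bugs) + ")"
    return subject
```

The check against Bugzilla mentioned above is left out; this only handles the text transformation.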

As a practical compromise, can we:

This doesn't work for me at all. I need to be able to include lengthy explanations in the changelogs in my packages.

A compromise that I'd find workable would be to invert the condition, and require an explicit annotation to include the text:

commit:
Some subject line

changelog: Fixes the issue with foobar by doing bar bar and
aasdfsadf  fasdfsa  asd fsad f sadf dsa f dsaf dsa fas df dsa fads.
See https://fp.o/wiki/Change/something.
→
%changelog
* Fri Jun 18 2021 ... - 2.28-161
- Some subject line
- Fixes the issue with foobar by doing bar bar and
   aasdfsadf  fasdfsa  asd fsad f sadf dsa f dsaf dsa fas df dsa fads.
   See https://fp.o/wiki/Change/something.

It would work for me, but I think the opposite is still better: it's just less surprising to include everything by default. With opt-in, the formatting would also be negatively affected (the line with the annotation would be shorter after the annotation is stripped out, unless the maintainer remembers to make it extra long in the commit message). Also, I think opt-in is inferior because people who want to write very short changelogs would usually also write very short commit messages. So we'd be optimizing the process for people who wouldn't use the result anyway.

Also, some way to annotate no-changelog commits is also necessary. You didn't address this at all.

Also, some way to annotate no-changelog commits is also necessary. You didn't address this at all.

That is because I consider that a separate issue: https://pagure.io/fedora-infra/rpmautospec/issue/206

If Fixes: format metadata is tested, I think Changelog: (skip|ignore) could be sufficient, with a similar look & feel.

What if we detected just ^%changelog$ in the commit message and attached anything following it as a preformatted changelog? It would not require any formatting logic and is simple enough to use and implement. What if one needs to override just the subject line?
It still saves writing the same thing multiple times. I think copying the subject line is a small change, depending on whether you want it there or not.

commit:
Some subject line

%changelog
- Some subject line
- Fixes the issue with foobar by doing bar bar and
   aasdfsadf  fasdfsa  asd fsad f sadf dsa f dsaf dsa fas df dsa fads.
   See https://fp.o/wiki/Change/something.
→
%changelog
* Fri Jun 18 2021 ... - 2.28-161
- Some subject line
- Fixes the issue with foobar by doing bar bar and
   aasdfsadf  fasdfsa  asd fsad f sadf dsa f dsaf dsa fas df dsa fads.
   See https://fp.o/wiki/Change/something.
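
A minimal sketch of the `%changelog` override described above, assuming the marker sits on a line of its own and that the default (without the marker) is just the subject line. The function name is hypothetical:

```python
def changelog_body(message: str) -> str:
    """Return the changelog text for one commit message."""
    lines = message.splitlines()
    for index, line in enumerate(lines):
        # Equivalent of matching ^%changelog$ (trailing space tolerated).
        if line.rstrip() == "%changelog":
            # Everything after the marker is taken as preformatted text.
            return "\n".join(lines[index + 1:]).strip("\n")
    # No override present: fall back to the subject line only.
    return "- " + lines[0] if lines else ""
```

Because the text after the marker is copied as-is, the indentation of continuation lines is preserved.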

What if we detected just ^%changelog$ in commit message and attach anything following it as preformatted changelog?

I like this.

Let's also make '%changelog: skip' skip the changelog entry, solving #206 at the same time.

Do I need to repeat "Some subject line"? E.g. does using %changelog drop the subject line from the changelog?

Let's also make '%changelog: skip' skip the changelog entry, solving #206 at the same time.

In fact, it would be sufficient if we require changelog to contain non-whitespace and then add %changelog to the commit message end. A bit hackish; Changelog: skip would look better together with Signed-off-by: and similar commit tags.

Do I need to repeat "Some subject line"? E.g. does using %changelog drop the subject line from the changelog?

Yes, I would make it possible to omit the subject line from changelog this way. Would work both for those who want it there and who do not. I would just add starting "- " if not already starting with it. Perhaps %changelog add might include subject line, saving manual copy & paste.

In this model, would something like:

commit:
do some stuff

Closes: rhbz#12345

do the right thing? Or would I need to use %changelog explicitly in the commit message?

Including everything after %changelog verbatim collides with Signed-off-by: and similar lines which are at the end of the commit log message. In order to not stand the current behavior on its head, I'm leaning towards something like this (magic keywords to skip changelog notwithstanding):

  • the subject line will always be included
  • have a continuation character if whatever the maintainer wants in the RPM changelog doesn't fit in the subject line (50 chars) without having to violate "customs" -- you're always free to violate them nonetheless :smiley:
  • include lines that begin with a dash or subsequent lines that are indented

This isn't fully fleshed out because I want to read through the conversation properly, which I didn't have time for today.

do the right thing? Or would I need to use %changelog explicitly in the commit message?

%changelog is proposed as override to default behaviour of subject + bug number extraction. Your example still should result into your output, unless you used override. Should not be required always, on the contrary. Unless we update bodhi to recognize (#1234) at the end of line as RHBZ id, Closes: rhbz#12345 should be appended after comma or new line dash as it is.

Including everything after %changelog verbatim collides with Signed-off-by: and similar lines which are at the end of the commit log message.

Sure, that is why I proposed to put it into the end. I doubt anyone would ever want Signed-off-by: in changelog, so %changelog override should be after them. I expect cherry-picks including %changelog done by different person without his own Signed-off-by: already in commit message would be quite rare. So yes, in that unlikely case, you would have to move your signature above %changelog manually. Would it be problem on any package you know?

Okay, realized if I use git commit -sF <textfile>, it would create it non-interactive. As a workaround, %changelog would terminate not always at the end of commit message, but at the first empty line following it. Would work well with Signed-off-by:, right?

If followed by empty line, it would solve also issue #206.

How about taking "- " at the beginning of a line as the distinguisher? For example, the log message

subject line

message body
several lines

would end up in the changelog as

- subject line

as it is now, but a log message

subject line

We explain some reasoning for fellow packagers. This leads to some changes.
- some short change description
- maybe a bz#08150815
We can still explain more.

ends up in the changelog as

- some short change description
- maybe a bz#08150815

This should be mostly backwards compatible, fully flexible, allows overriding the commit subject (because only dashed lines are taken if there is one) and has no issues with trailers at all. Also, the commit log remains legible.
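
The rule above can be sketched in a few lines. This is an illustration with a hypothetical function name, not existing rpmautospec code:

```python
from typing import List


def dashed_items(message: str) -> List[str]:
    """Take "- " lines from the body; fall back to the subject if none."""
    lines = message.splitlines()
    if not lines:
        return []
    items = [line for line in lines[1:] if line.startswith("- ")]
    # Dashed lines override the subject; otherwise behave as today.
    return items or ["- " + lines[0]]
```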

How about taking "- " at the beginning of a line as the distinguisher?

That could work too… But how do you mark commits which shouldn't end up in the changelog at all?

```
subject line

We explain some reasoning for fellow packagers. This leads to some changes.
- some short change description
- maybe a bz#08150815
We can still explain more.
ends up in the changelog as
- some short change description
- maybe a bz#08150815
```
This should be mostly backwards compatible, fully flexible, allows overriding the commit subject (because only dashed lines are taken if there is one) and has no issues with trailers at all. Also, the commit log remains legible.

I would suggest requiring at least an empty line before and after a changelog entry marked just by - at the start of the line, to avoid unintended parts of the commit message landing in the changelog.

subject line

We explain some reasoning for fellow packagers. This leads to some changes.

- some short change description
- maybe a bz#08150815

We can still explain more.

The above makes parsing relatively easy, and lowers the probability that a random part of the message lands in the changelog without anyone wanting it there.
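
One way to sketch this stricter variant: only blocks separated by empty lines and made up entirely of "- " lines count as changelog items. The function name and the fallback to the subject line are assumptions:

```python
import re
from typing import List


def dashed_blocks(message: str) -> List[str]:
    """Collect "- " lines only from blocks that contain nothing else."""
    blocks = [b for b in re.split(r"\n[ \t]*\n", message.strip()) if b.strip()]
    if not blocks:
        return []
    items: List[str] = []
    for block in blocks[1:]:  # blocks[0] holds the subject line
        lines = block.splitlines()
        if all(line.startswith("- ") for line in lines):
            items.extend(lines)
    return items or ["- " + blocks[0].splitlines()[0]]
```

Indented continuation lines inside a dashed block are not handled in this sketch.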

Metadata Update from @nphilipp:
- Issue assigned to nphilipp

3 years ago

Yeah but empty lines are easily missed (to be inserted or left out) and this difference in behaviour will not be very obvious. So here's what I'll implement so people can have longer changelog lines and entries with more than one item:

  • If the subject line isn't long enough for the first item, you can continue it in the body if it begins with an ellipsis (… or ..., i.e. the Unicode ellipsis character or three or more dots).
  • If you want more than one entry, add them prefixed with a dash to the top of the body.

E.g. this commit log…:

fix a buffer overrun in the JPEG file loader

...(rhbz#1234567)
- rebuild against libfoo-1.23
- remove compat cruft for ancient Fedora versions

Further details which have no business being in the changelog.

Signed-off-by: Nils Philippsen <nils@redhat.com>

…would generate a %changelog entry like this:

* Fri Jan 14 2022 Nils Philippsen <nils@redhat.com> - 3.45-5
- fix a buffer overrun in the JPEG file loader (rhbz#1234567)
- rebuild against libfoo-1.23
- remove compat cruft for ancient Fedora versions
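
A simplified sketch of the two rules above, not the actual rpmautospec implementation. It assumes that only the first block of the body (up to the next empty line) is considered and that non-dash lines there continue the current item:

```python
import re
from typing import List

# Three or more dots (optionally spaced) or U+2026 at the start of a line.
ELLIPSIS_RE = re.compile(r"^\s*(?:(?:\.\s*){3,}|\u2026)\s*")


def changelog_items(message: str) -> List[str]:
    """Return the changelog items generated from one commit message."""
    subject, _, body = message.strip().partition("\n")
    items = [subject.strip()]
    first_block = re.split(r"\n[ \t]*\n", body.strip(), maxsplit=1)[0]
    lines = first_block.splitlines()
    if not lines:
        return items
    match = ELLIPSIS_RE.match(lines[0])
    if match:
        # The body continues the subject line.
        items[0] = (items[0] + " " + lines[0][match.end():].strip()).strip()
        lines = lines[1:]
    elif not lines[0].startswith("- "):
        return items  # ordinary body text stays out of the changelog
    for line in lines:
        if line.startswith("- "):
            items.append(line[2:].strip())
        else:
            items[-1] += " " + line.strip()
    return items
```

Fed the commit log above, this yields the three items of the generated entry, without the leading dashes.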

Sounds OK. But please allow for entries that have multiple lines, i.e. if the next line after the line with a dash is indented, it should be part of the line with the dash.

* Fri Jan 07 2022 Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> - 250.2-1
- Second stable release after v250: various bugfixes
  (systemd-resolved, systemd-journald, userdbctl, homed).
- The manager should now gracefully handle the case where BPF LSM
  cannot be initialized (#2036145). The BPF filters are enabled again
  on all architectures, so *other* filter should also work on the
  affected architectures.
- kernel-install now checks paths used by grub2 before sd-boot paths again
  (#2036199)
- fstab-generator now ignores root-on-nfs/cifs/iscsi and live (#2037233)

I am not sure how smart line breaking can we reuse. But I would propose using changelog formatted message keeping original commit message format. If it had new lines, just keep that new lines also in actual changelog in rpm package. It would save us making smart enough formatting and allow hand made formatting preserved.

But please allow for entries that have multiple lines, i.e. if the next line after the line with a dash is indented, it should be part of the line with the dash.

Of course, my plan is to include all lines until the next block that's separated by an empty line, only lines beginning with a dash would become a new "entry item".

I am not sure how smart line breaking can we reuse. But I would propose using changelog formatted message keeping original commit message format. If it had new lines, just keep that new lines also in actual changelog in rpm package. It would save us making smart enough formatting and allow hand made formatting preserved.

I have never come across a changelog that needed verbatim line formatting, can you point me to an actual example? I'm wondering what people would use it for…

Because some things have to be quoted in the RPM changelog, rpmautospec will potentially make lines longer and the simplest and most predictable way to do this is to wrap lines to a fixed length. This will also be more or less needed so that "long first items", i.e. those that are continued from the git commit subject in the body, won't look ugly next to the rest of the changelog entry. E.g. I don't want this …:

This is my almost 50 characters long subject line

...but it's not enough because there's so much more I can say here!
- And there's even more, I've worked on so much which I don't want to
split across several commits. (Fixes: #1234567)

… to become this …:

* Wed Jan 19 2022 Nils Philippsen <nils@redhat.com> - 1.2-3
- This is my almost 50 characters long subject line
  but it's not enough because there's so much more I can say here!
- And there's even more, I've worked on so much which I don't want to
  split across several commits. (Fixes: #1234567)

… or even this …:

* Wed Jan 19 2022 Nils Philippsen <nils@redhat.com> - 1.2-3
- This is my almost 50 characters long subject line
  but it's not enough because there's so much more I can say here!
- And there's even more, I've worked on so much which I don't want to
split across several commits. (Fixes: #1234567)

… but rather that:

* Wed Jan 19 2022 Nils Philippsen <nils@redhat.com> - 1.2-3
- This is my almost 50 characters long subject line but it's not enough
  because there's so much more I can say here!
- And there's even more, I've worked on so much which I don't want to split
  across several commits. (Fixes: #1234567)
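
The re-wrapping shown in the last variant can be approximated with Python's standard `textwrap` module. The function name and the width of 76 are assumptions chosen to reproduce the example, not values taken from rpmautospec:

```python
import textwrap


def format_item(text: str, width: int = 76) -> str:
    """Re-wrap one changelog item with a "- " prefix and hanging indent."""
    # Collapse the author's own line breaks before wrapping again.
    flat = " ".join(text.split())
    return textwrap.fill(
        flat, width=width, initial_indent="- ", subsequent_indent="  "
    )
```

Wrapping after joining keeps a continued subject line from ending up much shorter than the lines below it.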

What I had in mind was the indentation made by @jskarvad in the tuned package. I think it should be possible to prepare a similar changelog entry from a commit message, even when it is much longer than is common. Similarly complex formatting is done by DNS-OARC in dnsperf. They don't even use - for 2nd level indentation; maybe a punctuation character with any amount of leading space should be allowed as a marker.

Both use space indentation explicitly and I think it should allow such preformatted commits.

Could be mode of line processing configured by %autochangelog macro?

I don't think the commit message over there is complete. For example, you seem to drop a "-" from the first line of a commit message irrespective of what processing comes next. It is a good change which cleans up some of the mass rebuild commit messages.

Also, how do you add a second item if you don't continue the first item (the one from the first line of the commit message aka commit subject)? From the code it seems you'd have to add an ellipsis just to get into the right parsing mode.

All in all, this seems to turn the "commit subject" (1st line of the commit message) into an item just like any item, which is quite different from the typical usage in git land. Or am I missing a way to omit the commit subject from the changelog items?

