#719 Simplify packaging of forge-hosted projects
Opened 2 years ago by nim. Modified 3 months ago

I'd like FPC to consider simplifying the packaging of forge-hosted projects.

Currently they require cut and pasting a lot of boilerplate from guidelines, which is tedious, error-prone, heavy maintenance when the forges change their interfaces, and makes specfile reading awful, hiding other mistakes.

Some software ecosystems like Go are so forge-oriented they require another level of boilerplate with about the same variables (but for some reason different naming) with the same detrimental side-effects.

Therefore I'd like Fedora default macros to gain a %{_forge_metadata} macro that sets sane defaults and defines all the common variable forge packager uses in one form or another.

Usage examples, on top of spec

Exemple A:
%global forge github
%global project sirupsen
%global repo logrus
Version: 1.0.3
%_forge_metadata

Exemple B:
%global forge github
%global commit 405dbfeb8e38effee6e723317226e93fff912d06
%global project alecthomas
%global repo assert
%_forge_metadata

→ URL, Source, Version, and archivename are auto-computed, with Go we would also auto-compute the package Name

Macro contents

%_forge_metadata() %{expand

%{?forge:
%if "%{forge}" == "github"
%{!?url:URL: https://github.com/%{project}/%{repo}/}
%{?commit: %global shortcommit %(c=%{commit}; echo ${c:0:7})
%{!?archivename: %global archivename %{repo}-%{commit}
%{!?source:Source: %{url}archive/%{commit}.tar.gz#/%{archivename}.tar.gz}
%global scm git}
%{?version:
{!?archivename: %global archivename %{repo}-%{version}}
%{!?source:Source: %{url}archive/v%{version}.tar.gz#/%{archivename}.tar.gz}}
%endif

%{!?version:Version: 0}}
%global sdist %{?scm:.%{scm}%{shortcommit}}%{?dist}

}

Only github definitions here since that's the only forge I care about for Go but the macro can be trivially completed over time. Values are passed by globals not macro switches since most of them are commonly used in other specfile parts (I will need them for Go macros).

It would be nice if Fedora default templates used sdist instead of dist or if dist was redefined itself this way.

It would be nice if %setup was changed %setup -n %{archivename} automatically if archivename is defined and the packager didn't specify a %setup switch (but that's over my rpm-fu)

Benefits for Fedora: simpler specs, less human errors, single place to fix when a force changes its interface, normalisation of common variables, manual guidelines diet

Benefit for me: don't worry about generic stuff when working on Go package specifics

Drawbacks: very nice for new specs, but draining the swamp is going to be awful, making the macros available (in EPEL too please) is a clear win, making it mandatory less so


Does this work with my use of spectool to extract source URLs?

The posted macro file works fine for me in rawhide

It does not work in EL 7 because github urls require the use of #, and escaping # in macros requires a different syntax in EL7 rpm/spectool (IIRC EL7 wants to sprinkle them with % not ). The usual EL PITA.

Nothing that can't be workarounded by shipping different macros in EL7 and rawhide, of by convincing rpm/spectools authors to support the EL7 syntax in rawhide, or the reverse, or all of this. I didn't bother reporting the problem yet, that would work better if FPC asked :).

Apart from this I've used the macro on 19 spec files, through another Go-specific macro that adds more magic, and the results builds successfully for EL7 in EL7 mock, and EL7 + rawhide on rawhide mock.

$ rm ../SOURCES/gz
$ for spec in
spec ; do spectool -C ~/rpmbuild/SOURCES/ -g $spec ;done
Getting https://github.com/alecthomas/assert/archive/405dbfeb8e38effee6e723317226e93fff912d06.tar.gz#/assert-405dbfeb8e38effee6e723317226e93fff912d06.tar.gz to /var/lib/builder/rpmbuild/SOURCES/assert-405dbfeb8e38effee6e723317226e93fff912d06.tar.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 159 0 159 0 0 159 0 --:--:-- 0:00:01 --:--:-- 139
100 71099 100 71099 0 0 35549 0 0:00:02 0:00:02 --:--:-- 296k
Getting https://github.com/alecthomas/colour/archive/60882d9e27213e8552dcff6328914fe4c2b44bc9.tar.gz#/colour-60882d9e27213e8552dcff6328914fe4c2b44bc9.tar.gz to /var/lib/builder/rpmbuild/SOURCES/colour-60882d9e27213e8552dcff6328914fe4c2b44bc9.tar.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 159 0 159 0 0 159 0 --:--:-- --:--:-- --:--:-- 255
100 3483 100 3483 0 0 3483 0 0:00:01 0:00:01 --:--:-- 7757
Getting https://github.com/alecthomas/repr/archive/1f6b5f08eba408b4ffcafb125ccfb99fbc44de7f.tar.gz#/repr-1f6b5f08eba408b4ffcafb125ccfb99fbc44de7f.tar.gz to /var/lib/builder/rpmbuild/SOURCES/repr-1f6b5f08eba408b4ffcafb125ccfb99fbc44de7f.tar.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 157 0 157 0 0 157 0 --:--:-- --:--:-- --:--:-- 267
100 4272 100 4272 0 0 4272 0 0:00:01 0:00:01 --:--:-- 695k

That should read IIRC EL7 wants to sprinkle # with % not backslashes

BTW, skimming the proposed ruby guidelines, they state

%prep
%setup -q -n %{gem_name}-%{version}

That could be trivially automated if Fedora's %setup macro did %setup -n %{archivename} automatically when %{archivename} is defined. In fact any ecosystem with strong release archive conventions could be automated.

(of course one could bikeshed the naming to death, I don't care what's chosen as long as the technical possibility exists)

Ok, I played with the macro long enough (in scores of Go spec files), I rewrote everything that was painful, it is solid enough for common use, I'm now proposing it for inclusion with the corresponding packaging draft

https://fedoraproject.org/w/index.php?title=Automated_forge_handling

And the corresponding fedora-rpm-macros RFE

https://bugzilla.redhat.com/show_bug.cgi?id=1523779

(Note: the macro file is heavily used by the Go packaging templates I'm currently finishing, this is a spinoff of the non-Go-oriented generic parts)

Upon first look, this looks extremely good. Thanks for all the hard work. I added the missing tags (they're under Edit metadata button).

Metadata Update from @rathann:
- Issue tagged with: hasdraft, meeting

a year ago

I think the handling of the %version case needs to be improved. There are many packages on github that do not follow the v%{version}.tar.gz tagging pattern. For me most of the work when creating a spec file related to dealing with various ways of naming the archive, including the top level directory in the archive itself.

Take the example of a standard and non-standard archive naming :

wget -q https://github.com/DATA-DOG/godog/archive/v0.6.2.tar.gz # top-level dir: godog-0.6.2
wget -q https://github.com/briandowns/spinner/archive/1.0.tar.gz # top-level dir: spinner-1.0
wget -q https://github.com/satori/uuid/archive/v1.0.0.tar.gz # top-level dir: go.uuid-1.0.0

Let's include more options to handle these exceptions (if this is not handled by the proposed macros yet).

This initiative seems inspired by https://github.com/gofed/gofed, why don't we use the fields as defined by gofed, like provider, provider_tld, etc?

Hi Marcin,

Your first example works fine with

%global forgeurl https://github.com/DATA-DOG/godog
Version: 0.6.2
%forgemeta

The second one works with

%global forgeurl https://github.com/briandowns/spinner
Version: 1.0
%global tag %{version}
%forgemeta

Case already documented here
https://fedoraproject.org/wiki/Forge-hosted_projects_packaging_automation#Packaging_a_release_accessed_through_a_tag
(git people can't agree on release tag naming, I already complained at git upstream but they don't care and don't want to define the conventions automation would need. The default macro release tag value for GitHub is v%{version} since that seems more popular.)

And the last one works with
%global forgeurl https://github.com/satori/uuid
Version: 1.0.0
%global archivename go-uuid-%{version}
%forgemeta

One needed to care about forgemeta ordering before but I just fixed this. Note that that you don't actually need this override since https://github.com/satori/uuid/ redirects to https://github.com/satori/go.uuid/ and the following works:

%global forgeurl https://github.com/satori/go.uuid
Version: 1.0.0
%forgemeta

So the macro just works in most cases and allows overrides for non-standard cases.

I don't want to write here about gofed, I gave my opinion on the specs it generates elsewhere, let's just say I strongly disagree about many of its choices and the result in Fedora. gofed could be a wonderful tool to analyse Go code and untangle Go dependencies, right now it's a poor spec generator that makes a mess of Fedora Packaging Guidelines.

I will propose my variant of Go packaging once this generic non-Go part is done. It reuses a lot of the fine work of the Go SIG but dumps gofed-style specs in the scrapbucket.

For example satori UUID would become (yes I already had this one done). %gometa builds on %forgemeta

%global goipath github.com/satori/go.uuid
Version: 1.1.0
%gometa

Name: %{goname}

URL: %{gourl}
Source: %{gosource}

%prep
%setup -n %{archivename}

%install
gofiles=$(find . -iname "*.go" \! -iname "*_test.go" -print)
%goinstall $gofiles

%check
%gochecks

%files devel -f devel.file-list
%license LICENSE
%doc *\.md

I included the https://github.com/satori/uuid/ vs https://github.com/satori/go.uuid/ example specifically to show that cases like this one need to be handled by the new macros.

That's why I explained how your cases could be handled even though https://github.com/satori/uuid/ is a non-problem in real life :).

I hope you're reassured.

It does not work in EL 7 because github urls require the use of #, and escaping # in macros requires a different syntax in EL7 rpm/spectool (IIRC EL7 wants to sprinkle them with % not ). The usual EL PITA.

According to our own guidelines, the '#' in github urls is no longer necessary and should not be used because you can say, e.g.:

Source0:  https://github.com/OWNER/%{name}/archive/%{commit}/%{name}-%{shortcommit}.tar.gz

This is from: https://fedoraproject.org/wiki/Packaging:SourceURL#Git_Hosting_Services

Does this allow you to make it work on EL7 too?

We discussed this at this weeks meeting (https://meetbot-raw.fedoraproject.org/fedora-meeting-2/2017-12-13/fpc.2017-12-13-18.00.txt):

  • x719 Simplify packaging of forge-hosted projects (geppetto, 18:26:29)
  • Macros need more thought, cleaned up and how to handle corner cases.
    Maybe look at using parameters. (geppetto, 18:42:02)
  • It might just be trying to be too smart in general though.
    (geppetto, 18:42:39)

According to our own guidelines, the '#' in github urls is no longer necessary and should not be used because you can say, e.g.:
Source0: https://github.com/OWNER/%{name}/archive/%{commit}/%{name}-%{shortcommit}.tar.gz

The guidelines are wrong that URL won't work

https://github.com/OWNER/%{name}/archive/%{commit}/%{name}-%{commit}.tar.gz

will work, however, not that it changes a lot in the general case (the macro is not a "github" macro), and not that github can not decide to change again later. EL7 has a bug somewhere in the spectool chain, that's not fixing the bug that's trying to avoid it

Does this allow you to make it work on EL7 too?

I'll need to retest, that's only one case out of many, that's why %forgemetacheck is included, it works everywhere including el7, and is only sightly more work than using spectool directly

x719 Simplify packaging of forge-hosted projects (geppetto, 18:26:29)
Macros need more thought, cleaned up and how to handle corner cases.

The macros already handle all the corner cases encountered in the real world and invented during review (even contrived synthetic corner cases)

I'm not against them being cleaned up some more, but someone else will have to do it

I've already rewritten them several times and they are way above the required quality level. Most other macros do not even bother about error handling

According to our own guidelines, the '#' in github urls is no longer necessary and should not be used because you can say, e.g.:
Source0: https://github.com/OWNER/%{name}/archive/%{commit}/%{name}-%{shortcommit}.tar.gz

The guidelines are wrong that URL won't work
https://github.com/OWNER/%{name}/archive/%{commit}/%{name}-%{commit}.tar.gz
will work, however,

And I've switched the macro to this (if Fedora used the macro that would fix instantaneously all the specfiles that relied on the obsolete and not working GitHub guideline)

spectool in EL7 still ignores the source so there is another bug lurking in EL7 that needs fixing

That's not a huge problem because the macro will now output any variable it sets. So on EL7:

$ rpmbuild -bs golang-x-debug/golang-x-debug.spec
Setting %{forgeurl } = https://github.com/golang/debug
Setting %{archiveext} = tar.gz
Setting %{forgesetupargs} = -n %{archivename} [ -n debug-%{commit} ]
Setting %{scm} = git
Setting %{shortcommit} = f11d3bc
Setting %{archivename} = debug-%{commit} [ debug-f11d3bcfb62fc8e5d737acc91534fad5e188b8d4 ]
Setting %{archiveurl} = %{forgeurl}/archive/%{commit}/debug-%{commit}.%{archiveext} [ https://github.com/golang/debug/archive/f11d3bcfb62fc8e5d737acc91534fad5e188b8d4/debug-f11d3bcfb62fc8e5d737acc91534fad5e188b8d4.tar.gz ]
Setting %{forgesource} = %{forgeurl}/archive/%{commit}/debug-%{commit}.%{archiveext} [ https://github.com/golang/debug/archive/f11d3bcfb62fc8e5d737acc91534fad5e188b8d4/debug-f11d3bcfb62fc8e5d737acc91534fad5e188b8d4.tar.gz ]
Setting %{dist } = .%{scm}f11d3bc.el7.llt [ .gitf11d3bc.el7.llt ]
Écrit : /home/rpm/rpmbuild/SRPMS/golang-x-debug-0-0.2.gitf11d3bc.el7.llt.src.rpm

And you only need to cut and paste %{forgesource} in wget or curl to get going

And that's all for the consolidations and cleanups I can do my side. Someone else will have to go further if FPC wishes it

For what it's worth, these macros are currently available in rawhide.

Also for what it's worth, now Panu is unhappy that these macros are now in rawhide after a month of the bugzilla ticket being open, so I don't know what's going to happen. Everything I do seems to piss someone off but I prefer slightly imperfect progress over complete stagnation. Still, it's possible I went too far here.

Personally I'm not entirely happy with the macros either, but I felt that we can still work on them even if that means we introduce incompatibilities.

I of course welcome any input.

tibbs, for what it's worth your action is appreciated

I think there is a huge impedance mismatch between traditional rpm view where git and commits do not exist, or only exist in the margin, so it's not a biggie it it takes 10 minutes to move a spec to a new commit or from commit to release, and current developments practices of git-centric projects such as docker

Right now I'm at ~194 packages using commits (and I converted a lot of others from commits to releases when upstream had finally cut a release) and the package baseline is still not complete enough to get consul or docker to build. No one will ever package those properly if it requires passing parameters manually in 200 packages or more. They just decide rpm-centric packaging is idiotic and move to some other tech

Ok, now that tibbs did the technical part and the macros are in redhat-rpm-config, can anyone from FPC take charge of the packaging draft?

We discussed this at this weeks meeting (http://meetbot.fedoraproject.org/fedora-meeting-1/2018-04-12/fpc.2018-04-12-16.00.txt):

Is the "corner case" where upstream has released a version, but a snapshot has to be packaged, handled by the current state of the macros now?

Or, asked differently: Can the macros produce a package with values for Version and Release like these, as mandated for post-release snapshots by Packaging Guidelines?

Version: 2.1
Release: 3.20180426.gitabcdef0%{?dist}

I remember that I tested this once by setting both the Version: tag and the %commit global before calling the %forge* / %gometa magic macro, and it didn't work as expected. Has this changed?

  1. this ticket is only about the forge part, go specifics is another one

  2. as shown in the examples and emphasized in the wiki page the macro recomputes %dist, so it's 100% up to the packager if release is 0.something%{?dist} or notzerosomething%{?dist}, it's not something the macro cares or ever cared about (it's already hard enough to map a version/commit to a downloadable archive without trying to guess its ordering WRT other version/tag/commits, not to mention its ordering WRT other package releases). All the macro does is injecting the correct date and tag/commit hash in dist, everything before is still handled by the packager and not really automatable

Ok, thanks for clarifying.

However, I'm not so sure the "injecting the correct date" thing is accurate.

Building a package on different days injects different dates, despite the source being the same. Even when koji rebuilds the source package it leads to a discrepancy between the newly generated release tag and the old generated changelog entry. (Look at my recently built https://koji.fedoraproject.org/koji/buildinfo?buildID=1078636 for an example.)

I really don't think those macros should "guess" or insert the current date at all. Just have the packager specify the correct commit date.

Additionally, this currently proposed method is completely not reproducible. Building the source package on different days (or even in different timezones) produces different results. That's not good.

The only reproducible way to solve the "commit date" problem would be to have packagers run

git show --pretty=format:%cd --date=format:%Y%m%d --quiet REF

in the source tree for the REF they are packaging, and to explicitly set the %global commitdate to that value, which is then used for generating the Release tag ONCE and not have the value be different depending on the current phase of the moon.

Metadata Update from @decathorpe:
- Issue assigned to decathorpe

a year ago

So, I had some time to think about this, and so I wrote up this updated proposal / suggestion for a change to the current behavior / draft:

Change:

Do not use the current date for generating the Release tag during rpmbuild -bs (or other tools') invocation.

My reasoning:

  • Major issue: This makes builds non-reproducible.

It creates different Release tags depending on when rpmbuild -bs is run, including during the buildSRPMfromSCM task invoked by koji infedpkg build. Depending on the timing of builds, this leads to different results. It also makes rpmlint (and me!) unhappy because the EVR doesn't necessarily match the corresponding changelog entry anymore.

  • Minor issue: It obscures the correct chronological order of snapshots.

It is not possible to determine which snapshot is newer by looking at the snapshot dates. For example, switching back to an older commit increments the front digit in the Release tag, but should obviously also reflect the correct (older) date of the snapshot, and not a more recent (current) date, as it is now.

My suggested solution:

I suggest to use the date of the snapshot (e.g. for git: authored date) converted to UTC (so it's really reproducible) in YYYYMMDD format (or YYYYMMDD.HHmmSS if more granularity is needed) for this purpose, and not to use the current date at the time of running rpmbuild -bs.

This means that the packager has to specify %global date YYYYMMDD manually. I can publish a helper script that I wrote for this exact purpose, which generates this definition for git repositories (and I can add additional support for bzr and hg repos, if that is desired).

A possible compromise:

Allow (encourage) packagers to set %date manually before calling %forgemeta / %gometa, and only fall back to the current date if that variable is not defined. That way, "lazy" packagers can optionally rely on the current behavior, and pedantic packagers (like me) can specify the correct date.

Allow (encourage) packagers to set %date manually before calling %forgemeta / %gometa, and only fall back to the current date if that variable is not defined. That way, "lazy" packagers can optionally rely on the current behavior, and pedantic packagers (like me) can specify the correct date.

I don't like having a "do nothing" option that defaults to an arguably bad thing. Iff we want this option, then I advocate for having it explicit (e.g. a packager must set %date manually and may set it to %nil if they want the automagic date feature explicitly).

However I'd prefer to drop the automatic date thing entirely and have the packager must set the date manually. The lazy packagers as well as the pedantic packagers.

Sure, that's my opinion too.

I just wanted to point out that setting the date could be optional, but yeah, I'd rather have it compulsory.

We discussed this at this weeks meeting (http://meetbot.fedoraproject.org/fedora-meeting-1/2018-05-10/fpc.2018-05-10-16.00.txt):

  • x719 Simplify packaging of forge-hosted projects (geppetto, 16:38:09)
  • ACTION: decathorpe to ping tibbs and nim about changing the macro
    (geppetto, 16:50:49)

I've implemented and tested my proposed change locally.

The diff can be seen here:
https://src.fedoraproject.org/fork/decathorpe/rpms/redhat-rpm-config/c/74b8dd05bbc696e24d469d9c7e5e4f938cf6d028?branch=master

I'm open to suggestions concerning how this can be further improved.

Ok, thanks for clarifying.
However, I'm not so sure the "injecting the correct date" thing is accurate.
Building a package on different days injects different dates, despite the source being the same.

That should not happen unless something in the build infra changes the mtime in the archive version stored in the lookaside cache. As long as the mtime is preserved, the build result will be the same. I've done enough mock mass rebuild to be sure you can do a hundred rpmbuild -bs/mock rebuilds from the same source the generated date is unchanged unless you actually download a new archive or change the mtime of the existing one outside mock/rpmbuild.

I'll remind that the original FPC motivation for introducing date for commit in releases was the concern that scms allow changing what a commit or tag can point at, so commit id or tag by itself does not identify a specific code state, you also need to store the date at which the id existed. At the time scm=svn, git does not permit like svn to change commit id since it use hashes but tags can still be moved over time.

The current macro code implements this logic (fun fact: the behavior you're used to relies on people not applying the original FPC decision correctly as it's too much work to do it reliably manually). However the macro code requires that Fedora infra preserves the project archive mtime once it’s in the lookaside cache. If Fedora infra does not care about it and FPC does not care enough about its original decision enough to fix this, the whole "date in release for commits") should be be killed and not preserved in cargo cult mode.

Redefining the original decision from "date we pulled the commit" to "date the commit was created":
- implies removing it from the macro (it can not be automated sanely, and no other part of the macro need it). If you really want it it needs to be done outside this macro context
- assumes you actually know what specific date to assign to a commit (date it was done by the dev on its computer? Date it was pushed to a server? Date it was pulled from a dev branch to a feature branch? Date the feature branch was merged in a release branch? Date it was mirrored from a private scm to a public instance?) .
- supposes SCMS actually track whatever you think a commit date is accurately. And they don't.
1. We have a few people here working from hyper-v vms, and hyper-v thinks a resumed vm should keep the date it was paused at on resume. So any commit from those vms expose all the ways git and gitlab can't make up their mind between client and server date and mix them up in unholy ways depending on the command you use to ask for "commit date".
2. Other fun test: squash and rebase a few commits and check the date git assigns to the result. It won't be the one you wrote the code even before push/pulling anywhere.
- reintroduces tedious manual work
- reintroduces error-prone manual work (people forget to update the date every time, you can not rely on the result)
- is completely useless functionally, as modern scms like git use branches extensively, their commit history is not linear from a commit date POW

So my opinion on the whole thing is:
1. keep the current code it you can live with "date we pulled from upstream", fix eventual lookaside bugs so mtime is preserved on original upload
2. remove any trace of commit date otherwise
3. if you actually want to track a form of date which is not "date we downloaded from upstream", do it manually outside this macro in guidelines. But first make your mind on what exactly this date should be, check it actually exists in a stable reliable state in upstream forge systems, document accurately your expectations so packagers don't misinterpret them. And ask yourself if the benefit for Fedora is worth all this work.

That should not happen unless something in the build infra changes the mtime in the archive version stored in the lookaside cache. As long as the mtime is preserved, the build result will be the same. I've done enough mock mass rebuild to be sure you can do a hundred rpmbuild -bs/mock rebuilds from the same source the generated date is unchanged unless you actually download a new archive or change the mtime of the existing one outside mock/rpmbuild.

It looks like the mtime of the tarball from the lookisade cache is unreliable. For example, look at the koji build at https://koji.fedoraproject.org/koji/buildinfo?buildID=1078636, where the Release tag for the resulting package is different from the changelog entry.

Additionally, the mtime is different for the local source and the source from the lookaside cache, so any local modifications to packages (calculated Release tag, generated changelog entries, ...) are not in sync with anything that happens in the infrastructure.

I'll remind that the original FPC motivation for introducing date for commit in releases was the concern that scms allow changing what a commit or tag can point at, so commit id or tag by itself does not identify a specific code state, you also need to store the date at which the id existed. At the time scm=svn, git does not permit like svn to change commit id since it use hashes but tags can still be moved over time.

I understand the reasoning for documenting the date at which the tarball was generated (similar to how to citing websites work - "URL; accessed at DATE TIME") - however, this has nothing to do with a package version, and should therefore only be used to document when the tarball was generated (for example, in a comment in the .spec file) - if at all - and should not be part of the package version.

(Side note: if tags can be moved, but commit refs can't, why is it compulsory to include dates for commit shapshots, and not for tags then? :wink: )

The current macro code implements this logic (fun fact: the behavior you're used to relies on people not applying the original FPC decision correctly as it's too much work to do it reliably manually). However the macro code requires that Fedora infra preserves the project archive mtime once it’s in the lookaside cache. If Fedora infra does not care about it and FPC does not care enough about its original decision enough to fix this, the whole "date in release for commits") should be be killed and not preserved in cargo cult mode.

Even if the mtime of the tarball is preserved in the lookaside cache: If I prepare an update locally today, and tomorrow I upload the sources, push to git, and build the changes, the resulting packages will have a different Release tag than what I would expect from my local preparation (e.g. EVR in changelog entries don't match the new Release tag, etc.).

Redefining the original decision from "date we pulled the commit" to "date the commit was created":
- implies removing it from the macro (it can not be automated sanely, and no other part of the macro need it). If you really want it it needs to be done outside this macro context
- assumes you actually know what specific date to assign to a commit (date it was done by the dev on its computer? Date it was pushed to a server? Date it was pulled from a dev branch to a feature branch? Date the feature branch was merged in a release branch? Date it was mirrored from a private scm to a public instance?) .

The only sane way to define a "commit date" for a specific commit would be to read the "authored date" (or "committed date", we can argue about that) for the specific git commit, for example - which is uniquely determinable from the git repository, since it's tied to the unique commit hash (see the code snipped for generating that, which I posted above).

  • supposes SCMS actually track whatever you think a commit date is accurately. And they don't.
    1. We have a few people here working from hyper-v vms, and hyper-v thinks a resumed vm should keep the date it was paused at on resume. So any commit from those vms expose all the ways git and gitlab can't make up their mind between client and server date and mix them up in unholy ways depending on the command you use to ask for "commit date".
    2. Other fun test: squash and rebase a few commits and check the date git assigns to the result. It won't be the one you wrote the code even before push/pulling anywhere.
  • reintroduces tedious manual work
  • reintroduces error-prone manual work (people forget to update the date every time, you can not rely on the result)
  • is completely useless functionally, as modern scms like git use branches extensively, their commit history is not linear from a commit date POW

You're right, handling of date and time is a nightmare on computers. Still, we need to deal with it in a reproducible way, if possible.

So my opinion on the whole thing is:
1. keep the current code it you can live with "date we pulled from upstream", fix eventual lookaside bugs so mtime is preserved on original upload
2. remove any trace of commit date otherwise
3. if you actually want to track a form of date which is not "date we downloaded from upstream", do it manually outside this macro in guidelines. But first make your mind on what exactly this date should be, check it actually exists in a stable reliable state in upstream forge systems, document accurately your expectations so packagers don't misinterpret them. And ask yourself if the benefit for Fedora is worth all this work.

  1. The problem I see here is: "date we pulled from upstream" (mtime of the tarball) is not necessarily the same for the local system and on infrastructure systems, and - depending on lookaside cache behavior - not even the same between different builds in the build infrastructure.

  2. This would be the easiest solution.

  3. I'm not sure it is, but since I don't want to throw all Guidelines we have away right now, I just wanted to start with fixing small things.

However, I think you got us sidetracked - I just want to be able to have reproducible builds. I don't care if that is realised as either

  1. having to specify the date in the .spec file (optionally or not; but I'd prefer if it's recommended to specify it manually), or
  2. throwing the whole date string away for good.

And, as I said, I'm not ready for a debate about the second solution yet, which is why I proposed a simple solution for the first option.

It looks like the mtime of the tarball from the lookisade cache is unreliable. For example, look at the koji build at https://koji.fedoraproject.org/koji/buildinfo?buildID=1078636, where the Release tag for the resulting package is different from the changelog entry.

Actually, all that shows it that it is different. It does not say if the mtime is changed each time it's retrieved from lookaside for a rebuild , or if it was broken by the tooling that uploaded the archive to the lookaside cache (and we have various ways to upload I'd expert importing as srpm to preserve archive mtimes)

Additionally, the mtime is different for the local source and the source from the lookaside cache, so any local modifications to packages (calculated Release tag, generated changelog entries, ...) are not in sync with anything that happens in the infrastructure.

If you're used to downloading your own project sources to work locally, instead of starting
from the version stored Fedora-side, that's 100% expected (and again there are various ways to retrieve sources from Fedora infra, doing a rpm -Uvh of the latest srpm Fedora built should get you the exact archive state and mtime fedora-side).

I understand the reasoning for documenting the date at which the tarball was generated (similar to how to citing websites work - "URL; accessed at DATE TIME") - however, this has nothing to do with a package version,

That's the exact reason why it is in Fedora releases today, dig up FPC irc archives from the time if they're still archived somewhere, you'll see. It's not for ordering releases, it's too far in the release sting to change order and that's intentional.

(Side note: if tags can be moved, but commit refs can't, why is it compulsory to include dates for commit shapshots, and not for tags then? 😉 )

Because writing guidelines is a lot of work and FPC expected everyone to refer to its minutes and not make up their own logic in case of doubt. Hindsight is 100/100, the decision was under-documented, and several layers of guidelines rewrites since didn't help. That's not something current FPC is much better at today BTW.

Even if the mtime of the tarball is preserved in the lookaside cache: If I prepare an update locally today, and tomorrow I upload the sources, push to git, and build the changes, the resulting packages will have a different Release tag than what I would expect from my local preparation

Only if the toolchain does not preserve the mtime when uploading/downloading, or if you find it easier to re spectool new archives than pull them from the lookaside cache. The macro has no reason to invent new results from the same inputs

The only sane way to define a "commit date" for a specific commit would be to read the "authored date" (or "committed date", we can argue about that) for the specific git commit, for example - which is uniquely determinable from the git repository, since it's tied to the unique commit hash (see the code snipped for generating that, which I posted above).

Yes, sure, except that's not cheap to obtain, it forces the packager to work from git checkouts, you need to define it for the other scms too, and the date the scm stored can be garbage anyway.

You're right, handling of date and time is a nightmare on computers. Still, we need to deal with it in a reproducible way, if possible.

Not if possible. If it's worth the pain. It's ok to do things with marginal benefit if they have a marginal cost in packager and guidelines writing time. My solution had zero packager cost so it was ok if it had little benefit (and the little it had was the one expected by the people who decided to do dates in fedora releases originally). The kind of date you propose has not a marginal cost at all, where is the actual benefit for Fedora to do it? Aside from putting dates where you are used to see dates, even though it is not the dates that FPC deemed useful enough to write a guideline about in the past?

The problem I see here is: "date we pulled from upstream" (mtime of the tarball) is not necessarily the same for the local system and on infrastructure systems,

That's the feature FPC wrote the original guideline for: to detect when a packager does not work from the SCM archive the original packager of the commit/tag pushed to the lookaside cache. It seems our tools do not preserve archive mtime as would be necessary to make it work painlessly in a world where packaging commits is a lot more frequent than when the original guideline was written. Not sure it it means we should fix our tools or if it makes the costs to high wrt the original expected benefit.

However, I think you got us sidetracked - I just want to be able to have reproducible builds.

You can have this
1. by fixing the tooling that does not preserve the mtime of an archive that should not be modified once uploaded, or
2. by removing the date requirement, or
3. by punting the work on packagers

Solution one has an infra cost, solution three has a recurrent human packager cost, solution two has the cost of giving up on release decorations with so little benefit FPC didn't even remember why they were set up in the first place. Therefore, I would favor solution 2 (even if I spent a lot of time and energy to make solution 1 work). I don't see the point of investing infra or packager time on something with no clear benefit.

Choosing solution 3 just because without any clear understanding of why you do it is pure cargo culting.

If you're used to downloading your own project sources to work locally, instead of starting
from the version stored Fedora-side, that's 100% expected (and again there are various ways to retrieve sources from Fedora infra, doing a rpm -Uvh of the latest srpm Fedora built should get you the exact archive state and mtime fedora-side).

How can I download something from the lookaside cache that has never been uploaded before (i. e. when preparing an update)? I have to download the tarball by running spectool -g locally first, which I then upload with fedpkg new-sources (which then introduces the possible differences between dates) ... but I can't tell fedora infrastructure to give me a tarball that has never existed before.

Choosing solution 3 just because without any clear understanding of why you do it is pure cargo culting.

Sorry, but telling me that I have no idea what I'm doing doesn't help.

sigh


Okay, I'll make one last attempt to explain what I want:

I'd like to have a way to do reproducible builds, even if it is optional.

For example: the computed %dist could use %date if it is defined in the .spec file, otherwise it can fall back to using the mtime of the sources (as it does now).

Having a date (of whatever event) in the Release string or not is a problem completely orthogonal to this ticket, and to the changes that I have proposed - and should be discussed separately.

I have to download the tarball by running spectool -g locally first

That will give you a sane automated mtime.

If you feel original commit creation date, outweighs the inevitable variations in archive creation, introduced by forge infra changes over time, you can even ask gitlab/github/pagure/stc to generate archives dynamically with the commit timestamp (this could be argued either way and is probably trivial to code at their side).

which I then upload with fedpkg new-sources (which then introduces the possible differences between dates)

fedpkg new-sources should not change the mtime as it’s not supposed to tamper source archives. If it does, that's a bug.
https://pagure.io/fedpkg/issue/220

Now, the only question is wether FPC cares enough about dates in release strings to get the bug fixed. (Cares enough = is able to articulate a concrete benefit for the project that is worth the tool fixing or guidelines writing and manual packager work).

Is there any doubt, than fixing a tooling bug, is cheaper mid term than specifying recurrent manual packager work forever? Not forgetting that manual work can not be more reliable than humans themselves.

I'd like to have a way to do reproducible builds, even if it is optional.

Then remove the whole date thing. That's the cheapest fastest and most reliable way to attain this objective. If you disagree there your objectives are more than just reproducible builds and again, my question remains: is it worth the work?

Removing the date would work for me. It's tedious to update, it's weird that it's not the date of the commit, but the date you fetched the commit first etc.

I'm also in favor of throwing out the date entirely.

I just wanted to propose a simpler solution (allowing packagers to specify it manually) for the short-term. But just removing the date solves the issue, too, of course.

By "remove the whole date thing" you mean modifying the versioning guidelines to not require the specification of a date for snapshots? If so, that would be a significant departure from what we've done in the past and should certainly be discussed separately from this ticket.

By "remove the whole date thing" you mean modifying the versioning guidelines to not require the specification of a date for snapshots? If so, that would be a significant departure from what we've done in the past and should certainly be discussed separately from this ticket.

Exactly my point. This is why I only proposed a possible improvement to the status quo here - and not opened another issue entirely (removing the date).

By "remove the whole date thing" you mean modifying the versioning guidelines to not require the specification of a date for snapshots? If so, that would be a significant departure from what we've done in the past and should certainly be discussed separately from this ticket.

This issue has degenerated into the various meanings of date in releases, their drawbacks and what to do if we wanted to fix each of them. Regardless of the decision it will imply revisiting the date guideline either to scrap it or to rewrite it in an explicit way.

The alternatives are:
1. keep the original FPC decision about archive commit mtime in releases → requires clarifying the current guidelines as they are often misunderstood (+ fixing fpkg)
2. change the original FPC decision to commit creation date → requires a decision change and a guideline clarification (to make sure the correct date is chose manually, requiring to document each scm system as each of them uses different commands)
3. scrap the whole date thing → again requires a guidelines change

There is no way it can progress without touching this guideline, it was pretty much implicit since FPC mandated decathorpe to change date handling and it caused the revisiting of the whole date in release mess

Honestly, I think discussion in this ticket has somewhat deteriorated.

I agree that the "date issue" as a whole should be fixed (or at least clarified). But that is not the point of this ticket, and not the point I wanted to make.

I think I offered a good explanation and reasoning for my (small!) change suggestions, and I also proposed a possible compromise to keep compatibility with the current behaviour. To me it just looks like none of that has been taken seriously, but got lost in a bigger issue, instead.

We discussed this ticket at today's FPC meeting (2018/05/31):

  • The snapshot date issue might get resolved soon. Either globally - by removing the date from release strings altogether, or "locally" - by allowing packagers to optionally specify the date manually if they don't want to rely on automagic macros.

  • Are the linked Drafts / Is the linked Draft still up to date with the current implementation of the macros in redhat-rpm-config? If yes, please just post a link to the current version below. If no, please update the draft and post an updated link, so we can finally vote on this ticket soon.

So I was working on a package which happens to use a post-release github snapshot and figured I'd spend a bit converting it over to use these macros.

In general it works well, though as previously mentioned the auto-date thing is a little weird. But what really gets me is that it's still not completely clear to me from the document (https://fedoraproject.org/wiki/Forge-hosted_projects_packaging_automation) whether I'm supposed to call %forgemeta before or after the Release: line.

  • Before Release:: solaar-0.9.2-6.20180720git59b7285.fc29.src.rpm
  • After Release:: solaar-0.9.2-6.20180720git59b7285.20180720git59b7285.fc29.src.rpm

The spec has just this:

%global forgeurl https://github.com/pwr/Solaar
%global commit 59b7285fdfc875119f0c92cfd5f5909e8a8e578c

Name:           solaar
Version:        0.9.2
Release:        6%{?dist}
Summary:        Device manager for Logitech Unifying Receiver
%forgemeta
URL:            %forgeurl
Source0:        %forgesource

Maybe the document I'm looking is out of date. I'm not really sure what's supposed to happen.

Edit: Also just for the record, rpmlint doesn't know about %forgeautosetup and so tells me that I'm not applying a patch.

But what really gets me is that it's still not completely clear to me from the document (https://fedoraproject.org/wiki/Forge-hosted_projects_packaging_automation) whether I'm supposed to call %forgemeta before or after the Release: line.

Sorry, I missed this. I always write it before Release, since it modifies %{dist}, and I don't really know what would happen if it was called after %{dist} was already used in the spec

I think all the examples in the wiki show it before Release, so I'm not sure what confuses you (it I knew I'd change the wiki page)

Now that it spent some time in EPEL and Fedora with no reported bugs I'll probably start diverging the implementation in rawhide (to take advantage of bits like https://bugzilla.redhat.com/show_bug.cgi?id=1524192)

That will probably make the implementation less quirky

I'm not really sure what's supposed to happen.

Nothing good, you're not supposed to do it :)

I provided an example spec and showed just how broken things are if I call %forgemeta before Release:. Look a few comments back. That should explain my confusion. Currently I use the forge macros for two packages; both work fine if I call %forgemeta after Release: and both break if I call it earlier.

Also, the fix for 1524192 was originally not correctly applied. The fedora-release spec was missing a '%' so %distprefix was expanded when fedora-release was built. I'm pretty sure it's OK in both rawhide and F29 now, though.

I provided an example spec and showed just how broken things are if I call %forgemeta before
Release:. Look a few comments back.

I'm seeing

%global forgeurl https://github.com/pwr/Solaar
%global commit 59b7285fdfc875119f0c92cfd5f5909e8a8e578c

Name:           solaar
Version:        0.9.2
Release:        6%{?dist}
Summary:        Device manager for Logitech Unifying Receiver
%forgemeta
URL:            %forgeurl
Source0:        %forgesource

That produces the broken solaar-0.9.2-6.20180720git59b7285.20180720git59b7285.fc29.src.rpm

That's not really supposed to work (it would be nice but the spec uses %{dist}in 6%{?dist} before the %forgemetamacro call which is supposed to set it, so not really surprising if it makes a mess in rpm evaluation.

You report

%global forgeurl https://github.com/pwr/Solaar
%global commit 59b7285fdfc875119f0c92cfd5f5909e8a8e578c
forgemeta

Name:           solaar
Version:        0.9.2
Release:        6%{?dist}
Summary:        Device manager for Logitech Unifying Receiver

URL:            %forgeurl
Source0:        %forgesource

works (solaar-0.9.2-6.20180720git59b7285.fc29.src.rpm) and indeed that's what documented. I think you had a slight brain overload leading to confusion between
* post-release, as in "the snapshot was released after the version" and
* post-release, as in "write the macro call after the Release line".

Unless I'm the one totally confused here. Nevertheless switching to distprefix now it exists will permit to remove the dist overload hack the macro currently uses, which should make rpm dist evaluation simpler to use and understand.

Nevertheless switching to distprefix now it exists will permit to remove the dist overload hack the macro currently uses, which should make rpm dist evaluation simpler to use and understand.

And here's the associated PR
https://src.fedoraproject.org/rpms/redhat-rpm-config/pull-request/34

We talked about this at todays meeting (https://meetbot-raw.fedoraproject.org/fedora-meeting-1/2018-09-06/fpc.2018-09-06-16.00.txt):

  • #719 Simplify packaging of forge-hosted projects (geppetto, 16:16:09)
  • ACTION: tibbs to deal with PR soon, hopefully today (geppetto,
    16:20:50)
  • Need some policy for this, probably in Packaging:SourceURL
    (geppetto, 16:21:06)

We talked about this at this weeks meeting (https://meetbot-raw.fedoraproject.org/fedora-meeting-1/2018-10-25/fpc.2018-10-25-16.00.txt):

  • #719 Simplify packaging of forge-hosted projects (geppetto, 16:08:40)
  • ACTION: ignatenkobrain will look at the draft, so tibbs can spend
    some time on ~ ticket ;) (geppetto, 16:17:34)

And it’s here:
https://pagure.io/packaging-committee/pull-request/826
https://src.fedoraproject.org/rpms/redhat-rpm-config/pull-request/47

PR47 is here so templates can be written and updated at the same time the macro code changes. Otherwise we get in the current situation, where the doc is written months after the code itself.

Metadata Update from @decathorpe:
- Issue assigned to ignatenkobrain (was: decathorpe)

5 months ago

We talked about this issue at this weeks meeting ... https://meetbot-raw.fedoraproject.org/fedora-meeting-1/2018-12-13/fpc.2018-12-13-17.04.txt

  • #719 Simplify packaging of forge-hosted projects (geppetto, 17:12:21)
  • LINK: https://pagure.io/packaging-committee/pull-request/826
    (geppetto, 17:12:58)
  • ACTION: decathorpe will write "executive summary" in textual form
    (geppetto, 17:20:04)
  • ACTION: fix trivial things and merge (+1:6, 0:0, -1:0) (geppetto,
    17:28:29)

Metadata Update from @james:
- Issue assigned to decathorpe (was: ignatenkobrain)

5 months ago

Metadata Update from @james:
- Issue untagged with: meeting
- Issue tagged with: writeup

3 months ago

Login to comment on this ticket.

Metadata