#1132 Mark comments as scriplets for Sources to be used in automation
Opened 2 years ago by pvalena. Modified 2 years ago

Followup for https://pagure.io/packaging-committee/pull-request/1046

Please consider adding some marking for comments above Source, maybe with some "tag" / "prefix", so that they're known to be run by automation.

# Script: git clone ... && cd ... && git archive ....
Source1: ....

This would help with automated custom source creation, which otherwise needs to be run manually (& checked).


TBH if there's some "special sauce" commands necessary to generate sources (that involves more than one simple command), I add and commit them as a separate script into the dist-git repo as something like ./gen_clean_tarball.sh. Would that look like this with your proposal?

# Script: ./gen_clean_tarball.sh
Source1: %{name}-%{version}-clean.tar.gz

TBH if there's some "special sauce" commands necessary to generate sources (that involves more than one simple command), I add and commit them as a separate script into the dist-git repo as something like ./gen_clean_tarball.sh. Would that look like this with your proposal?

```

Script: ./gen_clean_tarball.sh

Source1: %{name}-%{version}-clean.tar.gz
```

I agree that would be a possibility, but simple git checkout and git archive do not seem like they need to be in a separate script to me, which adds another file (overhead) to maintain. For readability purposes, we split it into two lines. Ex.:

https://src.fedoraproject.org/rpms/rubygem-activesupport/blob/rawhide/f/rubygem-activesupport.spec#_13
https://src.fedoraproject.org/rpms/rubygem-ffi/blob/rawhide/f/rubygem-ffi.spec#_10
https://src.fedoraproject.org/rpms/rubygem-puma/blob/rawhide/f/rubygem-puma.spec#_16

I don't fully understand what's being requested here because I am having trouble parsing the first sentence in the first message. However, I think I would propose the following rule:

If any Source: tag includes a file that is not a URL, the packager SHOULD include a comment above that Source: tag indicating where the file comes from and if applicable, instructions for generating that file.

This would probably go in a new section just before the "Patch Status" section, and would also link to the Source URL document which could potentially be expanded to include more requirements on clean tarball generation if we wanted to formalize that. I left this as a SHOULD because there are a bunch of cases where you include a script or something which is kind of obviously part of the packaging and don't want to try to distinguish between all of the cases where you might stick something that's not a URL in a Source: tag.

Metadata Update from @tibbs:
- Issue tagged with: meeting

2 years ago

I don't fully understand what's being requested here because I am having trouble parsing the first sentence in the first message. However, I think I would propose the following rule:

Sorry. I'll try to clear it up-
the thought was about having multiple short commands (e.g. two git calls). What I think is:

  • In such a case the "script file" would add unnecessary complexity, adding to package maintenance cost.
  • Preferred way would be to include the commands in the spec file like we do now (in examples I've sent).
  • This should be decided case by case by the maintainer.

If any Source: tag includes a file that is not a URL, the packager SHOULD include a comment above that Source: tag indicating where the file comes from and if applicable, instructions for generating that file.

That sounds great! But would there be a distinction for running the commands in an automated way? My main goal is to clarify this.

This would probably go in a new section just before the "Patch Status" section, and would also link to the Source URL document which could potentially be expanded to include more requirements on clean tarball generation if we wanted to formalize that. I left this as a SHOULD because there are a bunch of cases where you include a script or something which is kind of obviously part of the packaging and don't want to try to distinguish between all of the cases where you might stick something that's not a URL in a Source: tag.

I agree.

So we talked about this a little, and it seems to me at least that the problem is actually pretty complicated.

We could recommend that a shell script be used to generate shippable tarballs, and that the script have a particular name. In theory that helps with automation, but what does it actually get you? The script can't generate the tarball without being told something, so we have to recommend the arguments it takes. And any automation would need to know how to pass them.

And what would be responsible for fetching the unshippable tarball? We can't use Source: tags, obviously, unless we also hard mandate corresponding NoSource: tags, which is esoteric and easy to mess up. It could be in a specfile comment, but using comments for automation is a pretty bad idea. Getting RPM to support a new tag would be possible, but would take time and wouldn't propagate back to older versions of RPM.

Ideally, I think that a single script with a defined name and a defined argument (the package version) which is responsible for downloading the upstream unshippable tarball and generating the safe tarball would be the way to go, except that's probably just way too much to ask people to do and in the end I'm not sure if it helps automation anyway.

One interesting idea that came up is whether somehow fedpkg could be leveraged or taught to do any of this. I'm sure if would need some kind of metadata (just how to find the upstream source and what files to delete from the archive), but teaching it to take care of things would mean that packagers wouldn't have to cook everything up on their own. The guidelines could just recommend the use of the fedpkg features and document what files go where.

So we talked about this a little, and it seems to me at least that the problem is actually pretty complicated.

Agreed. Thank you for taking time for this!

We could recommend that a shell script be used to generate shippable tarballs, and that the script have a particular name. In theory that helps with automation, but what does it actually get you? The script can't generate the tarball without being told something, so we have to recommend the arguments it takes. And any automation would need to know how to pass them.

Well, it would be nice to have a macro, f.e., instead of a script here, so we'd avoid another source (which is not supposed to be shipped). But this would require network access for the macro, which is why this is currently handled in the comments. Having this is script (generic) is just wrapping the one-liner functionality (which is not-transparent), but it has to be executed by the automation with some assumptions as well (this does not scale).

And what would be responsible for fetching the unshippable tarball? We can't use Source: tags, obviously, unless we also hard mandate corresponding NoSource: tags, which is esoteric and easy to mess up. It could be in a specfile comment, but using comments for automation is a pretty bad idea. Getting RPM to support a new tag would be possible, but would take time and wouldn't propagate back to older versions of RPM.

Comments is what we currently use in the automation (which is undesired; that's why I started this conversation). Maybe the correct way would be to do this all in the CI instead, and generate the required source on the fly. I've not seen Nosource tag yet, I'll check that. Having an RPM support does not bring any benefit here, as it would be a NOOP command anyway (in case of the build).

Ideally, I think that a single script with a defined name and a defined argument (the package version) which is responsible for downloading the upstream unshippable tarball and generating the safe tarball would be the way to go, except that's probably just way too much to ask people to do and in the end I'm not sure if it helps automation anyway.

Yes, that's what I've encountered in some packages, but It adds overhead (additional file), and is non-transparent. Also, I aim for very specific use. Currently I'm checking the commands that get run, which ideally are just two git commands, like here. Having a custom-script executed goes against this.

One interesting idea that came up is whether somehow fedpkg could be leveraged or taught to do any of this. I'm sure if would need some kind of metadata (just how to find the upstream source and what files to delete from the archive), but teaching it to take care of things would mean that packagers wouldn't have to cook everything up on their own. The guidelines could just recommend the use of the fedpkg features and document what files go where.

Sounds great! Maybe we could use I script I already have as a basis for this. What I'm actually generating is not a Source0, IOW not supposed to be shipped. What I'm generating is additional one (Source1), with tests (for %check), which are not shipped with upstream tarball.

Thanks for the discussion!

What I've realized is that any (mine, fedpkg, or some other packagers') implementation needs some metadata / config / script arguments from somewhere. These could be in theory (hackishly) fully auto-detected, but when user needs / wants to modify or fix the behaviour, it still needs to be configured / defined somewhere.

Which leads me to:

  • RPM / Spec file support, which may vary with the actual implementation[*], and it either:

    • does nothing (the configuration is just read and used by e.g.fedpkg)
    • or everything! And it's expected to run in Koji. The produced custom Source only needs to be available during %check and is not used for anything else.
  • External config file

    • More generic solution, open to more implementations. Could even generate regular Source0 files if needed.
    • Additional file for maintenance, with very little benefit. Also it would be very handy to use %{version}, %{name} or %{URL} etc. in this file, which is IMO hackish when used with automation.
  • No RPM support, but still using .spec file

    • Like %global CheckSource etc.. It could be just documented / recommended approach, and I could adjust my tooling to use that (or we could even add support in fedpkg).
    • Doesn't require any 3rd party modification or support to work; doesn't use comments or additional config file.

Which of the options do you think is best? I should probably also start a discussion on fedora-devel to get more opinions.

[*] Let's say up to some definitions like CheckSource:, CheckSourceCommit:, CheckSourceFiles: etc.)

If you want to go with option 2, that could probably go into spectool? You need to run it anyway for getting the "primary" source, and having it read some config file to fetch a secondary source (or a "cleaned up" primary source) would fit right in.

If you think that's a good way to solve this problem, please open an RFE with pagure.io/rpmdevtools. I am the maintainer of the new / rewritten-in-python spectool and would be open to implementing such a feature.

If you want to go with option 2, that could probably go into spectool? You need to run it anyway for getting the "primary" source, and having it read some config file to fetch a secondary source (or a "cleaned up" primary source) would fit right in.

@decathorpe thanks!

Actually, we don't need spectool for primary source (we use gem fetch XYZ), but you're right, it might be a good place for the implementation. I'd however prefer any solution that does not need additional files.

If you think that's a good way to solve this problem, please open an RFE with pagure.io/rpmdevtools. I am the maintainer of the new / rewritten-in-python spectool and would be open to implementing such a feature.

I'm not sure, TBH, what's the best way to solve this (hence this issue). Currently, we use implementation (or it can be executed manually) as follows:

  • Parse this part of the spec file:
# Tests are not packaged with the gem. You may get them like so:
# git clone --no-checkout https://github.com/discourse/mini_mime
# git -C mini_mime archive -v -o mini_mime-1.1.0-test.txz v1.1.0 test
Source1: %{gem_name}-%{version}-test.txz
  • Ignore generic comments, and the Source itself (I know!), and strip to the git/cd commands (also changed the version 'comment' to current one... sed)
git clone --no-checkout https://github.com/discourse/mini_mime
git -C mini_mime archive -v -o mini_mime-1.1.0-test.txz v1.1.0 test
  • Execute. This creates an archive, which is then copied back to the folder with the spec.

As a whole, including the testing, it looks like this: https://src.fedoraproject.org/rpms/rubygem-mini_mime/pull-request/2 (this was completely done by automation)

I wonder. Wouldn't be a well-defined, configurable set of actions (git clone, git archive, etc.) be "safer" (and certainly easier) than parsing custom, user-supplied scripts?

@decathorpe Yes, that's what I aimed at, with the first and 3rd options in my earlier comment, but is that valid use .spec file?

[*] definitions like CheckSource:, CheckSourceCommit:, CheckSourceFiles: etc.)

Also, after some discussion with @vondruch, the RPM tags (option 1) are not backward-compatible, which would bring many issues downstream...

What about simply using %global or %define for that (option 3)?

I still think additional config (option 2) adds too much overhead for this.

I think it's probably still still less overhead then waiting for an (not backwards-compatible) upstream RPM change, and less overhead than parsing the whole .spec file for %globals :)

As I said, I'm open to adding support for "generating modified tarballs" to spectool, but only if you actually want that. (it would also mean that you could use spectool for getting both the .gem and for generating the test tarball, no need for using two separate tools at that point).

I think it's probably still still less overhead then waiting for an (not backwards-compatible) upstream RPM change, and less overhead than parsing the whole .spec file for %globals :)

Well, it would mean changing the logic & creating config for some ~84 rubygem packages. Also the config needs to be maintained ...

Parsing spec file is what I already have, and what other maintainers are already used to, so I'll ask on devel list, before proceeding, to get more opinions.

As I said, I'm open to adding support for "generating modified tarballs" to spectool, but only if you actually want that. (it would also mean that you could use spectool for getting both the .gem and for generating the test tarball, no need for using two separate tools at that point).

Thanks!

Given there is this PR, how about:

$ git stash 
Saved working directory and index state WIP on rawhide: 02f7c34 Update to mini_mime 1.1.2.

$ ll
total 8
-rw-rw-r--. 1 vondruch vondruch 3268 Dec 14 15:11 rubygem-mini_mime.spec
-rw-rw-r--. 1 vondruch vondruch  325 Dec 14 14:38 sources

$ fedpkg srpm 
Downloading mini_mime-1.1.2.gem
######################################################################## 100.0%
Remove downloaded invalid file /home/vondruch/fedora-scm/others/rubygem-mini_mime/mini_mime-1.1.2.gem
Could not execute srpm: Server returned status code 404

$ git stash pop
On branch rawhide

... snip ...

Dropped refs/stash@{0} (64e2a8d9bf91098b425078cbf0d5c354a92e348e)

$ git diff
diff --git a/rubygem-mini_mime.spec b/rubygem-mini_mime.spec
index a77cbc6..40bffdb 100644
--- a/rubygem-mini_mime.spec
+++ b/rubygem-mini_mime.spec
@@ -8,10 +8,21 @@ Summary: A lightweight mime type lookup toy
 License: MIT
 URL: https://github.com/discourse/mini_mime
 Source0: https://rubygems.org/gems/%{gem_name}-%{version}.gem
+%{echo:%(
+[ ! -e %{S:0} ] && \
+  gem fetch %{gem_name} -v %{version}
+)}
 # Tests are not packaged with the gem. You may get them like so:
 # git clone --no-checkout https://github.com/discourse/mini_mime
 # git -C mini_mime archive -v -o mini_mime-1.1.2-test.txz v1.1.2 test
 Source1: %{gem_name}-%{version}-test.txz
+%{echo:%(
+[ ! -e %{S:1} ] &&
+  rm -rf %{name} &&
+  git clone https://github.com/discourse/mini_mime %{name} &&
+  git -C %{name} archive -v -o %{S:1} v%{version} test/
+)}
+

$ fedpkg srpm 
Not downloading already downloaded mini_mime-1.1.2.gem
Not downloading already downloaded mini_mime-1.1.2-test.txz





setting SOURCE_DATE_EPOCH=1638489600
Wrote: /home/vondruch/fedora-scm/others/rubygem-mini_mime/rubygem-mini_mime-1.1.2-1.fc36.src.rpm

$ ll
total 108
-rw-rw-r--. 1 vondruch vondruch 20480 Dec 14 15:16 mini_mime-1.1.2-test.txz
-rw-rw-r--. 1 vondruch vondruch 34816 Dec 14 15:16 mini_mime-1.1.2.gem
drwxrwxr-x. 1 vondruch vondruch   254 Dec 14 15:16 rubygem-mini_mime
-rw-rw-r--. 1 vondruch vondruch 42251 Dec 14 15:16 rubygem-mini_mime-1.1.2-1.fc36.src.rpm
-rw-rw-r--. 1 vondruch vondruch  3268 Dec 14 15:15 rubygem-mini_mime.spec
-rw-rw-r--. 1 vondruch vondruch   325 Dec 14 14:38 sources

Can this issue be closed, or was there something else that you wanted the FPC to discuss/decide? I'm moving the issue to needinfo because nothing has really happened in months, but it might be better to just close and reference it in a new issue if you have follow on questions/requests.

Metadata Update from @james:
- Issue untagged with: meeting
- Issue tagged with: needinfo

2 years ago

I'm sorry, but I don't understand why this should be closed. I was expecting that FPC will evaluate my proposal and provide some feedback. However, I am fine to open new ticket, because it is probably quite different from the original ticket.

What there remains for me is what would the official guideline be (it's not about the tooling), even though it's not standardized via RPM (and there's no reason for it to be).

Best practises or some other documentation would be great.

Login to comment on this ticket.

Metadata