#2532 F34 Change: Enable spec file preprocessing
Closed: Rejected 3 years ago by zbyszek. Opened 3 years ago by bcotton.

This change should enable an opt-in spec file preprocessor in Fedora infrastructure for the benefit of packagers. The preprocessor allows some very neat tricks that were impossible before, for example generate changelog and release automatically from git metadata or pack the entire dist-git repository into an rpm-source tarball (effectively allowing unpacked repos to live in DistGit).


The number of packager complexities that this causes, plus the issues for downstreams and sidestreams, makes me really wary of this change. -1

I recognise that a lot of work seems to have been done for this proposal, but after reading the documentation what the templates can do I just do not see enough benefits for packagers that would outweigh the downsides when using this for "official Fedora packages" (for mass changes, provenpackagers, and Fedora downstreams). Introducing a templating functionality on top of the already under-specified RPM .spec language certainly won't cause any issues either ... :/ -1

It looks like this functionality is already available for COPR however, which is where I see the biggest potential use case for this templating functionality - since the packages don't need to be "uniform" or even conform to packaging guidelines.

Based on the feedback this proposal received, I'm -1.

@ngompa I wonder how many examples you could name where this would cause problems - it requires using Fedora DistGit sources verbatim and then using bare rpmbuild to build srpms. Where is this done? Can you name a few places so that I could take a look?

Note that the other solutions mentioned will suffer from similar problems in this respect, and probably even worse ones if the solution requires extensive infrastructure changes. When e.g. a spec with the %autorel macro is ported, it will just stop working, because it requires a particular integration of build system and dist-git to keep working. Did people complain about it before? No.

Generally, I consider these -1s really quite ridiculous, with all due respect. It is a preference for regulation over innovation. I am glad that at least @decathorpe went through the specs, although I disagree that this doesn't have enough potential to outweigh the "negatives".

Some guys didn't even state a reason for -1 @ignatenkobrain huh? I consider this disrespect to my quite hard work on this.

Generally, this makes me very disappointed because I don't feel your reasons for rejections are justified. I will very much reconsider my future contributions to Fedora after this.

I feel that my reasoning for this has been quite justified in the devel thread. If you have a big idea like this, I recommend trying to figure out if (Fedora) packagers actually want it before spending so much energy on it: gathering feedback early is a good way to avoid disappointment over wasted efforts.

However, since you asked for reasoning:

  • this makes regular work more complicated and fragile for a large group of our (proven)packagers
  • this makes regular work more complicated and fragile for an unknown group of our downstreams
  • this makes "magic" easier for a very small group of interested packagers (so far I know about you)

As @decathorpe said, this makes perfect sense in automated and unofficial packaging¹ (as any other spec file generator / templating / pre-processor). However, Fedora is currently very tightly coupled to having the spec file in distgit. If we want to get rid of this idea, why settle on a "powerful but limited" spec file templating language, instead of allowing fully generated spec files altogether, by arbitrary scripts? I am not saying I'd want to do that, certainly not yet, but I am trying to point out that your proposal brings too much friction for very little benefit. Not only do I personally think that this is not a good way to make progress in packaging², but I've also been following the feedback others gave to the proposal: There is simply not enough buy-in. It has unfortunately been shown in the past that even brilliant ideas will fail when there is no community buy-in, because only a very limited number of people will care to fix/improve/work on this and others will only suffer from (unwanted) negative side effects, spending their time trying to work around things.

I appreciate that you spent time to try to make packaging easier. However, I sincerely believe this change proposal is not worth it. I am sad that it makes you want to reconsider your future contributions to Fedora, however that alone should not be the reason to "give this a chance".

¹ Although I dislike that the SCM source option in Copr uses this without a way to opt out, which makes it hard to build via SCM from distgit compatible sources out of the packager control (the new distgit source option finally remedies this, but is not as powerful as the SCM option). However, this is out of scope here.

² IMHO instead of developing layers on top of spec/RPM, we should gradually improve our packaging macros and capability of spec/RPM processing/parsing. This is more or less what I said in my FESCo interview when I was elected this term, so I guess at least some people agree with this sentiment.

Metadata Update from @churchyard:
- Issue tagged with: meeting

3 years ago

Hey, first of all, thank you for the sincere response.

I feel that my reasoning for this has been quite justified in the devel thread. If you have a big idea like this, I recommend trying to figure out if (Fedora) packagers actually want it before spending so much energy on it: gathering feedback early is a good way to avoid disappointment over wasted efforts.

At various points, I received positive feedback about this (e.g. at Flock or DevConf). I guess a part of the problem is that I took a long time to develop it. There were problems that I wasn't sure I was approaching correctly (e.g. input params for the git_release macro or the role of %{?dist}) and I didn't have anybody to ask about it. It was important for me to be sure that the whole solution is valid and can be used as is without any future changes (except additions). Something like that takes time, especially if one doesn't have anybody to consult.

However, since you asked for reasoning:

  • this makes regular work more complicated and fragile for a large group of our (proven)packagers

I don't know about fragile...Maybe slightly more complicated in some cases...but you can still grep/sed the templates together with all the other spec files. You just need to use *.spec* glob instead of *.spec. Is it that difficult? I wasn't doing any mass spec changes before so maybe you see some other problem...I don't see it. As for release bumping, there would be working rpmdev-bumpspec.
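A minimal sketch of the glob difference, in Python rather than shell. The file names are made up for illustration; only the two patterns come from the comment above.

```python
from fnmatch import fnmatch

# Hypothetical file names, for illustration only.
files = ["foo.spec", "bar.spec.rpkg", "README.md", "baz.spec.in"]

# "*.spec" matches only plain spec files.
plain = [f for f in files if fnmatch(f, "*.spec")]

# "*.spec*" also matches templates whose name merely contains ".spec".
wide = [f for f in files if fnmatch(f, "*.spec*")]

print(plain)
print(wide)
```

The wider pattern picks up the templates, but it says nothing about whether a text edit applied to a template has the intended effect on the generated spec.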

  • this makes regular work more complicated and fragile for an unknown group of our downstreams

I think this problem is overstated. I have yet to see examples where people outside of Fedora import Fedora spec files and use a different tool than mock to build them. I saw the example with the rust specs ported into openSUSE, but there the person doing the port is actually involved in Fedora and in rust packaging, so he can make adjustments as needed.

We use Fedora (actually CentOS) spec files at company where I work very occasionally when some change is needed. I wouldn't consider building them with rpmbuild especially because I don't want to bother with rpmbuild flags to build srpm or rpm from the dist-git repo (and also I wouldn't want to bother with manually installing build requires when building an actual rpm).

I would really like to see some good examples of this as a problem because I don't think this is as big issue as people made it seem to be.

Additionally, with rpmautospec, people wanting to use such a spec will be getting incorrect release (probably 0.x) and empty changelog unless they install some additional package with macros that depends on git (this is a quick guess of what will happen, further look into this would be useful). And of course, the %autorel macro will no longer be able to correctly bump release in this new environment upon new changes because there is the buildsystem dependency. This is not a problem in my proposal if you use mock/fedpkg/rpkg - the release will continue to be bumped with a new commit (or tag).

Nim's approach wouldn't probably have this issue at all but it has other problems. It considers srpms to be the primary source for changelog and release. rpmautospec uses buildsystem and git, in my approach, I use only git.

  • this makes "magic" easier for a very small group of interested packagers (so far I know about you)

Well, it's hard for people to express their interest if some well-known contributors come and criticize it immediately. That is to be expected, however. Another, and probably more important, factor is that there are some competing solutions, and if there is a solution coming from an RH employee or a well-known person, people will automatically favor that. I think there is a big social factor playing a role here. I also think it is hard to get correct data.
The number of "I would never use it" people vs "I might use it if the change is accepted" vs "I would like to use it" people would be interesting to know and is unknown at the moment.

Yes, I would use it for my packages to verify that it is working at all times but this is not really why I am doing this. And I actually don't want to push it if there are lots of packagers rejecting it completely. I hope this is not the case though.

As @decathorpe said, this makes perfect sense in automated and unofficial packaging¹ (as any other spec file generator / templating / pre-processor). However, Fedora is currently very tightly coupled to having the spec file in distgit...

I also want to have the spec file in dist-git, just the extended (more expressive) version of it.

If we want to get rid of this idea, why settle on a "powerful but limited" spec file templating language, instead of allowing fully generated spec files altogether, by arbitrary scripts?

Well, because you can't imho fully automate packaging from start to finish without giving a packager a place where he/she can tweak/adjust things. And if there is a need for such place, then a file is a suitable solution. So actually, why not just use spec files and only add what is missing for the dynamic generation of them? Templating seems to be a good choice.

You say "powerful but limited". I don't actually see the "limited" part. You can have a spec file template containing a single line saying "generate me rpm spec from this wheel". And that's it. People will actually have a chance to discover how a particular package is generated by exploring dist-git repos as they are used to. And packagers will have a single file that they can adjust to make things work as they like. Especially valuable if multiple people maintain the same package. They still have the shared tracked file to edit (not some toggles and switches in build system or other web service).

I am not saying I'd want to do that, certainly not yet, but I am trying to point out that your proposal brings too much friction for very little benefit.

Imho there is a huge potential for benefit. The automatic changelog and release is one part.
Then there is this thread: is dist-git a good place for work.

I took input from this thread and made the macros so that you can have a namespace in dist-git where unpackaged sources are stored. Then this repo can be linked into standard rpms/ repo (where the spec file is placed) e.g. through a submodule. This is useful for projects which often need patching. This was stated to be useful previously for kernel, for libvirt / qemu by Daniel P. Berrangé or even by you for the cpython in the thread linked above.

I don't think it's a small thing to allow people to do their work at place where it is supposed to be done.

Or having an option to do this, i.e. having a single spec file (template) in Fedora Dist-Git to do commit-to-commit rebuilds from upstream in COPR while using that same spec file for official builds in Koji with already packaged sources. Context-dependent git_cwd macros were designed for this. You can basically provide packagers with a modern and supported way of dealing with their packages while not breaking Fedora's DistGit canonicity or making them devise each their own bash script which needs to sed spec file by regular expressions.

You could think about using the preprocessing to also solve problems in Dockerfiles that need to contain image label according to dist-git branch, please see this thread

I don't want to make you tired but I think you can see that this has a potential for wide use. The only thing it first needs is to get support from main Fedora contributors like you.

(...one more point I have just remembered) If this was accepted, the changelog generation could be finally done in a way that rpm changelog always contains a valuable and correct info. Packager could e.g. say that he/she wants to create changelog from upstream information as well as downstream information. Downstream changes would be prefixed by downstream:, upstream changes would be prefixed by upstream: - all that automatically. This would be configurable so people could choose what they want (generating just from dist-git commits would be default but note that even in this case, you can edit the list of commits in editor before it becomes part of changelog, you can edit this information later too).
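As a rough illustration of the prefixing idea, here is a small Python sketch. It is not how rpkg or any existing tool generates changelogs; the `Commit` class, the `changelog_lines` helper and the sample commit subjects are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    origin: str   # "upstream" or "downstream" (hypothetical labels)
    subject: str


def changelog_lines(commits, include=("downstream",)):
    """Render one prefixed changelog line per commit from the selected origins."""
    return [f"- {c.origin}: {c.subject}" for c in commits if c.origin in include]


# Made-up commits, for illustration only.
commits = [
    Commit("upstream", "Fix crash on empty input"),
    Commit("downstream", "Add missing BuildRequires: gcc"),
]

# Default: only dist-git (downstream) commits end up in the changelog.
print("\n".join(changelog_lines(commits)))

# Opt-in: upstream and downstream entries, each with its prefix.
print("\n".join(changelog_lines(commits, include=("upstream", "downstream"))))
```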

Not only do I personally think that this is not a good way to make progress in packaging²
² IMHO instead of developing layers on top of spec/RPM, we should gradually improve our packaging macros and capability of spec/RPM processing/parsing. This is more or less what I said in my FESCo interview when I was elected this term, so I guess at least some people agree with this sentiment.

The thing is that certain problems are currently out of reach for rpm. If you e.g. tried to have dynamic changelog and release with rpm today, you would lose ability for srpm rebuilding (because they might and probably will be rebuilt in contexts where the source information for dynamic changelog and release is no longer available). The same goes for working with unpackaged sources. You can pretend that rpm supports it by providing some macros which are in fact interpreted by build system so you basically introduce an rpm macro that rpm itself can't correctly process and it needs the whole build infrastructure around it so that the macro can be computed. This is like a fake rpm support.

Actually, the preprocessing I am suggesting could be integrated directly into rpm (https://github.com/rpm-software-management/rpm/issues/1472) but I think my approach is more gradual and can eventually lead to it. Now might not be the correct time to do it. If rpm devs say it should be done, then they are probably right. But right now I would say they will have either no opinion or a negative opinion. Maybe I am wrong; it would be interesting to know their view.

Yet another option is to get rid of srpm format and build rpms directly from dist-git repos. Then you could simply use %() expansion and be happy. But in that case, rpm would lose its source carrier format that many people are used to.

but I've also been following the feedback others gave to the proposal: There is simply not enough buy-in. It has unfortunately been shown in the past that even brilliant ideas will fail when there is no community buy-in, because only a very limited number of people will care to fix/improve/work on this and others will only suffer from (unwanted) negative side effects, spending their time trying to work around things.

Yes, I agree if accepting this would make most of the current people unhappy to the extent that they don't want to have anything in common with it, then it probably isn't worth it as I can't do the support on my own.

I appreciate that you spent time to try to make packaging easier. However, I sincerely believe this change proposal is not worth it. I am sad that it makes you want to reconsider your future contributions to Fedora, however that alone should not be the reason to "give this a chance".

Sure.

¹ Although I dislike that the SCM source option in Copr uses this without a way to opt out, which makes it hard to build via SCM from distgit compatible sources out of the packager control (the new distgit source option finally remedies this, but is not as powerful as the SCM option). However, this is out of scope here.

Well, I think it wouldn't be that hard to add support for overriding it, but I also think that having preprocessing enabled didn't cause that many issues - there are two packages out of many many thousands (32681) that have an issue with preprocessing enabled. Anyway, I agree this is a bit out of scope, although related.

Sorry for the long response. I feel like I should try to explain what I tried to do and why so that people can evaluate it properly. I don't want this change to be rejected just because people didn't pay too much attention to it and they take defensive stance by default.

-1

Too many layers. I'm still catching up on this thread, but what has been said so far I agree with. rpmbuild already does preprocessing before build that depends on the environment it's running in. To repeat @churchyard, we should work to improve our current macro definitions, expansion, and documentation to make packaging easier for existing and new maintainers. I do not feel adding another layer of preprocessing that happens in a different environment is the way to do this.

Too many layers. I'm still catching up on this thread, but what has been said so far I agree with. rpmbuild already does preprocessing before build that depends on the environment it's running in.

What kind of preprocessing...What's the syntax for it? Because I am not aware of it.

if you mean %() - it doesn't change spec file in-place.

Sorry, maybe you know it but I would like to make sure we are all on the same page.

If you use %() in spec file to derive package information from git metadata and build that spec file into srpm - that srpm will not be buildable into rpm (or srpm again).

Try to put into some spec file in dist-git:

Name:  %(basename `git rev-parse --show-toplevel`)

Then execute:

$ fedpkg srpm

Then change the current working directory e.g. to /tmp and execute rpmbuild --rebuild <path to the srpm>. You will get: fatal: not a git repository (or any parent up to mount point /).

So you can't use %() for that purpose (nor can you use %{}, as it has the same problem), because then there would be invalid srpms floating around everywhere. That's why a macro type that actually modifies the spec file in place before it is built into an srpm is needed. You don't necessarily need to understand it as "another layer" - it is just a new macro type that does actual preprocessing (as I understand it e.g. from the C language, i.e. the text of the program is changed before it is passed to the compiler). I don't see any update on the https://rpm.org/user_doc/macros.html page since I started to work on this, so I assume there isn't some hidden new feature that would already do this and that you had in mind.
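The difference between the two evaluation times can be shown with a toy model in Python. This is only a sketch: real %() runs a shell command and real rpkg macros are not implemented like this. Both functions and the `repo_name` key are hypothetical stand-ins; a missing dictionary entry plays the role of "not a git repository".

```python
import re


def expand_at_build(spec: str, context: dict) -> str:
    """Toy stand-in for build-time expansion: %(key) is looked up on every build."""
    def lookup(match):
        key = match.group(1)
        if key not in context:
            raise RuntimeError(f"cannot expand %({key}): context missing")
        return context[key]
    return re.sub(r"%\((\w+)\)", lookup, spec)


def preprocess(template: str, context: dict) -> str:
    """Toy stand-in for preprocessing: {{{ key }}} is replaced once, up front."""
    return re.sub(r"\{\{\{\s*(\w+)\s*\}\}\}", lambda m: context[m.group(1)], template)


git_context = {"repo_name": "hello"}  # available inside the dist-git checkout
no_context = {}                       # e.g. rebuilding the srpm from /tmp

late = "Name: %(repo_name)"                                 # stored unexpanded
early = preprocess("Name: {{{ repo_name }}}", git_context)  # stored as plain text

print(expand_at_build(late, git_context))  # works inside the checkout
print(expand_at_build(early, no_context))  # still works outside of it
try:
    expand_at_build(late, no_context)
except RuntimeError as err:
    print(err)                             # fails once the context is gone
```

The preprocessed text no longer depends on where it is rebuilt, which is the property the srpm needs.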

The idea is to support something like you do here: https://github.com/rpminspect/rpminspect/blob/6ae532c1350149cb16522dd00b4a2887ef15e115/rpminspect.spec.in directly in Fedora infrastructure (right now by support in rpkg/mock, maybe later in rpm directly).

Why? Because that will enable us to generate certain spec file parts from git metadata - therefore we can:
- get rid of conflicts on Release and Changelog fields between branches in dist-git (the original motivation)
- offer new workflows that will make packager's work easier as far as handling changes in spec files and patches goes (in certain use-cases)

I am sorry, maybe I am repeating myself and maybe I should write this in the associated mailing list thread, but I would really like to avoid a situation where people think that this change proposal overlooks something obvious (something they think can easily be used here, which would later prove not to be true).

By preprocessing I mean rpmbuild's macro expansion. You are correct that it does not modify the spec file in place, but I wouldn't expect it to. cpp does not modify my C source files when I compile with gcc. That would be unexpected behavior.

There are definitely limits to rpm's macro expansion and I'm not saying it is perfect. The expectation is that you will be building an RPM in an environment that provides /usr/lib/rpm with data rpm needs to expand macros. On foreign systems that cannot do that, you have the ability to provide your own ~/.rpmmacros file to fill in the gaps.

On the example with my rpminspect.spec.in file, if the end goal is to get certain package metadata from git, then I would much rather see a system that just lets me get rid of that line in the spec file altogether. I don't want to automate rewriting that line with metadata from git, I would rather be able to pass the version string on the rpmbuild command line (for example). Now it's no longer a template.

By preprocessing I mean rpmbuild's macro expansion. You are correct that it does not modify the spec file in place, but I wouldn't expect it to. cpp does not modify my C source files when I compile with gcc. That would be unexpected behavior.

The similarity is that cpp also has a separate preprocessing core that only does textual substitutions and doesn't have any relationship (i.e. inter-program communication except handing over the result) with the final compiling core. rpkg preprocessing also doesn't modify the actual sources (i.e. the original spec template), it only creates the exported version of it that can be further processed by rpm (the compiling core). The use-cases are different; I used the comparison to highlight the difference between %() (part of the "compilation core") and {{{ }}} (part of the "preprocessing core"). I wasn't trying to say that the two use-cases match 1:1. The difference is that cpp preprocessing is never dependent on context (e.g. git metadata surrounding the sources), whereas we need some kind of preprocessing (or processing) which is dependent on context (that's the essence of generating parts of the spec from git metadata). This is a bit of a philosophical debate, which doesn't mean it's wrong, just that we would need to be really precise to talk about it, analyze the differences and understand each other.

There are definitely limits to rpm's macro expansion and I'm not saying it is perfect. The expectation is that you will be building an RPM in an environment that provides /usr/lib/rpm with data rpm needs to expand macros. On foreign systems that cannot do that, you have the ability to provide your own ~/.rpmmacros file to fill in the gaps.

Yes, this might be a potential shortcoming of the approach I am suggesting but it's also quite theoretical one. I.e. I would like to see a practical example of the system somewhere which consumes Fedora spec files directly from dist-git and is not able to import the new tooling.

Once you need to do some manual changes in the system like editing ~/.rpmmacros, you really also can go to the spec and change it manually, yes, it's more work but eventually you can just import the new tooling.

On the example with my rpminspect.spec.in file, if the end goal is to get certain package metadata from git, then I would much rather see a system that just lets me get rid of that line in the spec file altogether. I don't want to automate rewriting that line with metadata from git, I would rather be able to pass the version string on the rpmbuild command line (for example). Now it's no longer a template.

That sounds nice in theory, but I don't think it is possible to just feed rpmbuild with a bunch of variables and let it spit out the final spec without any input text file. a) it would be horrible to generate that command line, b) you might end up having to deal with a bunch of manually maintained text snippets (i.e. for the %check section, %prep section, %build section) which you then need to pass as input params. What you suggest would only be convenient if you could actually automate generation of the whole spec file without a need for the packager's input. But I don't think that is possible. There is also no in-between step: either you have full generation and it works, or you don't and it's a total pain.

Maybe I am misunderstanding what you were suggesting. Feel free to give me a more concrete description.

Actually...if you meant that you have a standard spec with some %name, %version, etc. macros that you provide from outside by using --define (that's another way to interpret what you said), then this again hits the problem that it is not compatible with the existence of source rpms (or even source tarballs), because you cannot replicate those --defines later when you build an rpm (or another srpm). I personally think rpm should maintain its own source format; that's why I came up with what I came up with. If there is some plan to ditch srpms, then ok, this would work.


I am happy to hear these arguments though. I am less happy that people didn't argue like this first without giving a verdict. Then e.g. also rpm devs could join and say something.

Actually...if you meant that you have a standard spec with some %name, %version, etc. macros that you provide from outside by using --define (that's another way to interpret what you said), then this again hits the problem that it is not compatible with the existence of source rpms (or even source tarballs), because you cannot replicate those --defines later when you build an rpm (or another srpm). I personally think rpm should maintain its own source format; that's why I came up with what I came up with. If there is some plan to ditch srpms, then ok, this would work.

...Maybe rpm could add support for maintaining those defines in a file in the srpm alongside the spec file (e.g. --persist-define), but this is still not the solution. It's not enough to just store %version, %name, etc., because you need to keep an option for customizing the generation of individual fields from within the spec file (even rpmautospec does it), so you would essentially need to copy in basically the whole .git (or a large portion of it). I was thinking through these scenarios before; I just remembered this now.

Well, I know that RPM today already stores compiler flags in the SRPM (even though nobody uses that for anything). It would be reasonable for some mechanism to capture input macros into the SRPM header for reproducing all the inputs of the build environment.

Well, I know that RPM today already stores compiler flags in the SRPM (even though nobody uses that for anything). It would be reasonable for some mechanism to capture input macros into the SRPM header for reproducing all the inputs of the build environment.

%macro syntax has a certain meaning today (i.e. to evaluate again and again upon rebuilding) and that should be maintained so I think you might need to add some --replay flag for rpmbuild to get that behavior of evaluation of spec "from the srpm cache". The question is whether this flag is or isn't a bigger nuisance than what I am proposing (because there I offer an explicit syntax of macro evaluations that should be maintained over rebuilds vs. those that shouldn't and packager determines it when constructing the spec file, which might be more correct than someone switching that flag on and off later). But this is interesting...

== Spec-file canonicity and provenpackager workflows

I agree with @mhroncok that this is an issue: the addition of the preprocessing step means that spec files are not canonical any more, complicating provenpackager and releng workflows. It doesn't really matter how complicated the step is. Every use case that could just look at the .spec file now needs to have two paths. It is possible that in various scenarios the .spec.rpkg file could be used without preprocessing (because the interesting parts don't use the rpkg macros), but we can't know that, since the macros in principle could be used for anything. This creates a barrier to any kind of automated munging.

Example: a few years ago we renamed python2 subpackages [1, 2] also using a semi-automatic script that modified the spec files using some heuristics. With .spec.rpkg, we would have three not very attractive choices:

  • ignore the rpkg macros (and just use .spec.rpkg in place of .spec). This would probably work, but only as long as the rpkg macros are not used too much.
  • preprocess the .rpkg file to .spec, but then when the spec file is modified, somehow reverse the preprocessing to apply the change to the unprocessed .rpkg file. This is impossible to do in the general case.
  • implement a separate workflow that implements equivalent munging for rpkg files.
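A small Python sketch of why the first choice only works while the macros stay out of the way. The regex is a simplified stand-in for the heuristics of the real rename script, and the `{{{ hypothetical_python_subpackages }}}` placeholder is invented for illustration; it is not an actual rpkg macro.

```python
import re


def rename_python2_subpackage(text: str) -> str:
    """Heuristic edit: rename '%package -n python-foo' to 'python2-foo'."""
    return re.sub(r"^(%package -n )python-", r"\1python2-", text, flags=re.M)


# Plain spec: the stanza is literally present, so the heuristic finds it.
plain_spec = "%package -n python-foo\nSummary: Python bindings\n"

# Hypothetical template: the same stanza would be produced by a made-up macro,
# so there is nothing in the text for the heuristic to match.
template = "{{{ hypothetical_python_subpackages }}}\nSummary: Python bindings\n"

print(rename_python2_subpackage(plain_spec))
print(rename_python2_subpackage(template) == template)
```

The script reports success on both files, but only the plain spec was actually changed.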

Another example: the all-specs-seed tarball that Fedora provides [3]. Not an insurmountable complication, but suddenly you can't have "all spec files in Fedora".

[1] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/6AIJ6CYUYXACKR4TJH2ATY2JYWSHPRSG/
[2] https://lists.fedoraproject.org/archives/list/devel-announce@lists.fedoraproject.org/thread/DQ4VMCKFO7I5ZVRDTGEMISG6LG7P7BM7/
[3] I lost the link again. Is it documented somewhere?

== Complexity

Two lessons from introduction of Modularity:

a) it seems that complexity can be injected into the system in localized fashion, but in fact there are so many connections between different packages that this additional functionality doesn't just affect the place where it is used, but has a ripple effect across other packages and packagers.

b) once added, it is very hard to remove complexity. (There are essentially no modules in Fedora, but dnf, koji, bodhi, dist-git, releng renaming branches, every repoquery, etc, have to support it, systems have the repos enabled, packagers need to be aware.)

In theory the proposed rpkg preprocessing is opt-in, but even packagers who only work on "leaf" packages, sometimes need to do local rebuilds for testing and such. This means that pretty much everyone would have to be aware of the rpkg functionality and include it in their workflows. In other words, if we decide to enable rpkg, we're not just enabling it for some packages, we're enabling it for all of Fedora, for the foreseeable future.

And complexity affects not only tools, but also humans. An extra step in the workflow is something that makes the system harder to grasp and use, especially for new packagers. The bar for changes to the fundamental packaging workflow needs to be high.

== Alternatives

The proposals to add new packaging workflows have not had an easy life ;)
But there are some cases where it worked properly: %generate_build_requires [4] is significant new functionality that was added and hasn't caused much discussion since despite being widely used. The process has been painfully slow (ticket opened in 2016, but I think it was discussed even earlier, implementation in 2019, use in Fedora even later than that). IMO, the reasons for success are:
a) the feature is simple (rpmbuild calls a script, the script writes text to output, this output is used for one purpose and one purpose only), so it's easy to explain.
b) the feature is part of rpm. We don't have to worry about downstreams having issues with our srpms, or support in other tools.
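To make point a) concrete, here is a sketch of the kind of script such a section could call, written in Python. The dependency list and the helper name are hypothetical; the only thing assumed about rpm is that it reads the lines the script prints and treats them like BuildRequires entries.

```python
def generate_build_requires(deps):
    """Yield one dependency per line, in the syntax used after 'BuildRequires:'."""
    for name, constraint in deps:
        # An empty constraint yields just the package name.
        yield f"{name} {constraint}".strip()


# Hypothetical dependency list, for illustration only.
deps = [("python3-devel", ""), ("python3dist(pytest)", ">= 6")]

for line in generate_build_requires(deps):
    print(line)
```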

And I think this is the template that we should follow also for things like in this Change: enhance rpm and rpmbuild themselves whenever that is the architecturally best place for changes. This is much more painful and slow, but gives better results in the long run.

[4] https://github.com/rpm-software-management/rpm/issues/104

-0 for now. The costs are pretty clear, and the functionality we gain doesn't seem important enough.

Sorry for the long rambling, I hope it's at least better than a curt reply ;)

The similarity is that cpp also has a separate preprocessing core that only does textual substitutions and doesn't have any relationship (i.e. inter-program communication except handing over the result) with the final compiling core.

The mention of cpp is interesting, as is the relation between the cpp preprocessor and the C or C++ compiler. cpp doesn't "speak" the same language as the compiler, so the programmer needs to learn both. And because the preprocessor doesn't understand what it is processing, it is both needlessly complicated and functionally limited. (A good example is the contortions that we need to go through to have nested side-effect-free arithmetic macros like max and min.) This approach made sense back in the day, but newer languages take different routes. Rust also has macros, but they are part of the language itself. This is harder to pull off, but the effect is worth the trouble.

Another example: the all-specs-seed tarball that Fedora provides [3]. Not an unsurmountable complication, but suddenly you can't have "all spec files in Fedora".

[3] I lost the link again. Is it documented somewhere?

https://src.fedoraproject.org/lookaside/
And more precisely: https://src.fedoraproject.org/lookaside/git-seed-latest.tar.xz

  • Sorry I saw Miro's answer afterward (and his link is the correct one, mine points to the tarball containing all the git repos, not just the spec files)

Hello Zbyszek! Thank you very much for the response.

== Spec-file canonicity and provenpackager workflows

I agree with @mhroncok that this is an issue: the addition of the preprocessing step means that spec files are not canonical any more, complicating provenpackager and releng workflows.

Just to be sure, what do you mean by spec file canonicity? Do you mean the fact that they are not parseable by rpm directly? There is also an alternative interpretation of spec file being a canonical source for rpm package properties (which is not the case even today).

It doesn't really matter how complicated the step is. Every use case that could just look at the .spec file, now needs to have two paths. It is possible that in various scenarios the .spec.rpkg file could be used without preprocessing (because the interesting parts don't use the rpkg macros), but we can't know that, since the macros in principle could be used for anything. This creates a barrier to any kind of automated munging.

In principle, they could be used for anything but it's likely that there will be a predefined set of use-cases for which they can be used (and they could be used only for those) inside Fedora specs. For example, the rule can be that if a certain spec file snippet is dependent on something which is not included in srpm (i.e. something not contained in source archive or patch, typically git metadata), then that's a suitable use-case for {{{ }}} as an alternative to plain text.

Example: a few years ago we renamed python2 subpackages [1, 2] also using a semi-automatic script that modified the spec files using some heuristics. With .spec.rpkg, we would have three not very attractive choices:

  • ignore the rpkg macros (and just use .spec.rpkg in place of .spec). This would probably work, but only as long as the rpkg macros are not used too much.
  • preprocess the .rpkg file to .spec, but then when the spec file is modified, somehow reverse the preprocessing to apply the change to the unprocessed .rpkg file. This is impossible to do in the general case.
  • implement a separate workflow that implements equivalent munging for rpkg files.

I think you can just always use option number 1. The key is to have a defined set of use-cases for which rpkg preprocessing can be used. Also, content of {{{ }}} can be limited to a predefined set of macros (bash functions) instead of allowing any bash code.
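
As a rough illustration of the "predefined set of macros" idea, here is a minimal Python sketch of a check that a .spec.rpkg file only uses allowed names inside {{{ }}}. It is not part of rpkg; the allowlist contents and the function name disallowed_macros are assumptions made up for this example, with the three macro names taken from this ticket.

```python
import re

# Hypothetical allowlist; the three names are the ones mentioned in this ticket.
ALLOWED_MACROS = {"git_dir_name", "git_dir_release", "git_dir_changelog"}

# Matches {{{ body }}} and captures the body (non-greedy, may span lines).
MACRO_RE = re.compile(r"\{\{\{(.*?)\}\}\}", re.DOTALL)


def disallowed_macros(spec_text, allowed=ALLOWED_MACROS):
    """Return the sorted names used inside {{{ }}} that are not on the allowlist."""
    found = set()
    for body in MACRO_RE.findall(spec_text):
        words = body.split()
        # The first word is treated as the macro name, the rest as arguments.
        # An empty {{{ }}} is reported as the empty string.
        name = words[0] if words else ""
        if name not in allowed:
            found.add(name)
    return sorted(found)
```

A policy check like this could run in review tooling: a file for which the function returns an empty list uses only the agreed macros, so scripts that ignore the {{{ }}} parts know exactly what they are skipping.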

Generally, parsing spec files with regular expressions is wrong but I don't think this feature makes it any worse than it is (at least if the regular expressions used are to the point and guidelines are being followed).

Another example: the all-specs-seed tarball that Fedora provides [3]. Not an unsurmountable complication, but suddenly you can't have "all spec files in Fedora".

Well, you can have an archive of all (extended) spec files in Fedora.

[1] https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/6AIJ6CYUYXACKR4TJH2ATY2JYWSHPRSG/
[2] https://lists.fedoraproject.org/archives/list/devel-announce@lists.fedoraproject.org/thread/DQ4VMCKFO7I5ZVRDTGEMISG6LG7P7BM7/
[3] I lost the link again. Is it documented somewhere?

== Complexity

Two lessons from introduction of Modularity:

a) it seems that complexity can be injected into the system in localized fashion, but in fact there are so many connections between different packages that this additional functionality doesn't just affect the place where it is used, but has a ripple effect across other packages and packagers.

b) once added, it is very hard to remove complexity. (There are essentially no modules in Fedora, but dnf, koji, bodhi, dist-git, releng renaming branches, every repoquery, etc, have to support it, systems have the repos enabled, packagers need to be aware.)

In theory the proposed rpkg preprocessing is opt-in, but even packagers who only work on "leaf" packages, sometimes need to do local rebuilds for testing and such. This means that pretty much everyone would have to be aware of the rpkg functionality and include it in their workflows. In other words, if we decide to enable rpkg, we're not just enabling it for some packages, we're enabling it for all of Fedora, for the foreseeable future.

Yes, people should be aware of it but unless someone insists on building (s)rpms from dist-git repos by bare rpmbuild, it's not that difficult to adjust their workflow.

And complexity affects not only tools, but also humans. An extra step in the workflow is something that makes the system harder to grasp and use, especially for new packagers. The bar for changes to the fundamental packaging workflow needs to be high.

I think people already understand the concept of templating/preprocessing from other areas so it shouldn't be a problem to quickly get familiar with the application here. Also it is often enough to understand what-it-does (generates changelog and release automatically) instead of what-it-is and how-it-works. The suggested preprocessing doesn't have any side-effects further in the build system (a big advantage in my opinion) so you can just say:

  • use Release: {{{ git_dir_release }}} (or git_dir_release_branched inside the braces depending on our agreement) to get automatically generated release
  • use {{{ git_dir_changelog }}} to get automatically generated %changelog content from git tag annotations
  • use fedpkg to work with such a spec file in Fedora DistGit

...and it pretty much captures the whole picture (as for automatic changelog and release).
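
To make the "purely textual, no side effects" point concrete, here is a toy Python sketch of the substitution step. It is only an illustration under stated assumptions: the real rpkg preprocessor evaluates bash functions and derives values from git metadata, whereas the expand function and the fake_macros table below are hypothetical stand-ins returning fixed strings.

```python
import re

# Matches {{{ name }}} where name is a single identifier without arguments.
MACRO_RE = re.compile(r"\{\{\{\s*(\w+)\s*\}\}\}")


def expand(template, macros):
    """Replace each {{{ name }}} with the string returned by macros[name]()."""
    def substitute(match):
        name = match.group(1)
        if name not in macros:
            raise KeyError(f"unknown macro: {name}")
        return macros[name]()
    return MACRO_RE.sub(substitute, template)


# Stand-in values; real ones would be computed from git tags and history.
fake_macros = {
    "git_dir_name": lambda: "hello",
    "git_dir_release": lambda: "3",
    "git_dir_changelog": lambda: "- Example entry",
}

template = (
    "Name: {{{ git_dir_name }}}\n"
    "Release: {{{ git_dir_release }}}\n"
    "%changelog\n"
    "{{{ git_dir_changelog }}}\n"
)

# The result is an ordinary spec fragment with no {{{ }}} left in it.
spec = expand(template, fake_macros)
```

The output of the step is plain spec text, which is all that the rest of the build system ever sees.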

== Alternatives

The proposals to add new packaging workflows have not had an easy life ;)
But there are some cases where it worked properly: %generate_buildrequires [4] is significant new functionality that was added and hasn't caused much discussion since, despite being widely used. The process has been painfully slow (ticket opened in 2016, but I think it was discussed even earlier, implementation in 2019, use in Fedora even later than that). IMO, the reasons for success are:
a) the feature is simple (rpmbuild calls a script, the script writes text to output, this output is used for one purpose and one purpose only), so it's easy to explain.

That's pretty much how you could explain rpkg macros from this proposal also (except that in this proposal the invocations happen outside of rpmbuild).

But you say the feature is simple while it's not that simple. To get a list of generated build requires for a package, you need to execute rpmbuild -br which generates buildreqs.nosrc.rpm and then you need to read the Requires field from the header of that nosrc.rpm. Additionally, you might need to do it multiple times in some cases to actually get all the build requires (https://github.com/rpm-software-management/mock/issues/276 but I think this requirement could be dropped in theory). So it's really quite hard to actually get the list of build requires for a certain spec file. dnf builddep doesn't really work on those spec files properly (it will just install static build requires and not the dynamic ones). And rpmbuild -bs generates an srpm which contains a metadata header field (Requires) with basically incomplete information.
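
The "do it multiple times" part can be sketched as a fixed-point loop. This Python sketch is an assumption-laden illustration, not how mock is implemented: generate is a hypothetical callable standing in for one rpmbuild -br run plus reading Requires from the resulting nosrc.rpm, and the package names in fake_generate are made up.

```python
def resolve_build_requires(static_requires, generate):
    """Repeat the generation step until it reports nothing new.

    generate(installed) is a hypothetical stand-in for one `rpmbuild -br`
    run followed by reading Requires from the produced nosrc.rpm. It
    returns the requirements visible when `installed` is present.
    """
    installed = set(static_requires)
    rounds = 0
    while True:
        rounds += 1
        reported = set(generate(installed))
        missing = reported - installed
        if not missing:
            return installed, rounds
        # In a real build environment this is where packages get installed.
        installed |= missing


def fake_generate(installed):
    """Made-up generator where each requirement reveals the next one."""
    reqs = {"python3-devel"}
    if "python3-devel" in installed:
        reqs.add("python3-setuptools")
    if "python3-setuptools" in installed:
        reqs.add("python3-wheel")
    return reqs


result, rounds = resolve_build_requires({"python3-devel"}, fake_generate)
```

With this fake generator the loop needs three rounds before the set stops growing, which is the reason a single static header field cannot carry the full list.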

I would say this feature stretches rpm capabilities quite far. But I am not saying I have a better solution. Maybe these issues are relatively minor and the solution is as optimized as it can possibly be. It's good to have something that works for people.

b) the feature is part of rpm. We don't have to worry about downstreams having issues with our srpms, or support in other tools.

While the feature is part of rpm, it required support in mock for correct functioning so if e.g. rust packages (that use dynamic buildrequires extensively) are being ported into OBS, it's likely that support needed to be added there as well (and in other environments also, which would use those spec files and which don't use mock). So we should have been thinking about support in other tools.

Also we should have been thinking about downstreams having issues with our srpms as they now no longer contain the full list of build requires in their header as they did before.

And I think this is the template that we should follow also for things like in this Change: enhance rpm and rpmbuild themselves whenever that is the architecturally best place for changes. This is much more painful and slow, but gives better results in the long run.

I think the %generate_buildrequires feature and this proposal aren't directly comparable for the reason that you cannot generate valid/buildable srpms by rpmbuild if the spec file depends on the original context where it is placed. That is not the case for the %generate_buildrequires feature as there, the data from the source archive is used.

It doesn't, of course, mean that the feature cannot be in rpm, just that it's a different case here.

[4] https://github.com/rpm-software-management/rpm/issues/104

-0 for now. The costs are pretty clear, and the functionality we gain doesn't seem important enough.

There were many threads in the past that requested or talked about this feature (in the change proposal, I have collected 13 but the real number will probably be higher). There were talks about this at Flock and DevConf. There are now three proposals trying to solve that particular problem of auto-generating changelog and release and therefore also resolving conflicts between branches. If what you are saying is true, it would mean that people are constantly coming up with (and wanting to solve) unimportant things. It's hard to believe that's the case.

In addition, this proposal also offers a solution to other problems mentioned in other threads (e.g. making work with a high volume of patches easier or building from upstream directly in Copr by using the original dist-git spec file). Historically, I also drew inspiration from some dist-git feature requests (e.g. this comment here: https://github.com/release-engineering/dist-git/issues/1#issuecomment-103183017 and related comments about gnome-CI - the original feature request is not important here) and from discussions with my RH colleagues back then.

My view (based on the data) is that the functionality will be useful for people, therefore it is important in my opinion.

Sorry for the long rambling, I hope it's at least better than a curt reply ;)

The similarity is that cpp also has a separate preprocessing core that only does textual substitutions and doesn't have any relationship (i.e. inter-program communication except handing over the result) with the final compiling core.

The mention of cpp is interesting, as is the relation between the cpp preprocessor and the C or C++ compiler. cpp doesn't "speak" the same language as the compiler, so the programmer needs to learn both. And because the preprocessor doesn't understand what it is processing, it is both needlessly complicated and functionally limited. (A good example is the contortions that we need to go through to have nested side-effect-free arithmetic macros like max and min.) This approach made sense back in the day, but newer languages take different routes. Rust also has macros, but they are part of the language itself. This is harder to pull off, but the effect is worth the trouble.

I am not an expert here but I think comparing these two things can be misleading. Rust is a general purpose language whereas rpmspec is a packaging language. So I don't think you can say that if Rust is doing something in a particular way, it should be done the same way in rpm.

AFAIK, Rust macros are an extended version of C macros in the sense that they do AST manipulations instead of just plain (unparsed) text manipulations but I don't think this is something rpm would need to do. C/C++ don't need it either and these are general purpose languages.

But I think you wanted to suggest that the feature should be directly in rpm. I tentatively agree but as far as the implementation in this proposal goes, I think it can function well outside of rpm and then be ported into rpm if people think it is a good idea (I don't think it is completely clear at this point). I would much prefer this organic development for the implementation suggested here. In other words, I am not too excited to keep making this more and more optimized without this being actually used (because usage brings valuable feedback).

But I think you wanted to suggest that the feature should be directly in rpm. I tentatively agree but as far as the implementation in this proposal goes, I think it can function well outside of rpm and then be ported into rpm if people think it is a good idea (I don't think it is completely clear at this point). I would much prefer this organic development for the implementation suggested here. In other words, I am not too excited to keep making this more and more optimized without this being actually used (because usage brings valuable feedback).

Of course, if I got some connection with rpm devs and they would support it, probably that could be a different story but I still think this proposal is fine as it is and doesn't really need to be pushed any further at this point.

  • use Release: {{{ git_dir_release }}} (or git_dir_release_branched inside the braces depending on our agreement) to get automatically generated release
  • use {{{ git_dir_changelog }}} to get automatically generated %changelog content from git tag annotations
  • use fedpkg to work with such a spec file in Fedora DistGit

I would like to be honest here. There is one more point. You should also use:

Name: {{{ git_dir_name }}}

otherwise you would need to specify name= argument for the following invocations of git_dir_release and git_dir_changelog macros (it's documented here: https://docs.pagure.org/rpkg-util/v3/macro_reference.html)

This proposal was rejected by FESCo in today's meeting.

Metadata Update from @bcotton:
- Issue untagged with: meeting
- Issue tagged with: pending announcement

3 years ago

We discussed this during today's FESCo meeting:
REJECTED (0, ±1, -7)
Two votes were cast as "for 34", the others without such qualification.

Announced https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/KOC5PRFMP6OGWD36JNP276YTZKISAUY4/ .

Metadata Update from @zbyszek:
- Issue untagged with: pending announcement
- Issue close_status updated to: Rejected
- Issue status updated to: Closed (was: Open)

3 years ago
