#715 Separately building package documentation
Closed: accepted 5 years ago Opened 6 years ago by tibbs.

I wanted to bring up this issue from a discussion/session I had with @psabata at Flock.

Basically, there are packages in the distribution with a problematic property: If you want to build just the software, the dependency tree is small. But if you want to build the software and its documentation, the dependency tree includes thousands of packages.

One of the things discussed is finding the minimal set of packages which are required to build all of the packages in the set, which I believe was called the "minimal self-hosting set". For the modularity effort, as well as bootstrapping and basically just general sanity, it is beneficial to minimize that set. (Currently I think it's something over 3000 packages which is... a lot.)
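To make the idea of a "minimal self-hosting set" a bit more concrete: it is roughly the transitive closure of build dependencies over the packages you care about. The sketch below is only an illustration under assumptions: the package names, the dependency edges, and the `DOC_ONLY` table are all hypothetical and are not real Fedora data. It shows how leaving out documentation-only build dependencies can shrink that closure.

```python
from collections import deque

# Hypothetical build-dependency graph: package -> packages needed to build it.
# Names and edges are made up for illustration; they are not real Fedora data.
BUILD_REQUIRES = {
    "tool": {"compiler", "doc-toolchain"},
    "compiler": {"libc"},
    "libc": set(),
    "doc-toolchain": {"fonts", "script-interpreter"},
    "fonts": set(),
    "script-interpreter": {"compiler"},
}

# Hypothetical table of build dependencies needed only to generate documentation.
DOC_ONLY = {"tool": {"doc-toolchain"}}


def build_closure(targets, graph, skip=None):
    """Return every package needed, transitively, to build `targets`."""
    skip = skip or {}
    seen = set(targets)
    queue = deque(targets)
    while queue:
        pkg = queue.popleft()
        # Ignore any dependencies listed in `skip` for this package.
        for dep in graph.get(pkg, set()) - skip.get(pkg, set()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


with_docs = build_closure(["tool"], BUILD_REQUIRES)
without_docs = build_closure(["tool"], BUILD_REQUIRES, skip=DOC_ONLY)
print(len(with_docs), len(without_docs))
```

In this toy graph the closure drops from six packages to three once the documentation toolchain is no longer pulled in by `tool`.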

I believe it was mentioned that the set could be cut in half if you didn't have to build the documentation for things like coreutils (which needs texlive) in order to get /usr/bin/ls and friends. There was some talk about how best to do this and I suggested that the most reasonable way was just to have one package that builds the executables and another package which builds the documentation.

Pros of this approach:
* It's really simple.
* It can be implemented immediately, with no need for RPM macros or hacking.
* It's not forbidden by packaging guidelines as far as I am aware.
* I couldn't in an hour come up with any other reasonable way to do it that didn't require more nastiness than I was willing to consider.
* If the documentation is large or takes ages to generate (looking at you, ansible) then you can do minor bumps to the package without having to regenerate the documentation.

Cons:
* You now have two packages which you have to keep synchronized.
* It just seems weird (to me, at least) to have two source packages that use the same sources but different specfiles.
* Some packages might not let you build just the documentation without hacking their build systems, which kind of wastes a bit of time building the executables twice.

I think it's OK (according to the guidelines) to do this today, but it might be reasonable to mention it explicitly (and I'll draft something if folks agree). On the other hand, if folks disagree or have other ideas for accomplishing this, then please speak up.


Metadata Update from @tibbs:
- Issue tagged with: meeting

6 years ago

I definitely agree that this is a good idea which should make builds way faster. Bootstrapping also becomes way easier. BUT: I don't really like the idea of having 2 different source packages. I think we need some automation here. E.g. we could have some macro in the spec which would build just the documentation; koji would build that one first and extract the docs, and the second build (the one without docs) would include those docs directly.

The exact mechanism needs to be thought through, but as I said -- I fully support the initiative!

The problem is that I can think of no macro which would do this. You would have one spec file generating two completely different packages.

The idea isn't just to have a "doc-less" build where you pass a flag. It's to not have the dependencies exist in the package at all, and to allow the documentation package to move at a different rate than the package containing the software. To try to cram all of that into one source package and then teach the buildsystem about it seems to be rather the wrong way to accomplish it.

I concur with @ignatenkobrain. This is IMO a use case for the bootstrapping guidelines [1] or something similar ....

What kind of documentation would this separation apply to? Manual pages? Do we want to have manual pages in a separate package? Will we have dependencies between these two packages? How tight would those dependencies be?

If we put documentation into a separate package, it won't be a build-time problem anymore; it will become a run-time problem.

I don't like hacking packages ad hoc. That leads to inconsistencies and a terrible user experience. Without solid support from RPM/DNF for optional features (I want documentation, I do not want documentation, I want an exception for this one package), it's difficult to deliver any consistent packager and user experience. (Where are the rich dependencies that were supposed to solve it? I remember glibc wanted to use them for installing langpacks. It's a very similar use case, yet not resolved.)

There was some talk about how best to do this and I suggested that the most
reasonable way was just to have one package that builds the executables and
another package which builds the documentation.

Please don't concentrate on documentation only, but also take into account large
test-suites (where set(TestRequires) >= set(BuildRequires)). More to this particular
proposal -- I can not imagine cutting out the documentation/testing stuff into separate
packages downstream only, because it would make the maintenance much, much
more expensive, and as such it would be contrary to modularity's main promise
(lowering maintenance costs).

-1 I have trouble imagining any useful use case for this proposal.

Thing is, this isn't about bootstrapping.

The useful use-case was detailed in my initial message.

In any case, FPC can of course simply ignore the issue, since nothing in it is banned currently so interested packagers can simply do this now. We can simply not have any mention of it in the guidelines and people can go off and implement it in whatever random way they like. I personally think it would be better to offer guidance, but if there's no consensus on what guidance to offer then so be it.

Having bootstrapped Fedora on RISC-V last year I can say that yes documentation is a problem. However it was a one-off* task to comment out complex documentation (and other complex dep) bits of spec files and build interim packages which can then be used to build other packages and eventually to build full packages. And the problem wasn't just documentation anyway.

So, for bootstrapping, don't worry about this. (For modularity the arguments might still apply)

[*] I say "one-off" but in fact we're just about to re-bootstrap because of incompatible kernel & glibc changes. Thankfully it'll be a bit easier this time since we've already been through it once.

Just for fun, I'll say it again: This isn't about bootstrapping. It's explicitly about docs and only docs. It is not related to test suites.

So, Rathann proposed the following in the meeting:

If building documentation requires many additional dependencies then you may choose to create a separate src.rpm package just for building documentation independently.

And to complete it, I propose the following as the complete text to add to https://fedoraproject.org/wiki/Packaging:Guidelines#Documentation:

If building documentation requires many additional dependencies then you MAY elect to not build it in the main package and instead create a separate *-doc source package which builds only the documentation. This separately packaged documentation MUST correspond to the version of the packaged software.

I'm trying to avoid giving anyone the impression that they have to do a pointless bump of the doc package whenever the main package changes while still requiring that they keep them synchronized in the way we expect, where the docs package actually contains the documentation for the version of the software in the main package.

Naming of the separate package is covered earlier in the section, though I thought it best to reiterate.

Implicit in "separate source package" is the whole review process thing. We could choose to exempt split documentation packages from the review process, if we wanted.

Do we need to say what the structure of the -doc package is? Should we add something about redefining %_pkgdocdir (so that you can split off docs and not have them change location)?

We discussed this at this week's meeting (https://meetbot-raw.fedoraproject.org/fedora-meeting-1/2017-10-26/fpc.2017-10-26-16.00.txt):

Just for fun, I'll say it again: This isn't about bootstrapping. It's explicitly about docs and only docs. It is not related to test suites.
So, Rathann proposed the following in the meeting:

If building documentation requires many additional dependencies then you may choose to create a separate src.rpm package just for building documentation independently.

And to complete it, I propose the following as the complete text to add to https://fedoraproject.org/wiki/Packaging:Guidelines#Documentation:

If building documentation requires many additional dependencies then you MAY elect to not build it in the main package and instead create a separate *-doc source package which builds only the documentation. This separately packaged documentation MUST correspond to the version of the packaged software.

Just to be clear, does this mean that a new package that builds its docs with sphinx or latex will require two Fedora reviews? Is there any way to get both packages reviewed and approved in the same request?

I just learned about another related issue. For example, if we want to put git-lfs into SCL [1] (yeah, SCL are not in Fedora but that does not mean they don't exist), there needs to be available ronn [2] to build the man pages. However, ronn needs Ruby etc. The question here is how to have the man pages but not necessarily ronn available during build.

It seems that various SCLs have opted to place more trust in the packager's hands, e.g. building the manpages/documentation locally and shipping them as an additional source.

This also verges on the symmetry with "rpm -i --excludedocs PKG" on the consumer side.

Updated proposal from today's discussion

If building documentation requires many additional dependencies then you MAY elect to not build it in the main package and instead create a separate *-doc source package which builds only the documentation. This separately packaged documentation MUST correspond to the version of the packaged software. In other words, if a new release of the software includes changes to the documentation, then the documentation package MUST also be updated. But if the new version of the software does not include documentation changes, then you MAY choose not to update the documentation package.

Even longer:

If building documentation requires many additional dependencies then you MAY elect to not build it in the main package and instead create a separate *-doc source package which builds only the documentation. This separately packaged documentation MUST correspond to the version of the packaged software. In other words, if a new release of the software includes changes to the documentation, then the documentation package MUST also be updated. But if the new version of the software does not include documentation changes, then you MAY choose not to update the documentation package. A comment SHOULD be added next to the Version tag of the software package to remind others to bump the doc package as well if needed.
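The MUST/MAY rule in the wording above can be read as a simple check. The following is a minimal sketch of that reading, not something the guidelines define: the `Release` record and the `docs_changed` flag are hypothetical, and how a packager actually determines whether upstream documentation changed is left open.

```python
from dataclasses import dataclass


@dataclass
class Release:
    """A software release made since the -doc package was last built (hypothetical record)."""
    version: str
    docs_changed: bool  # assumption: the packager knows whether upstream docs changed


def doc_update_required(releases_since_doc_build):
    """True if the -doc package MUST be updated, False if updating is optional (MAY)."""
    return any(release.docs_changed for release in releases_since_doc_build)


# Two software bumps, only the second one touched the documentation.
pending = [Release("1.1", docs_changed=False), Release("1.2", docs_changed=True)]
print(doc_update_required(pending))       # docs changed in 1.2, so an update is required
print(doc_update_required(pending[:1]))   # only 1.1 so far, so the bump is optional
```

The point of the sketch is that the doc package tracks documentation changes rather than every version bump of the software package.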

We discussed this at this week's meeting (http://meetbot.fedoraproject.org/fedora-meeting-1/2018-04-12/fpc.2018-04-12-16.00.txt):

  • #715 Separately building package documentation (geppetto, 16:40:42)
  • LINK: https://pagure.io/packaging-committee/issue/715#comment-505572
    (mhroncok, 16:58:49)
  • ACTION: tibbs to paste bits of text agreed on together in the
    writeup. (geppetto, 17:06:51)
  • ACTION: Agreed on some wording changes for building docs packages
    separately (+1:5, 0:0, -1:0) (geppetto, 17:07:24)

Metadata Update from @james:
- Issue untagged with: meeting
- Issue tagged with: writeup

5 years ago

Announcement text:

The Documentation section of the main guidelines was expanded to include information about reducing build dependencies by building documentation in a separate source package.

Metadata Update from @tibbs:
- Issue untagged with: writeup
- Issue assigned to tibbs
- Issue tagged with: announce

5 years ago

Metadata Update from @tibbs:
- Issue untagged with: announce
- Issue close_status updated to: accepted
- Issue status updated to: Closed (was: Open)

5 years ago
