Issue #480: [modularity] adjust pkgset and gather phases to handle multiple inputs - pungi


Closed: Fixed 7 years ago. Opened 7 years ago by ralph.

This is for https://fedoraproject.org/wiki/Changes/ModularCompose

Currently, pungi pulls in rpms in the pkgset and gather phases from one tag in koji (or from one set of arch-specific repos locally). All of the variants are built from this same common pool.

For modularity, we'd like to be able to use multiple, distinct package sets in the same compose. This way, F26 Server, Workstation, and Cloud can be built from the F26 tag, but the ModularServer edition can be built from a generational core tag built by the base-runtime team using the module-build-service.

It looks like the phases currently support only one pool of input rpms (one package set), so I'd like to look at what it would take to allow multiple named package sets. If too many of the other phases assume that there is only one package set, this may not be feasible. In that case, we could try running a separate compose for each modular variant, but that puts more overhead on RCM/releng engineers, and we don't want to do that. If we can successfully allow multiple package sets per compose, then we (hopefully) won't hand off any increase in workload to RCM/releng.
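To make the idea of "multiple named package sets" concrete, here is a minimal sketch of how variants might be mapped to package sets. Everything in it is hypothetical: `PackageSet`, `resolve_variant_pkgsets`, the set names, and the base-runtime tag name are illustrative placeholders, not existing pungi APIs or configuration options.

```python
from dataclasses import dataclass, field


# Hypothetical model: none of these names are real pungi APIs or options.
@dataclass
class PackageSet:
    name: str                      # label used to refer to the set
    koji_tag: str                  # Koji tag the rpms would be pulled from
    rpms_by_arch: dict = field(default_factory=dict)


def resolve_variant_pkgsets(variant_to_pkgset, pkgsets):
    """Return {variant: PackageSet}, failing early on unknown set names."""
    by_name = {p.name: p for p in pkgsets}
    missing = sorted(set(variant_to_pkgset.values()) - set(by_name))
    if missing:
        raise ValueError("unknown package sets: " + ", ".join(missing))
    return {variant: by_name[name] for variant, name in variant_to_pkgset.items()}


# The traditional variants share one set; ModularServer gets its own.
PKGSETS = [
    PackageSet("f26", "f26"),
    PackageSet("base-runtime", "hypothetical-base-runtime-tag"),
]
VARIANT_TO_PKGSET = {
    "Server": "f26",
    "Workstation": "f26",
    "Cloud": "f26",
    "ModularServer": "base-runtime",
}

resolved = resolve_variant_pkgsets(VARIANT_TO_PKGSET, PKGSETS)
```

With a mapping like this, the single-package-set case is just the special case where every variant points at the same name.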

lsedlar commented 7 years ago

Here's a high-level overview: the PkgsetPhase pulls information about available packages. It can source them from Koji or from yum repos (the latter is not used very much, as far as I know). The main result of the pkgset phase is a set of in-memory lists of available packages, one for each architecture, as well as temporary yum repos containing the packages (one per architecture plus a global one with all the packages).
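As a rough illustration of the shape of that output (per-architecture lists plus one global list), consider the sketch below. It is a simplification under stated assumptions: `build_pkgset` is a hypothetical helper, not pungi code, and it ignores details such as noarch and source packages and the creation of the temporary repos.

```python
from collections import defaultdict


def build_pkgset(rpms):
    """Group (name, arch) pairs into per-arch lists plus a 'global' list.

    Hypothetical helper for illustration only; the real pkgset phase
    reads its data from Koji or from yum repos.
    """
    per_arch = defaultdict(list)
    for name, arch in rpms:
        label = f"{name}.{arch}"
        per_arch[arch].append(label)
        per_arch["global"].append(label)   # the global list holds everything
    return dict(per_arch)


pkgset = build_pkgset([
    ("bash", "x86_64"),
    ("bash", "aarch64"),
    ("vim", "x86_64"),
])
```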

The gather phase then works with these to run dependency solving and figure out which packages are supposed to go into which variant. Once these lists are finished, the createrepo phase creates the actual repositories in the compose/ subdirectory that will be shipped.
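The dependency-solving step can be pictured as a transitive closure over a "requires" graph, restricted to what the package set offers. The sketch below is a toy version of that idea, not the actual depsolver: `gather`, the `requires` mapping, and the package names are assumptions made for illustration, and real rpm dependencies (provides, conflicts, multilib, and so on) are left out.

```python
def gather(seed_packages, requires, available):
    """Return the seeds plus their transitive dependencies.

    seed_packages: packages explicitly listed for a variant
    requires:      {package: [packages it depends on]} (toy model)
    available:     set of packages present in the package set
    """
    selected = set()
    queue = list(seed_packages)
    while queue:
        pkg = queue.pop()
        if pkg in selected or pkg not in available:
            continue                       # already handled, or not in the pool
        selected.add(pkg)
        queue.extend(requires.get(pkg, ()))
    return selected


requires = {"httpd": ["apr", "glibc"], "apr": ["glibc"]}
available = {"httpd", "apr", "glibc", "gnome-shell"}
server_packages = gather(["httpd"], requires, available)
```

Note that `available` is the only place where the package set enters. With multiple package sets, each variant would presumably be solved against its own pool instead of a shared one.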

There is currently an assumption that all variants are using the same package set. An argument could be made that if you have different content in different variants, you essentially have different products and should run different composes.

Apart from the two phases already mentioned, there are other users of the package set outputs. The buildinstall phase uses the temporary repos as input for lorax. This could in theory be replaced with the repo created for a particular variant, but it would make the compose a lot slower, as buildinstall could no longer be done in parallel.

The createrepo phase is using the global temporary repo to speed up generating metadata (I think). This could just point to the particular named package set.

The productimg phase (which modifies the boot.iso) uses the in-memory package mapping to look up the anaconda package, as it needs some files from it. It's not used in Fedora at all.

jkaluza commented 7 years ago

Thanks for this comment, it's helpful.

jkaluza commented 7 years ago

> There is currently an assumption that all variants are using the same package set. An argument could be made that if you have different content in different variants, you essentially have different products and should run different composes.

I have to double-check that with Ralph, but my understanding is that we will produce ModularServer from multiple modules, and therefore we will need multiple input repositories for it anyway, even if it is a standalone product, because one module is one repo (currently).

Edited 7 years ago by jkaluza