#52 Bootstrapping problems with _init_repo_bases in module dependency analysis
Closed 6 years ago Opened 6 years ago by ncoghlan.

Attempting to run the tests locally currently fails for me with:

WARNING:root:Getting module information from mbs cache. Please run mbs-build local with dependency modules you are interested in before running this script.

This stems from https://pagure.io/modularity/modularity-tools/blob/master/f/modularity/module_deps_differ.py#_92, which does assorted complex things to try to get useful dependency info, and was written when even F26 Boltron wasn't really available yet.

Rather than trying to fix it directly, this area of the code is likely what we should be trying to replace with depchase (or at least concepts from that): https://github.com/fedora-modularity/depchase

Thoughts, @asamalik @nphilipp?


I filed https://github.com/fedora-modularity/depchase/issues/14 with depchase about potentially sharing the dependency chasing code between the two projects, but @ignatenkobrain isn't currently keen on that approach.

In the meantime, I'm going to work on https://pagure.io/modularity/modularity-tools/issue/49 to make the project's naming scheme less collision prone, and to reserve the name on PyPI.

Metadata Update from @ncoghlan:
- Issue assigned to ncoghlan

6 years ago

Since Igor would genuinely prefer not to offer a Python API for depchase, I'll instead pull that functionality directly into fedmod (which depchase may or may not be updated to rely on as a dependency at a later date).

Does it mean that you'll pull the code from depchase and fit it into fedmod?

Yeah, essentially fedmod will become both the library and the supported CLI, while depchase remains an internal tool.

Although that said, in my current draft, I've brought depchase in as fedmod._depchase, and I'm thinking I'll add an underscore to the other Python submodules as well, to make it completely clear that there's no stable public API as yet.

@asamalik @dhodovsk While considering how to automate downloading and managing the dependency metadata for depchase to run libsolv queries against, it occurred to me that Fedora already has a tool for managing that task: mock.

And given https://fedoraproject.org/wiki/Using_the_Koji_build_system#Using_koji_to_generate_a_mock_config_to_replicate_a_buildroot, relying on mock would also mean gaining a significant level of build system integration almost for free.

So I'm leaning towards having fedmod work as follows:

  1. By default, it just uses the repo metadata for the running system
  2. If you want it to use something else, either give it a mock config to use, or else run it in the mock chroot
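The two modes above could be sketched roughly as follows. This is only an illustration, not code from fedmod: `resolve_metadata_source` and `MOCK_CONFIG_DIR` are hypothetical names, and the sketch assumes mock's usual convention that a config name maps to `/etc/mock/<name>.cfg` while a value ending in `.cfg` is treated as a path.

```python
from pathlib import Path
from typing import Optional

# Assumption: system-wide mock configs live here (per-user configs are ignored in this sketch).
MOCK_CONFIG_DIR = Path("/etc/mock")


def resolve_metadata_source(mock_config: Optional[str] = None) -> Optional[Path]:
    """Return None to mean "use the running system's repo metadata",
    otherwise the path of the mock config to take repo definitions from.

    Hypothetical helper for illustration only.
    """
    if mock_config is None:
        # Mode 1: default to the repos configured on the running system.
        # This also covers the "run it inside the mock chroot" case.
        return None
    if mock_config.endswith(".cfg"):
        # Mode 2a: an explicit path to a mock config file.
        return Path(mock_config)
    # Mode 2b: a bare config name such as "fedora-26-x86_64".
    return MOCK_CONFIG_DIR / f"{mock_config}.cfg"
```

Running inside the mock chroot needs no special handling here, since from the tool's point of view the chroot simply is "the running system".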

https://pagure.io/modularity/modularity-tools/issue/48 (requesting --enablerepo/--disablerepo support) is another example of an RFE that could be resolved by using mock as the mechanism for repo metadata configuration.

As I experiment further with a mock-based approach to this problem, I'm thinking that we may actually be better off querying the DNF metadata via the CLI rather than the Python API. My rationale is that it will mean:

  1. We get to use the --pm-cmd option to mock (so we can easily switch which metadata we're resolving against)
  2. We get to use the existing dnf repoquery and dnf builddep commands, rather than having to reimplement their logic
  3. It creates a natural internal boundary within fedmod where the metadata query execution could be offloaded to an external service (since the rest of the client won't care about the difference between "run a subprocess locally" and "submit a query to a remote REST API")
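To make the three points above concrete, here is a minimal sketch of what such a boundary might look like. All function names are hypothetical and nothing here is taken from fedmod. It assumes that `dnf repoquery --requires <package>` prints one requirement per line, and that `mock -r <config> --pm-cmd` forwards the remaining arguments to the package manager (argument separation may need adjusting in practice).

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# A "runner" takes an argv list and returns the command's stdout as text.
Runner = Callable[[Sequence[str]], str]


def build_repoquery_command(package: str, mock_config: Optional[str] = None) -> List[str]:
    """Build the argv for a requirements query (hypothetical helper)."""
    query = ["repoquery", "--requires", package]
    if mock_config is None:
        # Query against the running system's repo metadata.
        return ["dnf", *query]
    # Assumption: mock passes everything after --pm-cmd to dnf.
    return ["mock", "-r", mock_config, "--pm-cmd", *query]


def run_locally(argv: Sequence[str]) -> str:
    """Default runner: execute the command as a local subprocess."""
    result = subprocess.run(
        list(argv), check=True, stdout=subprocess.PIPE, universal_newlines=True
    )
    return result.stdout


def query_requires(
    package: str,
    mock_config: Optional[str] = None,
    runner: Runner = run_locally,
) -> List[str]:
    """Return the sorted, de-duplicated requirements reported for a package."""
    output = runner(build_repoquery_command(package, mock_config))
    return sorted({line.strip() for line in output.splitlines() if line.strip()})
```

Because callers only ever see `query_requires`, the `runner` argument is the point where a local subprocess could later be replaced by something that submits the same query to a remote service. It also lets the parsing logic be exercised without dnf or mock installed, for example with `query_requires("bash", runner=lambda argv: "glibc\nfilesystem\n")`.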

There's something I don't understand happening where dnf is interacting unexpectedly with mock's --offline option: https://ask.fedoraproject.org/en/question/111687/how-do-i-combine-mocks-offline-option-with-dnf/

However, while that's an annoying performance problem (since it means I can't pass the "--offline" option to skip the metadata downloads), it doesn't break the actual functionality I want to access, so I'm going to continue heading down this path.

@ncoghlan I would recommend not using dnf for dependency resolution here. Before I rewrote depchase, it took ~30 minutes to generate modulemd for the self-hosting platform set, whereas now it takes less than 2 seconds.

@ignatenkobrain While that's good info, I'd still strongly prefer to avoid having to reimplement (and hence maintain indefinitely) the logic for features that are nominally already provided by DNF (i.e. repoquery and builddep).

If those have performance problems that make them unusable for this kind of automation in practice, then we need to apply the pressure needed to get the performance problems fixed, rather than leaving the supported customer facing tools in an inadequate state while we go off and do our own thing.

Would it make sense to enhance repoquery with module support?

Thinking further about @ignatenkobrain's feedback, it might make sense to break up fedmod's architectural goals into two distinct phases:

  1. Make it work the same way depchase does, without worrying too much about the risks of future divergence from dnf's own dependency resolution algorithms. This iteration would rely on mock config names to specify which repo metadata to query (when not using the metadata for the current system), but wouldn't use mock & dnf to perform any queries

  2. Decouple the metadata querying from the client application itself by enhancing DNF's native ability to efficiently perform the kinds of queries that fedmod needs (on the assumption that if they're useful for fedmod, they'll be useful for other tools as well), and then switching to calling DNF as a CLI, rather than as a Python API.

Replacement issues in the fedmod repo:

Metadata Update from @ncoghlan:
- Issue status updated to: Closed (was: Open)

6 years ago
