#72 Make it clear how to query module metadata for a yum/dnf repo
Closed: Fixed 2 days ago by asamalik. Opened 2 years ago by ncoghlan.

(Replacing https://pagure.io/fm-orchestrator/issue/717)

It's currently unclear how tools other than DNF (e.g. fedmod's modulemd generator) should be querying repo metadata to determine:

  1. What modules are defined by the repo
  2. Which RPMs are exported by those modules
  3. Which dependencies those exported RPMs are able to satisfy

This is a tracking issue to cover:

  1. Documenting the current repo-level metadata format that DNF itself uses
  2. Asking whether we want to make any changes to that format to better support certain kinds of queries
  3. Asking whether we want to add either a new DNF modulequery plugin, or else extend the existing repoquery plugin, to better support working with repo-level module metadata

Nick, you are my hero. :green_heart:

@ttomecek I'm currently trying to update https://pagure.io/modularity/fedmod to do this right, so I have a certain vested interest in the matter ;)

Bringing the most important discovery from https://pagure.io/modularity/fedmod/issue/9 over: the repository-level module metadata format is just a gzipped, concatenated stream of all of the individual modulemd files included when creating the repo.

The relevant modulemd function is https://modulemd.readthedocs.io/en/latest/modulemd.html#modulemd.loads_all
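
To make the format concrete, here is a minimal standard-library sketch that builds a gzipped stream from two concatenated YAML documents and splits it apart again. The document contents are made up and heavily abbreviated, and the naive split on `---` lines is only a stand-in for a real YAML stream parser such as modulemd.loads_all; all function names here are hypothetical.

```python
import gzip

# Two made-up, heavily abbreviated modulemd-like documents (hypothetical content).
DOC_A = "---\ndocument: modulemd\nversion: 1\ndata:\n  name: nodejs\n  stream: '8'\n...\n"
DOC_B = "---\ndocument: modulemd\nversion: 1\ndata:\n  name: django\n  stream: '1.6'\n...\n"


def build_repo_stream(docs):
    """Concatenate the YAML documents and gzip the result."""
    return gzip.compress("".join(docs).encode("utf-8"))


def split_documents(blob):
    """Decompress the stream and naively split it on '---' marker lines.

    This is an illustration only; a real consumer should use a YAML parser.
    """
    text = gzip.decompress(blob).decode("utf-8")
    docs, current = [], []
    for line in text.splitlines():
        if line.strip() == "---":
            # A new document starts; flush the previous one, if any.
            if current:
                docs.append("\n".join(current))
            current = [line]
        elif current:
            current.append(line)
    if current:
        docs.append("\n".join(current))
    return docs


def module_names(docs):
    """Pull the 'name:' value out of each document (naive line scan)."""
    names = []
    for doc in docs:
        for line in doc.splitlines():
            if line.strip().startswith("name:"):
                names.append(line.split(":", 1)[1].strip())
    return names
```

Calling split_documents(build_repo_stream([DOC_A, DOC_B])) yields one string per module document, which is the shape of input a stream-aware loader expects.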

Summarising what I learned in writing https://pagure.io/modularity/fedmod/pull-request/11:

  1. Start from repomd.xml as usual for repo metadata
  2. Extract the relevant relative location using the XPath query rpm:data[@type='modules']/rpm:location, where the rpm namespace prefix is mapped to http://linux.duke.edu/metadata/repo
  3. Read that file using gzip and modulemd.loads_all

Some example code for use with metadata stored locally:

import gzip
import os

import modulemd
from lxml import etree

# metadata_dir is assumed to be the local directory that contains "repodata/"
repomd_fname = os.path.join(metadata_dir, "repodata", "repomd.xml")
repomd_xml = etree.parse(repomd_fname)
# Map the "rpm" prefix to the repo metadata namespace used by repomd.xml
repo_modulemd = repomd_xml.find("rpm:data[@type='modules']/rpm:location", {"rpm": "http://linux.duke.edu/metadata/repo"})
if repo_modulemd is None:
    raise RuntimeError("No 'modules' entry found in repomd.xml. Is this metadata for a non-modular repo?")
# The href is joined onto metadata_dir, not onto the repodata directory
repo_modulemd_fname = os.path.join(metadata_dir, repo_modulemd.attrib["href"])
with gzip.open(repo_modulemd_fname, "r") as modules_yaml_gz:
    modules_yaml = modules_yaml_gz.read()
# Parse the concatenated YAML stream into individual modulemd objects
modules = modulemd.loads_all(modules_yaml)
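
For trying out the lookup from step 2 without lxml or a real repo, the following sketch runs the same query with the standard library's xml.etree.ElementTree against an inline sample. The sample repomd.xml is trimmed down and its file names are made up; find_modules_href is a hypothetical helper, not part of any library.

```python
import xml.etree.ElementTree as ET

# The prefix name is arbitrary; what matters is the namespace URI it maps to.
REPO_NS = {"rpm": "http://linux.duke.edu/metadata/repo"}

# Trimmed-down, made-up sample; real files carry checksums, timestamps, etc.
SAMPLE_REPOMD = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <location href="repodata/primary.xml.gz"/>
  </data>
  <data type="modules">
    <location href="repodata/modules.yaml.gz"/>
  </data>
</repomd>
"""

# Same sample without a 'modules' entry, as a non-modular repo would have.
SAMPLE_NON_MODULAR = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <location href="repodata/primary.xml.gz"/>
  </data>
</repomd>
"""


def find_modules_href(repomd_text):
    """Return the relative location of the module metadata, or None."""
    root = ET.fromstring(repomd_text)
    location = root.find("rpm:data[@type='modules']/rpm:location", REPO_NS)
    if location is None:
        return None
    return location.attrib["href"]
```

Returning None rather than raising lets a caller decide whether a non-modular repo is an error or simply means there is nothing to query.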

The current way to do this is to use modulemd v2.

Metadata Update from @asamalik:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

2 days ago