README.md

Welcome to depict, a program intended as an aid for distribution packagers who have to deal with cycles in dependency graphs. Depict has a set of plugins that can read dependency information from a variety of sources. The user specifies a set of packages of interest, and depict computes the smallest dependency graph containing those packages. The user can also specify packages that are not of interest; these (and their dependencies) are omitted from the graph. The resulting dependency graph can be output in several useful formats. The "dot" format is particularly useful; see the graphviz package for tools to manipulate the output file.

When the dependency information specifies build-time dependencies, the edges of the graph represent a "build-before" order, and the vertices represent packages. Build-time cycles are troublesome, because something has to be built first. Depict can help packagers find ways of breaking cycles; i.e., it computes a set of edges whose removal transforms the graph into a directed acyclic graph (DAG) representing the initial, or bootstrap, build order. We associate costs with various types of edges; e.g., removing a documentation dependency should be cheaper than removing a build dependency.

The problem of finding the lowest-cost set of edges to remove in order to make the graph acyclic is known as the Minimum Feedback Arc Set problem; see https://en.wikipedia.org/wiki/Feedback_arc_set. The problem is NP-hard and APX-hard. For the small graphs that are typical for distribution packaging, we can probably find an optimal solution in a reasonable amount of time. The external solvers ppl and cvxopt can be used to find an optimal solution, or an approximation algorithm is available if they take too long.

The program comes with a set of plugins that know how to read dependency information from a variety of sources. Internally, the dependency information is represented as a JSON object structured as follows:

{
  "node-types": {
    "a unique name for this node type": {
      "color": "any string that dot interprets as a color",
      "nodes": [names of nodes of this type, all added to the graph]
      "weak-nodes": [names of nodes of this type, not added to the graph]
    },
    ...
  },
  "default-node-type": "node type assigned to nodes by default",
  "edge-types": {
    "a unique name for this edge type": {
      "color": "any string that dot interprets as a color",
      "style": "any string that dot interprets as a line style",
      "cost": numeric value representing the cost of removing this edge type
    },
    ...
  },
  "default-edge-type": "edge type assigned to edges by default",
  "subnodes": {
    "parent node": [other nodes built from the same source distribution],
    ...
  },
  "node-name-map": {
    "input-type": <input-type-specific-value>,
    ...
  },
  "edges": {
    "from-node": {
      "edge-type-1": ["to-node-1", "to-node-2", ...],
      "edge-type-2": ["to-node-3", "to-node-4", ...],
    },
    ...
  },
  "weak-edges": {
    "from-node": {
      "edge-type-1": ["to-node-1", "to-node-2", ...],
      "edge-type-2": ["to-node-3", "to-node-4", ...],
    },
    ...
   }
}

Various inputs can each provide part of the dependency information. This feature supports automatic generation of some of the input files from upstream sources, with supplementary hand-written values in other files as needed.

To use depict at all, you will need NetworkX, available from Fedora in the python3-networkx package. In addition, you may wish to install one or more of the following packages:

  • python3-hawkey and python3-dnf to use the dnf input method
  • python3-rpm and python3-dnf to use the rpm input method
  • python3-pyparsing to use the opam input method
  • python3-beautifulsoup4 and python3-requests to use the opam input method without an opam-repository git checkout (NOT recommended)
  • python3-pplpy to use the PPL solver
  • python3-cvxopt to use the CVXOPT solver

Author: Jerry James loganjerry@gmail.com http://www.jamezone.org/