PR#43 Document naming policy.

Proposed 2 months ago by dmach
Modified 20 days ago
From forks/dmach/modularity master  into modularity master

@@ -7,6 +7,7 @@ 

      :maxdepth: 1

  

  

+     building-modules/naming-policy

      building-modules/developing

      building-modules/module-guidelines

      building-modules/building-local

@@ -0,0 +1,112 @@ 

+ =============

+ Naming Policy

+ =============

+ This document defines naming policy for modulemd metadata of final (built) modules.

+ This policy does **NOT** apply on sources such as modulemd yaml in dist-git.

+ 

+ The goal is to provide unique identifiers for modules

+ that are both human readable and also suitable for machine processing.

+ 

+ 

+ Fields

+ ------

+ 

+ * **N** - Name

+ * **S** - Stream

+ * **V** - Version

+ * **C** - Context

+ * **A** - Arch

+ * **P** - Profile

+ 

+ 

+ Separators

+ ----------

+ | Fields are separated with ':' (colon): N:S:V:C:A.

+ | If P is specified, it's separated from N:S:V:C:A with '/' (forward slash): N:S:V:C:A/P.

+ 

+ Examples::

+ 

+     # N:S:V:C:A

+     base-runtime:fedora-26:1:0123abcd:x86_64

+ 

+     # N:S:V:C:A/P

+     base-runtime:fedora-26:1:0123abcd:x86_64/minimal

+ 

+ 

+ Forms

+ -----

+ A form is a sequence of fields that fully or partially identifies a module.

+ 

+ 

+ Full Forms

+ ~~~~~~~~~~

+ 

+ N:S:V:C:A

+     Unique identifier of a module.

+ N:S:V:C:A/P

+     Unique identifier of a module profile.

+ 

+ 

+ Partial Forms

+ ~~~~~~~~~~~~~

+ 

+ Supported partial forms are: N [ : S [ :V [ :C ] ] ] [ :A ] [ /P ]

+ 

+ Namely:

+ 

+ * N

+ * N::A

+ * N:S

+ * N:S::A

+ * N:S:V

+ * N:S:V::A

+ * N:S:V:C

+ * N:S:V:C:A (identical to N:S:V:C::A)

+ * and all combinations with /P

+ 

+ Missing fields **SHOULD** be populated with recommended defaults:

+ 

+ Stream

+     defaults to the enabled or system default stream for the module in this particular order

+ Version

+     defaults to the latest available version in the module stream

+ Context

+     defaults to a value matching with already installed modules or modules involved in the transaction (not yet installed)

+ Arch

+     defaults to the system arch (e.g. DNF's $basearch)

+ Profile

+     defaults to the system default or 'default' profile

+ 

+ 

+ Allowed Characters

+ ------------------

+ **N** - Name

+     a-z A-Z 0-9 . - _ +

+ **S** - Stream

+     a-z A-Z 0-9 . - _ +

+ **V** - Version

+     0-9

+ **C** - Context

+     0-9 a-f

+ **A** - Arch

+     a-z A-Z 0-9 . - _ +

+ **P** - Profile

+     a-z A-Z 0-9 . - _ +

+ 

+ All fields **MUST** start and end with an alphanumeric character:

+ a-z A-Z 0-9

+ 

+ 

+ Forbidden Characters

+ --------------------

+ This paragraph serves as a design decision for future changes.

+ 

+ Following characters **MUST NOT** be part of any field:

+ 

+ * ':' (colon) - separator

+ * '/' (forward slash) - profile separator

+ * '\\\\' (backslash) - comon control character

+ * '*' (asterisk) - common wildcard

+ * '?' (question mark) - common wildcard

+ * '@' (at) - grpspec in YUM and DNF

+ * ' ' (space) - common separator
ignatenkobrain commented on line 57 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
I still don't see how this is useful and where it has been agreed on this. I still think that having *same* separator is wrong
ignatenkobrain commented on line 63 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
since version is optional, what `base-runtime:26` would mean? is `26` stream or version? I think this needs to be described very carefully which fields are optional and obviously change separator.
ignatenkobrain commented on line 109 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
why does it matter how files are names? what matters is metadata inside.
ignatenkobrain commented on line 121 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
obviously missing details here are: * ascending or descending * in which locale * natsort? * how to sort versions?
ignatenkobrain commented on line 43 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
I think specifying version without specifying stream is terribly wrong.
ignatenkobrain commented on line 80 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
why profile can't contain `_` or `.`?
ignatenkobrain commented on line 36 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
so since I think this is wrong, my proposal here is Docker/pip style: `N[:S[/V]][.A][+P]`
nphilipp commented on line 43 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
…except maybe if a module/stream is installed already and you want to update/downgrade while staying **in the same stream**
nphilipp commented on line 109 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
I think that's mostly to acommodate users encountering these files. Technically, RPM packages could be named anything, too, but the convention is N-V-R.A.rpm. IMO, the criterion could be downgraded from **MUST** to **SHOULD**.
nphilipp commented on line 121 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
* yeah, "ascending" probably * what do you mean with "locale"? Language, encoding? * module versions are mere integers, so I don't get what the problem with sorting them is

For full identification of a module - all fields are mandatory (N:S:V:A).
Partial form is used by tools (namely DNF) to make a query, return best matching module and work with it.

I don't understand where's the problem.
It has been working with YUM/DNF and RPMs for ages,
this proposal is even stricter on delimiters and makes parsing easier.

jkaluza commented on line 103 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
This will be very hard to follow in dist-git. You want to name your modulemd as `N.yaml` (N states for "name"). If this is talking about DNF specific yaml files cached locally, I think it should be removed from the doc and moved to different one.
jkaluza commented on line 121 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
To have diffable output, the sorting of all fields in modulemd.dumps() output has to be the same. Should we mention it in this document too? Note that I have no idea if modulemd behaves this way or not...
nphilipp commented on line 103 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
Agreed. I don't think this part means anything else than the files DNF is operating on, but it doesn't make it clear.
nphilipp commented on line 121 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
@jkaluza, I don't think there are any promises about preserving the structure of a modulemd document, neither with the current Python implementation nor with its slated C/glib/gobject based successor. I've talked to @psabata and @sgallagh and the main technical reason (leaving aside the discussion if it's necessary or not :stuck_out_tongue:) seems to be that the hash data structures available in glib/gobject don't have the functionality of an `OrderedDict` in Python. We've discussed that an acceptable place to keep things in order is the Python wrapper around the gobject-introspected interface, i.e. Python would load the YAML document in a way that preserves order (e.g. using the `pyaml` instead of the `PyYAML` module), parse things up, use `modulemd` to "do things" on the structure, and then coerce things back into the old structure before writing out to disk. Or something like that :smile:.
james commented on line 46 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
Why can't we default to whichever stream matches (if only one). I'm not sure we want to match to the default stream if another also matches, but I also don't understand why Igor doesn't want name:version matching anyway ... maybe we can talk about that tomorrow?
james commented on line 57 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
We did vote on this last week and non-same separator was a clear winner, no? Is this strawman to talk about tomorrow. Would be kind of useful to put some other text in here for what people voted on for comparison? For ref. the votes were (roughly): All dash: -4 All colons: +3 -3 '@' and dot (p.n@s.v.a/P): -2 Partial colon (p.n:s.v.a/P): -4 Current (n-s-v/P): +8 -2 PIP Style (N[:S[/V]][.A][+P]): +17 -0
james commented on line 80 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
I'd go the other way and say we should drop '.' and '_' from name/stream, unless we have known uses for either. Expanding "allowed" characters is easy, later.
james commented on line 28 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
Note that we should at least mention the stuff happening with platform/factory where we'll need to add a hash/checksum somewhere here. It's not 100% finalized yet, but still it will be confusing to talk about both things and not reference the other.
psabata commented on line 17 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
We've basically agreed to include a new field called "context", abbreviated with C. While we think this field shouldn't be displayed to the user in most situations, it might be specified in some other and will be necessary to truly uniquely identify a module build. Therefore I would like you to include it in this proposal. Its presence should about as rare as that of the base architecture.
psabata commented on line 28 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
Just for reference, James' comment here is about the context field noted above.
psabata commented on line 43 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
As noted in some other comments, some combinations of missing fields make more sense than others. In my opinion, the following should be considered as valid input. I'm omitting the profile since it should be possible to add profile to any and all of the combinations below. Examples provided in parens. I will use colons as a separator even though I don't agree with that choice -- and the examples below will just emphasize the weakness of this solution. - N (`httpd`) - NA (`httpd:i686`) - NS (`httpd:2.4`) - NSA (`httpd:2.4:i686`) - NSV (`httpd:2.4:20170808123000`) - NSVA (`httpd:2.4:20170808123000:i686`) - NSVC (`httpd:2.4:20170808123000:deadbeef`) - NSVCA (`httpd:2.4:20170808123000:deadbeef:i686`) Stream is effectively a part of the structured name. It doesn't make sense to specify version or context without it. Relying on default stream while being extremely specific at the same time just feels weird and an unlikely use case. Context serves as a unique flag (hash) identifying the compatibility context of the module's artifacts and most people shouldn't need it -- it is still useful for filing bug reports or cloning a system, however.
psabata commented on line 46 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
I don't understand what "defaults to enabled" means here. Defaulting to the one defined by system profiles (or other temporary configuration) if no stream is specified is reasonable and I think it's desired. See my comment above. Also, as noted above, since stream is effectively part of the (structured) name, specifying just NV doesn't make much sense to me. Again, reasons above.
psabata commented on line 52 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
Besides the configured default, this should default to `default` unless specified or configured otherwise. Furthermore if the profile doesn't exist, consider it being empty. Empty profiles are fine. I repeat -- empty profiles are fine.
psabata commented on line 57 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
I'm a proponent of the PIP style as well. While the DNF team owns this problem, I don't see a reason to completely ignore the popular vote, especially since the proposed solution here (just colons) relies on dark magic and heuristics inside DNF rather than being deterministic. More trouble would arise once somebody names their stream `x86_64` or something similar. You then need more policies to prevent that, in multiple places. Several tools that handle modules need to implement exactly the same logic as DNF... it just doesn't make sense to me. Please, seriously reconsider going with the PIP style, even though it might seem complex at first. Also realize the most common use case for the average Joe will be just the following four: N, N+P, N:S, N:S+P. The only thing that needs to be added to this option is the context hash value. Extending the proposal with "%" below ("%" is known as hash to certain programmers): `N [ :S [ /V [ %C ] ] ] [ .A ] [ +P ]` This proposal also clearly maps to my comment for line 43 above.
psabata commented on line 72 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
"." would need to be excluded if we go with the PIP style.
psabata commented on line 74 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
"." would need to be excluded if we go with the PIP-style.
psabata commented on line 78 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
Any reason to make this more restrictive than the rest? You could argue we don't have such symbols in architectures today. Well, we don't have architectures with dashes either.
psabata commented on line 80 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
Well, in this case neither "_" nor "." conflict with anything or any delimeter, even if we go with the PIP style. But "." in name and stream is problematic for the PIP-style, as I mentioned above.
psabata commented on line 83 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
This is in conflict with some of the definitions above that do not include uppercase letters (for some reason).
psabata commented on line 98 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
Once again, voting for the PIP-style here ;) I would extend this list to all the separators proposed -- :, /, %, . and +.
psabata commented on line 103 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
This cannot be a general rule for all the systems that handle modulemd. Besides there being no reason for that, it's not even possible to implement, as others have noted. If this is about how DNF stores module metadata on the system, I don't think it belongs in this document.
psabata commented on line 121 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
This is an interesting idea but I don't think it belongs in this document. This should be an RFE for modulemd.

So I added several comments, many of which I feel fairly strongly about.

Note the modulemd specification is going to be updated within the next couple of days. We will be adding the context hash, base architecture and EOL tags. This is all pretty simple but two of these have an immediate effect on this document. More in the comments.

And just to ensure it isn't lost, I'm proposing this extended PIP-style notation that doesn't require any heuristics and policies all over the place:

N [ :S [ /V [ %C ] ] ] [ .A ] [ +P ]

The only difference from the popular winner is the addition of the context hash.

otaylor commented on line 41 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
I'd like to propose a requirement: given a specification of a module, it must be possible to figure out what name/stream/version/etc. of the module has been specified based *solely* on on the text of the specification. [If you have to make make http requests to PDC to parse a name into fields, that's not acceptable.] The text here seems to contradict that, though it may simply be saying that it should be possible to "partially" specify a module.
psabata commented on line 41 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
Owen, I believe this is about selecting a module without specifying the entire identifier, just like `dnf install perl` and `dnf install perl-5.24.2-393.fc26.x86_64` install the same thing (assuming that NVRA is the latest, of course). The tooling should fill in the unspecified fields with defaults -- the default configured stream, the latest version, the matching context, the matching architecture and the default configured profile.
otaylor commented on line 17 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
In the context of naming, what worries me about the opaque hash for a context is that you can either a) not specify the context and assume that the backend processing will "do what I mean" or b) do a look of the available contexts and pick one for the name. In the context of an existing set of enabled modules the DWIM is reasonably obvious - "pick the context that is compatible with the set of modules that are already enabled" (and hope there is exactly one) - but say you are specifying a module to build a Dockerfile against and the module is available in multiple contexts. What's the intent? Is it the module with the "latest" version of the dependencies, sorting the streams in some fashion? This makes me wonder if it should be possible to specify context instead as (Platform=f27) or (Platform=f26,Python=3.7). This obviously takes the PIP naming scheme to new heights of complexity, but without it, I worry that different parts of our infrastructure will invent their own way of matching against available contexts.
psabata commented on line 17 of source/development/building-modules/naming-policy.rst.
2 months ago
(Hide)
One easy way to specify context in one invocation is to list multiple modules in that transaction. I will skip `platform` in the example below because it is always installed for you -- or you base your container on it. But let's say your application `foo` is built against `python:3.6` and `python:3.7`. If you asked dnf to install `foo` (with the default stream and such), it wouldn't know which `python` to choose. Now you have two main options. You either pick the default `python` stream selected by system configuration -- the system profiles. Or you explicitly ask the user to clarify what they want, saying that the input wasn't sufficient to select the software (some distros do this and it's not so bad). Or, to force it in one go, the user could run something like `dnf install @python:3.7 @foo`. In this case, the proper context of both modules would be selected -- the `python:3.7` in the latest version compatible with the `platform` you have, and `foo` in the default stream, the latest version and context compatible with your `platform` and `python:3.7`.

Dan, Adam and I discussed this in person today. We agreed Dan will be submitting an updated proposal with the string form using the following format:

N [ : S [ :V [ :C ] ] ] [ :A ] [ /P ]

To avoid ambiguity, specifying architecture with a partial form, one needs to use an IPv6-like syntax. Several examples of valid strings can be found below:

  • httpd
  • httpd:2.4
  • httpd:2.4::i686
  • httpd::ppc64le
  • httpd:2.4:20170809172200:deadbeef:s390x/minimal

The above implies selectors that specify only version or context (or both) but no stream are invalid. While it could be technically possible, just like it is in IPv6 addresses, it was deemed unnecessary and confusing. If there's a demand for that, we can add it later.

Edited 2 months ago by psabata

Thank you all for the feedback.
As Petr posted already, we met today to work on a new version
of this document and incorporate your thoughts.

First of all, we decided to avoid PIP notation,
because it's overly complex and requires users
to remember all separators.
Having one separator for fields and an additional
one for profile is more reasonable.
Keeping it simple is also a requirement from our management...

Changes from previous version:

Removed:
* file names (document scope)
* metadata ordering (document scope)

Added:
* C (Context) field - based on new modulemd specs

Changed:
* used IPv6 style syntax for ::A
* unified allowed characters across most fields

I'm hoping you'll find this version better,
but if you still have any suggestions, use either the comments in the "code"
or create one big comment for better readability of more complex suggestions.

2 months ago

rebased

psabata commented on line 75 of source/development/building-modules/naming-policy.rst.
a month ago
(Hide)
Maybe this could be expanded a little bit -- perhaps mention modules involved in the transaction (not yet installed) affect the choice too?

@psabata good point, I've updated the text.

a month ago

rebased

psabata commented on line 79 of source/development/building-modules/naming-policy.rst.
a month ago
(Hide)
...or `default` if no configuration is present. Also the configuration (system profiles) should be per module, I think.
psabata commented on line 82 of source/development/building-modules/naming-policy.rst.
a month ago
(Hide)
All the string fields should also allow `+`. We allow it in RPMs as well (`gtk+` for instance).
a month ago

rebased

rebased with all requested changes

unless there are any additional change requests,
can anyone merge this PR?

+1

Let's push this forward.