#525 Enforce unique deliverable identifiers
Closed: Fixed 3 years ago. Opened 5 years ago by adamwill.

So I've been using this as a formula to 'uniquely' identify any given image within a compose for now:

subvariant + imagetype + arch

However, this is just something I came up with, and nothing currently stops it from ceasing to be a unique identifier; it's happened at least once, with the atomic installer images, and I had to file all those issues to get their imagetype changed.

It occurs to me that we could, if desired, prevent this scenario from happening again by deciding on a canonical 'unique image identifier' formula and simply enforcing it in the code. As one of the checks during the compose process, pungi could work out whether any images in the compose would be 'dupes', and simply fail the compose if so.
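To make the idea concrete, here is a minimal sketch of such a check. It is not pungi's actual code: it assumes each image is a plain dict with `subvariant`, `type`, `arch` and `path` keys, and the function names `image_key`, `find_duplicates` and `check_unique` are made up for illustration.

```python
from collections import defaultdict


def image_key(image):
    # Hypothetical identifier: subvariant + imagetype + arch.
    return (image["subvariant"], image["type"], image["arch"])


def find_duplicates(images):
    # Group image paths by identifier and keep only the groups
    # that contain more than one image.
    seen = defaultdict(list)
    for image in images:
        seen[image_key(image)].append(image["path"])
    return {key: paths for key, paths in seen.items() if len(paths) > 1}


def check_unique(images):
    # Fail loudly (as a compose check might) if any identifier repeats.
    dupes = find_duplicates(images)
    if dupes:
        raise ValueError("duplicate image identifiers: %r" % sorted(dupes))
```

A compose step could call `check_unique` on the full image list and treat the `ValueError` as a fatal error.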

WDYT? I think this is a need that tends to keep cropping up - the ability to define "the same" image across multiple composes - and it'd be good to have a convention for it, and pungi (or I guess possibly productmd) seems like a logical place to do this.

Rather than the identifier I've been using so far, we could use:

subvariant + imagetype + format + arch

as I think it's a bit stronger. If anyone can think of another field that would make sense in there, we can consider it. I don't think variant works, because sometimes (at least for Fedora) variant and subvariant are identical, and we tend to just roll the variant name into the subvariant when they differ (e.g. variant 'Cloud', subvariant 'Cloud_Base' - not variant 'Cloud', subvariant 'Base').

Oh yes, I fully agree that the images metadata needs some improvement. The uniqueness should be enforced in productmd in my opinion.

At least disc_number should be added to the identifier. For most cases it will be 1, but Pungi can generate ISO files with repositories and split them to fit on the actual media. Fedora does not use this feature currently (and I don't think there is any plan to start with it).
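As a rough sketch of what the extended identifier could look like with `format` and `disc_number` included: this assumes images are plain dicts, treats a missing `disc_number` as 1, and uses a made-up function name `image_identifier`. It is an illustration only, not the productmd implementation.

```python
def image_identifier(image):
    # Hypothetical formula: subvariant + imagetype + format + arch + disc_number.
    # disc_number falls back to 1, the value expected for most images.
    return (
        image["subvariant"],
        image["type"],
        image["format"],
        image["arch"],
        image.get("disc_number", 1),
    )


# Illustrative data only: two discs of a split ISO set.
disc1 = {"subvariant": "Everything", "type": "dvd", "format": "iso",
         "arch": "x86_64", "disc_number": 1}
disc2 = dict(disc1, disc_number=2)

# Without disc_number the two discs would collide; with it they do not.
print(image_identifier(disc1) != image_identifier(disc2))
```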

The default for subvariant is to copy variant.

Ultimately I think we should make sure the type is descriptive enough to identify what the content in the image is. There are a few upstream issues about this already.

disc_number is a good point (and since I'm kinda anal and care about old releases, it is relevant to Fedora, as we have old releases with multiple discs like that).

I guess you don't remember, but we use subvariant in Fedora extensively to differentiate between images, it's probably the most important single property - because we have for instance the Spins variant which creates a whole bunch of different live images, so we use subvariant to distinguish between them. If subvariant for RHEL stuff is usually a copy of variant I think that should work fine; the only problematic case we have is if someone has a config where the info from subvariant alone is not sufficient and variant must be included too. It's not the end of the world even then, we'd just have some ugly duplication in the Fedora identifiers, but we could live with that.

BTW, I'm in Brno today and tomorrow, so we can meet up if you want to chat and see some of our use cases :) I'm on floor 4 of Brno 1 with the rest of Fedora QA.

That was exactly the point I was trying to make; that subvariant is sufficient on its own. Either people don't use it and it gets a reasonable default, or they use it and should make sure it's sensible.

Sure, sounds good. I was just worried there might be existing cases that didn't meet expectations.

I sent a PR to add some bits for this to productmd: https://github.com/release-engineering/productmd/pull/76 . Let me know what you think. BTW, one thing I noticed is that you really can't easily use variant in an image identifier instead of subvariant; I was going to have pre-1.1 metadata fall back to using variant, but because the image itself has no idea what its variant is, this doesn't really fit in the model. So if you're not using subvariant, you effectively only really get format, type, arch and disc number, which isn't a lot of use. I can't see any good way around this, though. FWIW, in fedfind, one of the things I do when 'enhancing' the image dicts is stuff the variant directly into the dict for convenience.
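For illustration, stuffing the variant into each image dict can be sketched roughly like this. It assumes the images are available as a nested `{variant: {arch: [image dicts]}}` mapping; the function name `flatten_images` is made up, and this is not fedfind's actual code.

```python
def flatten_images(images_by_variant):
    # Walk the assumed {variant: {arch: [image, ...]}} mapping and return a
    # flat list of copies, each carrying its variant directly in the dict.
    flat = []
    for variant, arches in images_by_variant.items():
        for arch, images in arches.items():
            for image in images:
                enhanced = dict(image)  # copy, so the input is not modified
                enhanced["variant"] = variant
                enhanced.setdefault("arch", arch)
                flat.append(enhanced)
    return flat
```

With the variant available on each dict, an identifier could include it where subvariant alone is not enough.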

The productmd changes were merged and have been in use for a long time. For cleanup, I'm going to close this issue.

Metadata Update from @lsedlar:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago
