Issue #6545: Atomic Host release process is not based on primary Atomic Host technology - releng

Closed: Fixed 7 years ago Opened 7 years ago by walters.

This is partially a migration of:

https://fedorahosted.org/rel-eng/ticket/6313

Now that we're in Fedora 25, all of the docs/media talk about Atomic Host releases in terms of media/images (ISO, qcow2, EC2 AMI etc). But that's not the primary release mechanism of Atomic Host. The primary mechanism is OSTree (as consumed by rpm-ostree).

I would expect that at a minimum, the release announcement contains commit hashes for OSTree, etc.

And all of the discussion about cadence of the tree commits inherits from the previous trac issue.

pbrobinson commented 7 years ago

So, without reading through (and trac won't be there forever), can we have an explicit summary of what actually needs to be done for this ticket, to ensure details aren't lost?

Release announcements aren't part of the rel-eng process (although we do contribute to them). So "docs/media talk about Atomic Host releases" and "release announcement contains commit hashes for OSTree" should probably be brought up elsewhere, and "etc" is open to interpretation. A bullet list of explicit requirements would be more useful than hand-wavy statements.

walters commented 7 years ago

Look at what we've been doing in CentOS Devel:

https://wiki.centos.org/SpecialInterestGroup/Atomic/Devel

The media/images have the same version number as the ostree commit - they're derived from it. The alpha branch has a slow cadence, and hence has useful static deltas.

Baseline requirement suggestions:

The ostree commits should have slower cadence
Version numbers of the pungi run match the ostree commits

Next steps:

Static deltas are generated

Edited 7 years ago by walters

walters commented 7 years ago

Here's one example:

https://pagure.io/fedora-atomic/pull-request/30

If I commit this, as far as I know, it ships via ostree tomorrow, along with whatever came from Bodhi.

Whereas I think what we more want is (again following from the above ticket) a development ref that iterates as fast as possible (and that we have more control over), and changes like the language support would be batched, so we have interested testers try it, then at release we can write a useful changelog about the things that landed, etc.

dustymabe commented 7 years ago

I got together with Colin (some time ago) and Patrick (recently)
to discuss an implementation for the first bulleted ask from the
comment above:
"ostree commits should have slower cadence". This proposal is for that work
item.

Currently what we have is ostree composes that run as part of (or
immediately after) bodhi runs that push out new updated rpms into the
updates or updates-testing yum repos in Fedora. As part of this a new
ostree commit is created with the new content and the
fedora-atomic/25/x86_64/docker-host ref within the ostree repo gets
updated.

This fedora-atomic/25/x86_64/docker-host ref is the one that our users
running atomic host are following. It means that when they run
rpm-ostree upgrade they are getting the latest commit from the last
bodhi run, not the commit from the last two week release.

We'd like to change this so that users only get new commits ~every
two weeks (when we do a release). We can achieve this by making a
few changes:

- change bodhi ostree composes to update a different "ref"
  - we are proposing this ref should be called
    "fedora-atomic/25/x86_64/updates/docker-host"
    since it tracks the updates yum repo
  - alternatively we already have
    fedora-atomic/25/x86_64/testing/docker-host
    which tracks the updates-testing yum repo.
    In the future we will change this name to
    "updates-testing" vs just "testing"
- updating the two week release process to update the
  fedora-atomic/25/x86_64/docker-host ref
  - This means that ref will only get updated when we do a release.
- building the iso/cloud images from the "updates" ref
  but pointing them to the 2wk release ref
  - This will mean we can still get new images every night to test
    but when we release one of these images it tracks the 2wk ref
    by default.
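
The ref layout proposed above can be sketched as a toy model. This is only an illustration, assuming a repo can be treated as a mapping from ref name to commit ID; the `RefStore` class, its method names, and the commit IDs are hypothetical and are not part of ostree or of the releng tooling.

```python
# Toy model of the proposed ref layout; not real ostree or releng code.
STABLE_REF = "fedora-atomic/25/x86_64/docker-host"
UPDATES_REF = "fedora-atomic/25/x86_64/updates/docker-host"


class RefStore:
    """Hypothetical stand-in for an ostree repo: maps ref name -> commit ID."""

    def __init__(self):
        self.refs = {}

    def bodhi_compose(self, commit_id):
        # Each bodhi run now moves only the updates ref.
        self.refs[UPDATES_REF] = commit_id

    def two_week_release(self):
        # The release process is the only thing that moves the stable ref.
        self.refs[STABLE_REF] = self.refs[UPDATES_REF]


repo = RefStore()
repo.bodhi_compose("commit-a")
repo.two_week_release()          # users on the stable ref now see commit-a
repo.bodhi_compose("commit-b")   # later composes leave the stable ref alone
repo.bodhi_compose("commit-c")

print(repo.refs[STABLE_REF])   # commit-a
print(repo.refs[UPDATES_REF])  # commit-c
```

Users following the stable ref keep seeing `commit-a` until the next release, while testers following the updates ref see every compose.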

dustymabe commented 7 years ago

Patrick and I will plan on making these changes later this week unless
someone objects before that time.

walters commented 7 years ago

Sounds good to me.

dustymabe commented 7 years ago

Patrick will be making the changes detailed in https://pagure.io/releng/issue/6545#comment-67347 today.

jlebon commented 7 years ago

updating the two week release process to update the fedora-atomic/25/x86_64/docker-host ref

How will versioning work for this? AFAIU right now the nightly compose just uses the auto-increment feature. Will the stable releases jump numbers (i.e. +14 on every upgrade)? Though it's probably too late for F25, it might be better to use timestamp based versions, e.g. 25.20170207.
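
To make the two versioning schemes concrete, here is a minimal sketch. The function names and the starting counter value of 40 are made up for illustration; the only detail taken from this thread is the `25.20170207` format.

```python
from datetime import date


def timestamp_version(release, day):
    """Build a version like '25.20170207' from a release number and a date."""
    return f"{release}.{day:%Y%m%d}"


def next_stable_auto_increment(previous_stable, nightly_composes):
    """With auto-increment, every nightly compose bumps the counter by one,
    so a stable release inherits the jump accumulated since the last one."""
    return previous_stable + nightly_composes


print(timestamp_version(25, date(2017, 2, 7)))  # 25.20170207
print(next_stable_auto_increment(40, 14))       # 54
```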

building the iso/cloud images from the "updates" ref but pointing them to the 2wk release ref

This will mean we can still get new images every night to test but when we release one of these images it tracks the 2wk ref by default.

I had a chat about this yesterday with @dustymabe and @puiterwijk. One of the proposed plans of action was to just use ostree admin set-origin to make sure the nightlies always point to the stable repo.

There's a tricky bit here though. We want the new 2wk commits to be children of the previous commits. But the commits from the nightlies will be on a different branch. When you e.g. do ostree commit -b $2wk_ref --tree=ref=$nightly_ref to promote content between branches, you will get a different SHA. So the commits on which the nightly images are built will never actually belong on the 2wk branch (and e.g. ostree log will be going up the wrong ancestry).

It would be better to do this in separate steps, i.e. promote content to the 2wk branch, and then create an image based on that ref. No set-origin needed. Alternatively, you can also make sure that the nightly composes always commit on top of the latest 2wk commit, rather than the previous nightly commit.
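
Why the SHA changes on promotion can be shown with a toy hash. This is a simplified sketch, assuming only that the parent is part of what gets hashed into a commit ID; real ostree commit objects contain more fields, and `commit_id` here is a hypothetical helper, not an ostree API.

```python
import hashlib


def commit_id(parent, tree, timestamp):
    """Toy commit checksum: SHA-256 over parent, tree content ID, and timestamp.

    Real ostree commit objects hold more fields; this only illustrates
    that the parent is part of what gets hashed.
    """
    data = f"{parent}\n{tree}\n{timestamp}".encode()
    return hashlib.sha256(data).hexdigest()


tree = "tree-with-tonights-content"  # same content on both branches
timestamp = 1486425600               # held fixed, so only the parent varies

nightly = commit_id(parent="previous-nightly", tree=tree, timestamp=timestamp)
promoted = commit_id(parent="previous-2wk-release", tree=tree, timestamp=timestamp)

print(nightly != promoted)  # True
```

The content is identical, but an image built from the nightly commit carries an ID that never appears in the 2wk branch's ancestry.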

dustymabe commented 7 years ago

How will versioning work for this? AFAIU right now the nightly compose just uses the auto-increment feature. Will the stable releases jump numbers (i.e. +14 on every upgrade)? Though it's probably too late for F25, it might be better to use timestamp based versions, e.g. 25.20170207.

Right. I believe the +14 bump in version is fine for now. I do like the timestamp-based versioning; maybe we can do that for F26.

I had a chat about this yesterday with @dustymabe and @puiterwijk. One of the proposed plans of action was to just use ostree admin set-origin to make sure the nightlies always point to the stable repo.
There's a tricky bit here though. We want the new 2wk commits to be children of the previous commits. But the commits from the nightlies will be on a different branch. When you e.g. do ostree commit -b $2wk_ref --tree=ref=$nightly_ref to promote content between branches, you will get a different SHA. So the commits on which the nightly images are built will never actually belong on the 2wk branch (and e.g. ostree log will be going up the wrong ancestry).

Actually, the proposal was just to update the "ref" file and not run ostree commit. I didn't really know about the "summary" file before yesterday, so maybe with this approach we would need to re-run ostree summary -u?

It would be better to do this in separate steps, i.e. promote content to the 2wk branch, and then create an image based on that ref. No set-origin needed. Alternatively, you can also make sure that the nightly composes always commit on top of the latest 2wk commit, rather than the previous nightly commit.

walters commented 7 years ago

See https://github.com/cgwalters/ostree-releng-scripts/blob/master/do-release-tags for the tool I wrote for CAHC. I think it would likely make sense to reuse here - but I didn't suggest it originally because simply doing anything at all that approximates slowing the cadence would be a massive improvement.

As far as the images/media; yeah, I think there'd also be one media/images generated for each stream; however they would both point at stable for updates.

walters commented 7 years ago

I'd probably say the lowest risk thing here is to stand up a separate repository and start writing the integration/code that operates on it, e.g. if we choose to use do-release-tags (and I think we should).
One thing that's really unclear to me is - what process would this be? It's not pungi or fedmsg-atomic-composer, is it? Conceptually it's like a new jenkins job, but we don't really have such a concept AFAIK.

dustymabe commented 7 years ago

Had a long chat with Jonathan. There are a lot of things that can be improved upon, but the current design as detailed in https://pagure.io/releng/issue/6545#comment-67347 is mainly for getting a lot of the benefits without changing releng's existing flow too much (partly because we are in the middle of F25).

We are marching ahead with https://pagure.io/releng/issue/6545#comment-67347 for now and we can make more incremental improvements after that and/or larger improvements for F26.

walters commented 7 years ago

I'd say it's worth at least evaluating the do-release-tags approach - understand the code and what it does, and how it differs from the ref retagging model.

walters commented 7 years ago

Basically, do-release-tags will give us different commit history. The main goal of that is avoiding the noise from the devel stream. Most production users won't care about all of the intermediate development commits we make.

As we move to speed up development (i.e. more than once a day, like we do in CAHC), the "retagging" approach means the commit history on the release branch will get noisier. Whereas with do-release-tags, we don't have that problem.

(However, the flip side is that the commit hashes are different, and while one can look up one from the other, it's something that one has to understand.)
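
The difference in history can be sketched with a toy parent-link walk. This assumes that, in the separate-commit model, each release commit is chained to the previous release commit; it is an illustration of the idea only, not a description of what do-release-tags does internally, and all commit names are made up.

```python
def history(commits, head):
    """Walk parent links from head back to the root, newest first."""
    log = []
    while head is not None:
        log.append(head)
        head = commits[head]
    return log


# Retagging: the release ref points straight at devel commits,
# so its history includes every intermediate devel commit.
devel = {"d1": None, "d2": "d1", "d3": "d2", "d4": "d3"}
retag_history = history(devel, "d4")

# Separate release commits: each one carries the content of a devel commit
# but has the previous release commit as its parent.
release = {"r1": None, "r2": "r1"}  # r1 ~ content of d1, r2 ~ content of d4
release_history = history(release, "r2")

print(retag_history)    # ['d4', 'd3', 'd2', 'd1']
print(release_history)  # ['r2', 'r1']
```

The trade-off noted above shows up here too: `r2` and `d4` would hold the same content under different IDs.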

dustymabe commented 7 years ago

Basically, do-release-tags will give us different commit history. The main goal of that is avoiding the noise from the devel stream. Most production users won't care about all of the intermediate development commits we make.
As we move to speed up development (i.e. more than once a day, like we do in CAHC), the "retagging" approach means the commit history on the release branch will get noisier. Whereas with do-release-tags, we don't have that problem.

I agree we don't want noisy history long term, so maybe we can work on this for the next release (F26). I will point out that until we make it easier for our users to browse the history of a repo, it doesn't really matter that it is noisy.

Edited 7 years ago by dustymabe

walters commented 7 years ago

First, for browsing I think the biggest win would be a website to visualize the repo history. I started writing one once. I may look at that.

Clean repo history also helps the deploy command be both more efficient and more useful.

Anyways, I'm not opposed to retagging. Do we have the ability to have different processes for 25/26 today?

dustymabe commented 7 years ago

Anyways, I'm not opposed to retagging. Do we have the ability to have different processes for 25/26 today?

I'll be investigating f26 before too long so I'll be able to get back to you with an answer for this.

walters commented 7 years ago

Oh, now I remember the main reason I switched CAHC over to do-release-tags - it was making static delta management easier. We want static deltas from the last release commit to the current one. If one does a "retag", the information "what was the previous release commit" isn't stored in the repo. Which means a command like ostree static-delta generate fedora-atomic/25/x86_64/docker-host will do the wrong thing.

Now obviously, we could keep this information out-of-band (JSON file, database, or a fedora-atomic/25/x86_64/previous-stable/docker-host ref), and explicitly do ostree static-delta generate --from PREVIOUS --to fedora-atomic/25/x86_64/docker-host, but it's messier. And static deltas won't be the only tool that wants to know what the previous stable commit was. Hence I'd still push for do-release-tags.
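
For comparison, the out-of-band variant might look roughly like this. It is a sketch under assumptions: the state layout, `record_release`, and `static_delta_command` are hypothetical, the commit IDs are placeholders, and the command is only built as an argument list (using the `--from`/`--to` form quoted above), never executed.

```python
import json

STABLE_REF = "fedora-atomic/25/x86_64/docker-host"


def record_release(state, new_commit):
    """Return updated state, remembering the commit that was stable before."""
    return {"previous": state.get("current"), "current": new_commit}


def static_delta_command(state):
    """Build (but do not run) the delta command between the last two releases."""
    return [
        "ostree", "static-delta", "generate",
        "--from", state["previous"],
        "--to", STABLE_REF,
    ]


state = {}
state = record_release(state, "commit-release-1")
state = record_release(state, "commit-release-2")

# The state would be persisted out-of-band, e.g. as a JSON file.
print(json.dumps(state, sort_keys=True))
print(" ".join(static_delta_command(state)))
```

Every other tool that needs the previous stable commit would have to read the same side file, which is the messiness being described.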

dustymabe commented 7 years ago

A few open pull requests for the work related to the changes we have discussed:

https://pagure.io/fedora-kickstarts/pull-request/130
https://pagure.io/pungi-fedora/pull-request/129
https://pagure.io/fedora-lorax-templates/pull-request/10

puiterwijk commented 7 years ago

Regarding do-release-tags: I will update the 2wk atomic scripts to start using that for promotion etc.

puiterwijk commented 7 years ago

Actually, Dusty reminded me that that is going to be tricky because we already have images with the original to-promote commit ID. I'll leave that discussion up to Colin and Dusty.

walters commented 7 years ago

Yeah, having to regenerate images just to get a new commit ID is a bit annoying. Down the line, I think we should go to automatic in place updates. Once we do that across the board, the cloud images are more of an asynchronous starting point - in other words, we don't necessarily respin the AMIs/qcow2 etc. for each release.

This becomes more of a polar opposite from the current "image-focused, ostree is a lookaside" approach. Now, I'm well aware that there is a large contingent of public cloud users who basically just want updated AMIs (and/or respin their own), and don't care about in place updates (whether ostree or not).

Going down a rabbit hole a bit, one thing we could conceive of is moving away from the ostree commit history. This would get more natural if we used Docker/OCI for transport. Really, the only thing causing the commit IDs to differ is the parent history (and timestamp, but we control that). There'd be a lot of tradeoffs in this...but it's a conceptual question as to how useful the ostree history really is.

dustymabe commented 7 years ago

The final piece of this puzzle is in place: https://pagure.io/releng/pull-request/6627

We'll try to release a new 2WeekAtomic release tomorrow with the new workflow in place.

dustymabe commented 7 years ago

OK the two week release we did last week has implemented this as detailed in this comment. Note that this is only half of this ticket as originally written. We still need to release images that somehow reflect the version of the OSTree that is baked in. I have opened a new ticket against the atomic-wg to figure out what we want this to look like and then I'll come back to releng with a request for collaboration on the possible solution. I'll open a new ticket for that work when we are ready. Closing this ticket for now.

Edited 7 years ago by dustymabe

puiterwijk commented 7 years ago

dustymabe | puiterwijk: can you close this ticket and mark it as fixed.

Absolutely.
So, as of now, we consider this ticket closed.

Metadata Update from @puiterwijk:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

7 years ago
