We've investigated the systemd Delegate=true API issue before. runc was fixed upstream, but that fix only made it into OpenShift Origin 3.11 (kube 1.11). Keep in mind that runc is vendored into openshift/kube, so the version of the runc RPM on the system doesn't matter.
The API was brought back in F28 for us while we waited on upstreams. Now that we are moving to Fedora 29, anyone on OpenShift Origin < 3.11 who wants to upgrade/rebase to F29 will fail.
We should recommend that users on clusters below Origin 3.11 stay on F28 until they upgrade Origin.
Note that attempting to force cgroups v1 exclusively by setting systemd.unified_cgroup_hierarchy=0 and systemd.legacy_systemd_cgroup_controller=1 (see the man page) has no effect here, because the Delegate=true API itself was withdrawn.
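For reference, on an Atomic Host those kernel arguments would normally be added with `rpm-ostree kargs` (a sketch of the standard tooling; as the comment above notes, this does not work around the Delegate issue):

```
# Append the cgroups-v1 arguments to the kernel command line (takes effect on reboot).
# Note: per the comment above, this does NOT restore the withdrawn Delegate=true API.
rpm-ostree kargs --append=systemd.unified_cgroup_hierarchy=0 \
                 --append=systemd.legacy_systemd_cgroup_controller=1
systemctl reboot
```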
Thanks @sjenning for all the help and for providing relevant information.
In kubernetes it looks like this was applied in 5b5cd6f for the 1.10 branch and 54d9382 for the 1.11 branch.
```
[dustymabe@media]$ git tag --contains 5b5cd6f
v1.10.4
v1.10.5
v1.10.5-beta.0
v1.10.6
v1.10.6-beta.0
v1.10.7
v1.10.7-beta.0
v1.10.8
v1.10.8-beta.0
v1.10.9-beta.0
[dustymabe@media]$ git tag --contains 54d9382
v1.11.0
v1.11.0-alpha.2
v1.11.0-beta.0
v1.11.0-beta.1
v1.11.0-beta.2
v1.11.0-rc.1
v1.11.0-rc.2
v1.11.0-rc.3
v1.11.1
v1.11.1-beta.0
v1.11.2
v1.11.2-beta.0
v1.11.3
v1.11.3-beta.0
v1.11.4-beta.0
v1.12.0
v1.12.0-alpha.0
v1.12.0-alpha.1
v1.12.0-beta.0
v1.12.0-beta.1
v1.12.0-beta.2
v1.12.0-rc.1
v1.12.0-rc.2
v1.12.1
v1.12.1-beta.0
v1.12.2-beta.0
v1.13.0-alpha.0
```
So for kubernetes anything v1.10.4 and above should be ok.
On a Fedora Atomic Host 29 system, I used the workaround in https://pagure.io/atomic-wg/issue/452#comment-505248 to get an `oc cluster up` cluster working. I don't know what all the implications of changing the cgroup driver from systemd to cgroupfs are. The safest bet for people with clusters below 3.11 would be to stay on F28 until they upgrade.
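For context, that workaround amounts to switching docker's cgroup driver from systemd to cgroupfs. On Fedora AH this is typically done in /etc/sysconfig/docker; the OPTIONS line below is only a sketch (your existing options may differ), and the linked comment has the authoritative steps:

```
# /etc/sysconfig/docker (sketch; see the linked comment for the exact change)
# Add the cgroupfs exec-opt to whatever OPTIONS you already have, then:
#   systemctl restart docker
OPTIONS='--selinux-enabled --log-driver=journald --exec-opt native.cgroupdriver=cgroupfs'
```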
To test, I created a 3-node cluster with Origin 3.9 on Fedora 28 AH, applied the cgroup-driver changes mentioned in this issue, and then rebased to Fedora 29 AH. After the upgrade and reboot, all nodes were in the NotReady state.
With a fresh F29 AH install, the workaround shouldn't be required for running `oc cluster up`, because we ship origin-3.11.0-1.fc29 with the required fix.
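A quick way to confirm a host already carries the fixed build (a sketch, assuming the origin package is installed via RPM):

```
rpm -q origin
# want origin-3.11.0-1.fc29 or newer
```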
@jasonbrooks Did you try the above when upgrading from F28 AH (with `oc cluster up`) to F29 AH?
@sinnykumari
There's one more bit to include for an ansible-based origin install:
```
vi /etc/origin/node/node-config.yaml

kubeletArguments:
  ...
  cgroup-driver:
  - "cgroupfs"
```
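A quick sanity check (not from the original thread) that the driver is set as intended — a sample fragment is written to /tmp here for illustration; on a real node you would grep /etc/origin/node/node-config.yaml instead:

```shell
# Sketch: verify the cgroup-driver setting in a node-config.yaml.
cat > /tmp/node-config-sample.yaml <<'EOF'
kubeletArguments:
  cgroup-driver:
  - "cgroupfs"
EOF

# Print the configured driver; expect "cgroupfs"
grep -A1 'cgroup-driver' /tmp/node-config-sample.yaml | tr -d ' "-' | tail -n1
```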
I installed a three-node cluster on FAH 28, changed the cgroup driver in docker and in node-config.yaml, rebased to FAH 29, and once the hosts came up, all three nodes were in the Ready state.
Thanks @jasonbrooks! With this additional change, Origin 3.9 cluster nodes look good after upgrading from F28 AH to F29 AH.
Metadata Update from @sinnykumari:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)