#10684 Planned maintenance - coreos-cincinnati migration to OCP4 cluster - 2022-05-13 09:00 UTC
Closed: Fixed 2 years ago by lucab. Opened 2 years ago by lucab.

As a sub-task of https://pagure.io/fedora-infrastructure/issue/10631, we are now ready to migrate the coreos-cincinnati services to the new OCP4 cluster.

Maintenance will start at 2022-05-13 09:00 UTC and will last approximately two hours.
@mobrien and @lucab will be overseeing the migration.


The new service is already deployed on the OCP4 cluster: https://console-openshift-console.apps.ocp.fedoraproject.org/k8s/cluster/projects/coreos-cincinnati

There are 4 routes (2 service endpoints and 2 status endpoints) that need to be switched from the old cluster to the new one.
This involves at least getting TLS certificates in place and updating DNS entries to steer traffic to the new cluster, and possibly a few more small papercuts along the way.
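
For the DNS and TLS part, a quick way to see where a hostname currently points and which certificate it serves could look like the sketch below. It assumes `dig` and `openssl` are available on the machine running it; the function names `show_dns` and `show_cert` are only illustrative and are not part of any existing tooling for this migration.

```shell
#!/bin/sh
# Sketch only: inspect DNS and the served TLS certificate for each hostname
# given on the command line. Assumes `dig` and `openssl` are installed.

# Print the addresses (or CNAME chain) the name resolves to right now.
show_dns() {
    dig +short "$1"
}

# Print the subject and expiry date of the certificate served on port 443.
# SNI is set explicitly, since several routes may share one router address.
show_cert() {
    openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null |
        openssl x509 -noout -subject -enddate
}

# Does nothing when called without arguments.
for host in "$@"; do
    echo "== $host"
    show_dns "$host"
    show_cert "$host"
done
```

Running it before and after the switch, with the route hostnames as arguments, should show the resolved addresses change and confirm that the new cluster serves a valid certificate for each name.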

We can start tackling the status endpoints first, to minimize the amount of user-facing downtime in case of unexpected issues.
Those are:
* status.raw-updates.coreos.fedoraproject.org
* status.updates.coreos.fedoraproject.org

Sanity checks for those are:
* curl https://status.raw-updates.coreos.fedoraproject.org/metrics
* curl https://status.updates.coreos.fedoraproject.org/metrics
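
The two checks above can be wrapped in a small script, so that they are easy to re-run during the maintenance window. This is a minimal sketch: the helper names (`metrics_url`, `check_metrics`, `main`) and the 10 second timeout are assumptions, and the script only checks that `/metrics` answers with a successful HTTP status.

```shell
#!/bin/sh
# Sketch only: check that the /metrics page of each given host answers
# with a successful HTTP status. Helper names and timeout are assumptions.

# Build the metrics URL for a status hostname.
metrics_url() {
    printf 'https://%s/metrics\n' "$1"
}

# Fetch the metrics page, discard the body, and report OK or FAIL.
check_metrics() {
    url=$(metrics_url "$1")
    if curl --fail --silent --show-error --max-time 10 --output /dev/null "$url"; then
        echo "OK   $url"
    else
        echo "FAIL $url"
        return 1
    fi
}

# Check every host given as an argument; return non-zero if any check fails.
main() {
    rc=0
    for host in "$@"; do
        check_metrics "$host" || rc=1
    done
    return "$rc"
}

main "$@"
```

It would be called with the two status hostnames as arguments, and does nothing when called without any.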

After that we should be able to migrate the 2 remaining service endpoints with confidence, and without user-facing downtime.
Those are:
* raw-updates.coreos.fedoraproject.org
* updates.coreos.fedoraproject.org

Sanity checks for those are:
* curl -H 'Accept: application/json' 'https://raw-updates.coreos.fedoraproject.org/v1/graph?basearch=x86_64&stream=stable'
* curl -H 'Accept: application/json' 'https://updates.coreos.fedoraproject.org/v1/graph?basearch=x86_64&stream=stable&rollout_wariness=0'
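
To go one step beyond eyeballing the output, the graph checks can also verify that the response is well-formed JSON. The sketch below assumes `jq` is installed and that the graph document carries a top-level `nodes` array, which is an assumption about the response shape and not something stated in this ticket. The helper names `graph_url` and `check_graph` are illustrative.

```shell
#!/bin/sh
# Sketch only: fetch an update graph and check that it parses as JSON
# with at least one entry under "nodes" (assumed response shape).
# Requires curl and jq.

# graph_url HOST QUERY: build the graph URL for a service hostname.
graph_url() {
    printf 'https://%s/v1/graph?%s\n' "$1" "$2"
}

# check_graph HOST QUERY: succeed only if the graph has a non-empty "nodes" list.
# If curl fails, jq sees no input and also exits non-zero.
check_graph() {
    url=$(graph_url "$1" "$2")
    curl --fail --silent --show-error --max-time 10 \
         --header 'Accept: application/json' "$url" |
        jq --exit-status '.nodes | length > 0' >/dev/null
}

# Usage: sh check-graph.sh HOST QUERY
if [ "$#" -eq 2 ]; then
    if check_graph "$1" "$2"; then
        echo "OK   $1"
    else
        echo "FAIL $1"
    fi
fi
```

For example, passing `updates.coreos.fedoraproject.org` and `'basearch=x86_64&stream=stable&rollout_wariness=0'` as the two arguments mirrors the second check above.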

Metadata Update from @phsmoura:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: low-trouble, medium-gain, ops, outage

2 years ago

Migration complete, all services up, thanks @mobrien for driving this!

As an extra goody, we also made sure that the staging-OCP4 services are in place and working.

Metadata Update from @lucab:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

2 years ago

