#920 outage : openshift CI
Closed: Fixed 6 months ago by arrfab. Opened 6 months ago by arrfab.

The openshift/ocp CI cluster (https://console-openshift-console.apps.ocp.ci.centos.org) is in a non-functional state (see https://lists.centos.org/pipermail/ci-users/2022-September/004610.html).
CI tenants confirmed that it's stuck for them too: BuildConfigs/ImageStreams are broken at the moment, and the cluster is stuck in updating mode.

Metadata Update from @arrfab:
- Issue tagged with: centos-ci-infra, high-gain, high-trouble

6 months ago

Metadata Update from @arrfab:
- Issue assigned to dkirwan

6 months ago

Thanks to @dkirwan the cluster is now back online, and we also identified an HDD issue on one worker node.
I'll spin up another node to join that cluster to spread the load.
We also identified another issue wrt account/subscription, but that will be handled next week on Monday (the cluster is now working, so let's first resume CI workloads).

Added back a freshly reinstalled worker node :

dumpty-n5.ci.centos.org    Ready    worker   37s     v1.23.5+012e945
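
For anyone wanting to script a check on a row like the one above, here is a minimal sketch. It assumes the default five-column layout of `oc get nodes` (NAME, STATUS, ROLES, AGE, VERSION); the names `NodeRow`, `parse_node_line` and `is_ready_worker` are hypothetical helpers for illustration, not part of any tooling used on this cluster.

```python
from typing import NamedTuple


class NodeRow(NamedTuple):
    """One data row of `oc get nodes` output (default columns assumed)."""
    name: str
    status: str
    roles: str
    age: str
    version: str


def parse_node_line(line: str) -> NodeRow:
    """Split a whitespace-separated node row into its five columns."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}: {line!r}")
    return NodeRow(*fields)


def is_ready_worker(row: NodeRow) -> bool:
    """True only for a schedulable, Ready node that carries the worker role.

    A cordoned node reports "Ready,SchedulingDisabled" and is deliberately
    treated as not available here; ROLES may list several comma-separated roles.
    """
    return row.status == "Ready" and "worker" in row.roles.split(",")


row = parse_node_line(
    "dumpty-n5.ci.centos.org    Ready    worker   37s     v1.23.5+012e945"
)
print(row.name, row.version, is_ready_worker(row))
```

Requiring an exact `Ready` status is a design choice for this sketch: it keeps cordoned or `NotReady` nodes out of the "can take CI workloads" count.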

current status :

  • cluster ownership properly transferred to different account (easier for subscriptions)
  • cluster updated with new pull-secret to version 4.10.33
  • node kempty-n9.ci.centos.org freshly reinstalled and added back as worker node

Metadata Update from @arrfab:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

6 months ago

