Issue #118: Can't rollout existing deploymentconfigs, can't start new builds - centos-infra

Closed: Fixed 3 years ago by jlebon. Opened 3 years ago by jlebon.

When I try to start a new build, it just stays stuck like this:

Cloning "https://github.com/coreos/fedora-coreos-pipeline" ...
        Commit: f592117e9b08eb986ee0f02ac843ea5256d2b9ef (Merge pull request #282 from bgilbert/legacy)
        Author: Benjamin Gilbert <bgilbert@redhat.com>
        Date:   Wed Sep 30 19:27:35 2020 -0400
Caching blobs under "/var/cache/blobs".
Getting image source signatures
Copying blob sha256:c9fa7d57b9028d4bd02b51cef3c3039fa7b23a8b2d9d26a6ce66b3428f6e2457
Copying blob sha256:74cbb6607642df5f9f70e8588e3c56d6de795d1a9af22866ea4cc82f2dad4f14
Copying blob sha256:4406a9beeab462c05c8e03f0c6eccd2902cb4cde00caee21cf13ff6dfaab53f4
Copying blob sha256:4908e3220585a526b87e77f88ee7ddd06c502447269792ea4013e1b2f414f41e
Copying blob sha256:356d6834e7fe34a7a4a445ab83007138dd44c135ad9fb7f55ebd913add7884ba
Copying blob sha256:9ebdc795180d997ec9611acb8a453717f63fa2227b003db6c7cc911482e3e43d

When I try to deploy a deploymentconfig, the pod gets image pull errors even though it's an internal image that hasn't changed in a while:

  Normal   Pulling         25s   kubelet, kempty-n9.ci.centos.org  Pulling image "image-registry.openshift-image-registry.svc:5000/coreos-ci/jenkins@sha256:319ce8a5ff6e063c5a3fc0d93539773a22d2bc36d41bd5f70d27f48c56a7f3da"
  Warning  Failed          5s    kubelet, kempty-n9.ci.centos.org  Failed to pull image "image-registry.openshift-image-registry.svc:5000/coreos-ci/jenkins@sha256:319ce8a5ff6e063c5a3fc0d93539773a22d2bc36d41bd5f70d27f48c56a7f3da": rpc error: code = Unknown desc = Error writing blob: error storing blob to file "/var/tmp/storage143562693/1": flate: corrupt input before offset 50341571
  Warning  Failed          5s    kubelet, kempty-n9.ci.centos.org  Error: ErrImagePull
  Normal   BackOff         4s    kubelet, kempty-n9.ci.centos.org  Back-off pulling image "image-registry.openshift-image-registry.svc:5000/coreos-ci/jenkins@sha256:319ce8a5ff6e063c5a3fc0d93539773a22d2bc36d41bd5f70d27f48c56a7f3da"
  Warning  Failed          4s    kubelet, kempty-n9.ci.centos.org  Error: ImagePullBackOff

$ oc get is jenkins
NAME      IMAGE REPOSITORY                                                     TAGS   UPDATED
jenkins   image-registry.openshift-image-registry.svc:5000/coreos-ci/jenkins   2      2 months ago

I'm guessing this is related to the NFS issues? Is the OpenShift cluster itself backed by NFS somehow?
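
For anyone hitting the same `flate: corrupt input` error: it comes from the deflate decompressor, so the compressed layer bytes were damaged somewhere between storage and the node. Below is a minimal sketch of how one could tell a damaged blob from a good one, assuming you have the raw blob bytes in hand and that the layer is gzip-compressed. The `check_blob` helper and its return labels are hypothetical names for illustration, not part of any OpenShift or registry tooling.

```python
import gzip
import hashlib
import zlib


def check_blob(data: bytes, expected_digest: str) -> str:
    """Classify a gzip-compressed blob (hypothetical helper).

    Returns "ok", "digest-mismatch", or "corrupt-stream".
    """
    # Registry blobs are named by the sha256 of their compressed bytes.
    actual = "sha256:" + hashlib.sha256(data).hexdigest()
    if actual != expected_digest:
        return "digest-mismatch"
    try:
        # Decompressing the whole stream also checks the gzip CRC trailer.
        gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        return "corrupt-stream"
    return "ok"


if __name__ == "__main__":
    # Stand-in data; a real check would read the blob from disk instead.
    blob = gzip.compress(b"example layer contents")
    digest = "sha256:" + hashlib.sha256(blob).hexdigest()
    print(check_blob(blob, digest))
```

A digest mismatch on a blob whose image stream has not changed would point at the storage backend rather than at the image itself.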

arrfab commented 3 years ago

See today's message to the ci-users list about the NFS server crash. It's probably better to discuss this interactively in #centos-ci on freenode (where we are discussing it right now).

jlebon commented 3 years ago

I was trying to work around the NFS issues by using an emptyDir for Jenkins, but it seems like basic OpenShift functionality is broken now. This is somewhat urgent for us, because it's blocking a lot of our upstream CI PR testing.

siddharthvipul1 commented 3 years ago

@jlebon things on the OCP4 cluster seem to be working fine now. Can you confirm whether you are still facing this?

jlebon commented 3 years ago

Yeah, this specific corruption error seems to have gone away now.

Metadata Update from @jlebon:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago

Metadata
- Assignee: None
- Tags: None
- Blocking: None
- Depending on: None
- Priority: Needs Review
