Looks like https://pagure.io/centos-infra/issue/48 is back, this time on kempty-n11.
Before wiping the machine and reprovisioning, let's try to figure out what is causing this and file RHBZs/touch base with the appropriate teams as needed. Feel free to reach out if you need help debugging!
Metadata Update from @dkirwan: - Issue tagged with: centos-ci-infra, high-trouble, medium-gain
Metadata Update from @dkirwan: - Issue priority set to: Waiting on Assignee (was: Needs Review)
This is affecting kempty-n9.ci.centos.org too now.
@jlebon can we book a block of time this week where you can explain in more detail what the problem is, how I can replicate it, and how it's affecting your workloads? I'm wondering if it's something caused by the elevated access your service accounts have, or if it's a bug in RHCOS.
@jlebon and I met and attempted to replicate the issue on the n9-n11 nodes using the following pod definition:
apiVersion: v1
metadata:
  name: coreos-assembler-sleep
kind: Pod
spec:
  nodeName: kempty-n10.ci.centos.org
  containers:
  - name: coreos-assembler-sleep
    image: quay.io/coreos-assembler/coreos-assembler:latest
    imagePullPolicy: Always
    workingDir: /srv/
    command: ['/usr/bin/dumb-init']
    args: ['sleep', 'infinity']
    resources:
      requests:
        cpu: "4"
Tested with my cluster-admin user, and the serviceaccount jenkins in project coreos-ci:
oc apply -f ~/testpod.yaml
oc get pods --watch
^C
oc exec -ti coreos-assembler-sleep /bin/bash
[builder@coreos-assembler-sleep srv]$ umask
0002
[builder@coreos-assembler-sleep srv]$ exit
command terminated with exit code 130
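To compare the cluster-admin result with the jenkins serviceaccount without switching logins, the same check could also be run via impersonation. A minimal sketch, assuming the jenkins serviceaccount in coreos-ci is allowed to exec into the pod (the --as value and the non-interactive form are assumptions, not what we actually ran):

# Hypothetical re-run of the umask check while impersonating the jenkins serviceaccount
oc exec coreos-assembler-sleep --as=system:serviceaccount:coreos-ci:jenkins -- sh -c 'umask'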
In each case it returned the expected umask 0002, so I'll mark this issue as blocked. It is likely to reoccur in the coming days; when it does, we'll jump on a call, attempt to replicate the issue, capture any relevant information, and report a bug upstream.
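If it does reoccur, one quick way to capture state for the bug report without an interactive session might be something like the following. A minimal sketch, assuming the coreos-assembler-sleep test pod (or an affected pod) is still running; the exact fields worth grabbing are an assumption:

# Hypothetical capture: shell umask, uid/gid, and the kernel-reported Umask of
# PID 1 inside the container (from /proc/1/status)
oc exec coreos-assembler-sleep -- sh -c 'umask; id; grep -i umask /proc/1/status'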
Metadata Update from @dkirwan: - Issue assigned to dkirwan
Metadata Update from @dkirwan: - Issue priority set to: None (was: Waiting on Assignee)
Hi @jlebon have you noticed this issue affecting you recently?
Will close; please reopen if this issue reoccurs.
Metadata Update from @dkirwan: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)