#79 kempty-n11.ci.centos.org and kempty-n9.ci.centos.org seed pods with wrong umask
Closed: Fixed 3 years ago by dkirwan. Opened 3 years ago by jlebon.

Looks like https://pagure.io/centos-infra/issue/48 is back, this time on kempty-n11.

Before wiping the machine and reprovisioning, let's try to figure out what is causing this and file RHBZs/touch base with the appropriate teams as needed. Feel free to reach out if you need help debugging!
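
For anyone picking this up cold: the symptom is visible straight from a shell in an affected pod (a sketch; 0002 is the value the coreos-assembler image expects per the test further down, and the pod name is a placeholder):

oc exec -ti <affected-pod> -- bash
[builder@<affected-pod> srv]$ umask              # 0002 on a healthy node; anything else indicates the bug
[builder@<affected-pod> srv]$ touch f && ls -l f # a wrong umask shows up as unexpected file permissions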


Metadata Update from @dkirwan:
- Issue tagged with: centos-ci-infra, high-trouble, medium-gain

3 years ago

Metadata Update from @dkirwan:
- Issue priority set to: Waiting on Assignee (was: Needs Review)

3 years ago

This is affecting kempty-n9.ci.centos.org too now.

@jlebon can we book a block of time this week where you can explain in more detail what the problem is, how I can replicate it, and how it's affecting your workloads? I'm wondering if it's something caused by the elevated access your service accounts have, or if it's a bug in RHCOS.

@jlebon and I met and attempted to replicate the issue on the n9-n11 nodes using the pod definition:

apiVersion: v1
kind: Pod
metadata:
  name: coreos-assembler-sleep
spec:
  nodeName: kempty-n10.ci.centos.org
  containers:
    - name: coreos-assembler-sleep
      image: quay.io/coreos-assembler/coreos-assembler:latest
      imagePullPolicy: Always
      workingDir: /srv/
      command: ['/usr/bin/dumb-init']
      args: ['sleep', 'infinity']
      resources:
        requests:
          cpu: "4"

Tested with my cluster-admin user and with the jenkins service account in the coreos-ci project:

oc apply -f ~/testpod.yaml
oc get pods --watch
^C
oc exec -ti coreos-assembler-sleep /bin/bash
[builder@coreos-assembler-sleep srv]$ umask
0002
[builder@coreos-assembler-sleep srv]$ exit
command terminated with exit code 130
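
For the service account case, the same pod can also be created while impersonating jenkins rather than logging in as it (a sketch using oc's standard impersonation flag; project and account names as above):

oc --as=system:serviceaccount:coreos-ci:jenkins apply -f ~/testpod.yaml
oc exec coreos-assembler-sleep -- bash -c umask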

In each case it returned the expected umask of 0002, so I'll mark this issue blocked. It is likely to reoccur in the coming days; when it does, we'll jump on a call, attempt to replicate the issue, capture any relevant information, and report a bug upstream.
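
Until then, a quick way to sweep all three suspect nodes in one go is to pin the same sleep pod to each node in turn and read the umask back non-interactively (a sketch; node list and pod definition taken from above):

for node in kempty-n9 kempty-n10 kempty-n11; do
  sed "s/nodeName: .*/nodeName: ${node}.ci.centos.org/" ~/testpod.yaml | oc apply -f -
  oc wait --for=condition=Ready pod/coreos-assembler-sleep --timeout=120s
  echo -n "${node}: "; oc exec coreos-assembler-sleep -- bash -c umask
  oc delete pod coreos-assembler-sleep --wait
done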

Metadata Update from @dkirwan:
- Issue assigned to dkirwan

3 years ago

Metadata Update from @dkirwan:
- Issue priority set to: None (was: Waiting on Assignee)

3 years ago

Hi @jlebon, have you noticed this issue affecting you recently?

Will close; please reopen if the issue reoccurs.

Metadata Update from @dkirwan:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago

