Describe what you need us to do: I need to delete pod `anitya-1547485200-6hstr` on production OpenShift in the release-monitoring.org project. It is pulling the wrong image. I made a change in cron.yml to fix this, but it is still using the same pod instead of starting a new one. After deleting that pod, the job should restart automatically.
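A minimal sketch of the requested operation, using the project and pod names from this ticket (the owning Job controller should schedule a replacement pod once the broken one is gone):

```shell
# Switch to the project named in the ticket and delete the misbehaving pod.
oc project release-monitoring.org
oc delete pod anitya-1547485200-6hstr

# Watch for the replacement pod being created by the Job controller.
oc get pods -w
```

This is only the cleanup step; if the Job itself still carries the old pod template, the replacement pod may come up with the same wrong image (see the later comments in this ticket).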
When do you need this? (YYYY/MM/DD) As soon as possible
When is this no longer needed or useful? (YYYY/MM/DD) When the cron job is no longer used.
If we cannot complete your request, what is the impact? Anitya would be unable to check for new versions of projects.
I think we need to allow application owners to delete jobs or pods, since this seems to be one of the first steps when things go wrong.
Is there any reason for not allowing application owners to delete pods?
Deleted.
I have no objection to adding that to app-owner perms...
Metadata Update from @kevin: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)
Unfortunately the issue is still there. Here is the log from `oc describe pod anitya-1547485200-gl782`:

```
FirstSeen  LastSeen  Count  From                                       SubObjectPath                            Type     Reason   Message
---------  --------  -----  ----                                       -------------                            ----     ------   -------
11h        38m       131    kubelet, os-node04.phx2.fedoraproject.org  spec.containers{release-monitoring-web}  Warning  Failed   Failed to pull image "release-monitoring/release-monitoring-web:latest": rpc error: code = Unknown desc = Error reading manifest latest in docker.io/release-monitoring/release-monitoring-web: errors: denied: requested access to the resource is denied; unauthorized: authentication required
11h        28m       133    kubelet, os-node04.phx2.fedoraproject.org  spec.containers{release-monitoring-web}  Normal   Pulling  pulling image "release-monitoring/release-monitoring-web:latest"
11h        8m        2917   kubelet, os-node04.phx2.fedoraproject.org  spec.containers{release-monitoring-web}  Normal   BackOff  Back-off pulling image "release-monitoring/release-monitoring-web:latest"
11h        3m        2938   kubelet, os-node04.phx2.fedoraproject.org  spec.containers{release-monitoring-web}  Warning  Failed   Error: ImagePullBackOff
```
Not sure what is happening; the cronjob image is now set to the same value as before the change that caused this. This is happening on both staging and production :-(
The cronjob is pulling the image `docker-registry.default.svc:5000/release-monitoring/release-monitoring-web:latest`, which should be the same one the frontend is using.
@zlopez for some reason it is trying to pull the image from Docker Hub; see the `docker.io` prefix here: `docker.io/release-monitoring/release-monitoring-web`.
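This is the container runtime's standard reference-resolution rule at work: when the first path component of an image reference does not look like a registry host, Docker Hub (`docker.io`) is assumed. A small shell helper (hypothetical, written for this ticket to illustrate the rule) shows why the two image strings above resolve to different registries:

```shell
# Illustrative helper: print the registry an image reference resolves to.
# Rule: if the first path component contains a dot or a colon (or is
# "localhost"), it is treated as a registry host; otherwise docker.io
# is assumed.
resolve_registry() {
  first="${1%%/*}"                        # everything before the first "/"
  case "$first" in
    *.*|*:*|localhost) echo "$first" ;;   # looks like a host
    *) echo "docker.io" ;;                # bare name: defaults to Docker Hub
  esac
}

resolve_registry "release-monitoring/release-monitoring-web:latest"
# -> docker.io
resolve_registry "docker-registry.default.svc:5000/release-monitoring/release-monitoring-web:latest"
# -> docker-registry.default.svc:5000
```

So the image path in the old cron.yml, lacking the `docker-registry.default.svc:5000/` prefix, was interpreted as a Docker Hub repository, which then failed with the authentication error shown in the events above.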
I was able to fix this in staging by manually editing the YAML, which still used the previous image path without the `docker-registry.default.svc:5000/` prefix. I hope the next scheduled job will use the new cron.yml instead of the old one.
But I can't edit the YAML on production, so the issue is still there.
@cverna Yes, it is using the old cron.yml instead of the new one. I'm not sure why.
So I did a little experiment and ran the playbook again to make sure the new cron.yml is on OpenShift.
After this I deleted the pod on staging (it looks like I have permission for this there) and it was recreated immediately, but the new pod was still using the old YAML config. Not sure what more I can do with it :-(
According to @cverna we need to delete the job first, which makes sense.
Unfortunately I can't do this myself on either staging or production. Could you restart the job, @kevin?
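This matches how CronJobs work: each Job keeps its own copy of the pod template, frozen at the moment the Job was created, so updating the CronJob (cron.yml) only affects Jobs created afterwards, and deleting a pod just makes the existing Job recreate it from that stale template. A hedged sketch of the restart (the Job name is assumed from the pod name in this ticket):

```shell
# Delete the stale Job so its frozen (old) pod template goes away with it.
# Job name assumed from the pod name anitya-1547485200-6hstr.
oc delete job anitya-1547485200

# The next scheduled run of the CronJob will create a fresh Job from the
# updated template in cron.yml, with the corrected image path.
oc get cronjob
```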
Metadata Update from @zlopez: - Issue status updated to: Open (was: Closed)
After the job was restarted (thanks @mizdebsk), the issue is gone. I'm closing this again.
Metadata Update from @zlopez: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)