#8453 2 odcs alerts
Closed: Fixed 4 years ago by cverna. Opened 4 years ago by kevin.

odcs-backend01 has two nagios alerts, we should fix whatever is causing them:

  1. An alert for 'Check for fedmsg-hub-3 proc'. If this service has moved to fedora-messaging, we should remove this check.

  2. Disk_Space_/: It looks like it's unable to remove old composes, as they have a different uid. Perhaps a chown needs to be done?

CC members of sysadmin-odcs: @cverna @jkaluza @lsedlar @mizdebsk


@kevin, please remove the fedmsg-hub-3 process check; ODCS uses celery now. You can add a check for the "odcs-celery-backend" systemd service status instead.

@jkaluza, the "Check for fedmsg-hub-3 proc" check actually checks for the odcs-celery-backend process.
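
For reference, a check on the systemd service status suggested above could look roughly like the sketch below. It is only an illustration: the function name `check_service` and the injectable `run` parameter are hypothetical, and the exit codes are the usual Nagios plugin conventions rather than anything taken from our actual nagios configuration.

```python
import subprocess

# Conventional Nagios plugin exit codes (assumed, not read from our config).
OK, CRITICAL, UNKNOWN = 0, 2, 3


def check_service(unit="odcs-celery-backend", run=subprocess.run):
    """Return (exit_code, message) describing a systemd unit, Nagios style.

    `run` is injectable so the logic can be exercised without systemd.
    """
    try:
        # `systemctl is-active` prints the unit state and exits 0 only
        # when the unit is active.
        result = run(
            ["systemctl", "is-active", unit],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError as exc:
        return UNKNOWN, f"UNKNOWN: could not run systemctl: {exc}"

    state = result.stdout.strip() or "unknown"
    if result.returncode == 0:
        return OK, f"OK: {unit} is {state}"
    return CRITICAL, f"CRITICAL: {unit} is {state}"
```

A wrapper script would print the message and exit with the returned code, which is how nagios plugins usually report their state.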

The worker logs are filling up with the following:

[2019-12-18 14:54:30,363: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0/work/x86_64/repo/repodata/3f4c84c31509db46efd766d692846ebe78e87d4bf751d99fc334cc3677c55b86-other.xml.gz: "PermissionError(13, 'Permission denied')"
[2019-12-18 14:54:30,363: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0/work/x86_64/repo/repodata/fcdc749b2e2dae6a0568a7036a9bd0e0c72dea8ca2f43ff907b34bd73094e7a4-primary.xml.gz: "PermissionError(13, 'Permission denied')"
[2019-12-18 14:54:30,363: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0/work/x86_64/repo/repodata/repomd.xml: "PermissionError(13, 'Permission denied')"
[2019-12-18 14:54:30,363: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0/work/x86_64/repo/repodata: "OSError(39, 'Directory not empty')"
[2019-12-18 14:54:30,363: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0/work/x86_64/repo: "OSError(39, 'Directory not empty')"
[2019-12-18 14:54:30,363: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0/work/x86_64/repo_package_list/Temporary.x86_64.debuginfo.conf: "PermissionError(13, 'Permission denied')"
[2019-12-18 14:54:30,363: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0/work/x86_64/repo_package_list/Temporary.x86_64.rpm.conf: "PermissionError(13, 'Permission denied')"
[2019-12-18 14:54:30,364: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0/work/x86_64/repo_package_list: "OSError(39, 'Directory not empty')"
[2019-12-18 14:54:30,364: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0/work/x86_64: "OSError(39, 'Directory not empty')"
[2019-12-18 14:54:30,364: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0/work: "OSError(39, 'Directory not empty')"
[2019-12-18 14:54:30,364: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0/COMPOSE_ID: "PermissionError(13, 'Permission denied')"
[2019-12-18 14:54:30,364: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0/STATUS: "PermissionError(13, 'Permission denied')"
[2019-12-18 14:54:30,364: WARNING/ForkPoolWorker-2] Cannot remove some files in /srv/odcs/odcs-292-1-20190509.n.0: "OSError(39, 'Directory not empty')"
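
To see how many composes are affected, the warnings above can be grouped per compose directory and exception type. This is a rough sketch: the regular expression is inferred only from the sample lines quoted here, and `summarize` is a hypothetical helper, not part of ODCS.

```python
import re
from collections import Counter

# Pattern inferred from the quoted warnings only; other log formats
# will simply not match and are skipped.
LINE_RE = re.compile(
    r'Cannot remove some files in (?P<path>\S+): "(?P<error>\w+)\('
)


def summarize(lines, root="/srv/odcs"):
    """Count failures per (compose directory name, exception name)."""
    counts = Counter()
    prefix = root.rstrip("/") + "/"
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        path = match.group("path")
        if not path.startswith(prefix):
            continue
        # The first path component below the root is the compose directory.
        compose = path[len(prefix):].split("/", 1)[0]
        counts[(compose, match.group("error"))] += 1
    return counts
```

Feeding it the worker log line by line gives a quick view of which composes fail with PermissionError as opposed to the follow-on "Directory not empty" errors.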

It looks like the cleanup task does not have permission to delete these directories. Looking at /srv/odcs/, a lot of the directories are owned by uid 986 and group dnsmasq:

drwxr-xr-x. 5    986 dnsmasq 4096 Nov  2 18:58 odcs-653-1-20191102.n.0
drwxr-xr-x. 5    986 dnsmasq 4096 Nov  2 21:08 odcs-655-1-20191102.n.0
drwxr-xr-x. 5    986 dnsmasq 4096 Nov  3 12:06 odcs-658-1-20191103.n.0
drwxr-xr-x. 5    986 dnsmasq 4096 Nov  3 13:28 odcs-659-1-20191103.n.0
drwxr-xr-x. 5    986 dnsmasq 4096 Nov  5 12:36 odcs-660-1-20191105.n.0
drwxr-xr-x. 5    986 dnsmasq 4096 Nov  5 12:38 odcs-661-1-20191105.n.0
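
A quick way to list every compose directory with the wrong owner, rather than reading the ls output by eye, might be something like the following. It is a sketch under assumptions: `foreign_owned` is a hypothetical helper, and the expected uid has to be supplied by the caller.

```python
import os


def foreign_owned(root, expected_uid):
    """Return sorted names of direct subdirectories of `root` whose
    owner uid differs from `expected_uid`."""
    names = []
    with os.scandir(root) as entries:
        for entry in entries:
            # Only look at real directories, not symlinks to them.
            if not entry.is_dir(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_uid != expected_uid:
                names.append(entry.name)
    return sorted(names)
```

On the backend this could be called as `foreign_owned("/srv/odcs", pwd.getpwnam("odcs").pw_uid)`, assuming the cleanup task runs as the odcs user.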

Running chown -R odcs:fedmsg odcs-* in /srv/odcs and restarting the celery service seems to have done the trick.
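
If this needs to be repeated, a scripted version with a dry-run mode could help to preview what would change first. The sketch below is not what was run here (that was the plain chown command); `chown_tree` is a hypothetical helper and the numeric uid and gid must be looked up by the caller.

```python
import os


def chown_tree(top, uid, gid, dry_run=True):
    """Collect `top` and everything below it; change ownership only
    when `dry_run` is False. Returns the list of paths visited."""
    visited = [top]
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            visited.append(os.path.join(dirpath, name))

    if not dry_run:
        for path in visited:
            # lchown changes a symlink itself rather than its target.
            os.lchown(path, uid, gid)
    return visited
```

Calling it with `dry_run=True` only returns the paths, so the list can be reviewed before anything is changed. Changing ownership to another user requires root, as with the chown command.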

Metadata Update from @cverna:
- Issue assigned to cverna

4 years ago

Metadata Update from @cverna:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

4 years ago
