#7630 Growing fedmsg backlog on badges-backend
Closed: Fixed 6 months ago by cverna. Opened 6 months ago by mizdebsk.

Describe what you need us to do:

It looks like fedmsg-hub on badges-backend is not processing messages: the backlog of the fedmsg consumer FedoraBadgesConsumer keeps growing and currently stands at around 300k unprocessed messages. I tried to investigate the issue but could not find the cause.

@sayanchowdhury @cverna, can you please check what's going on?

  • Reproducer:
    [nrpe@badges-backend01 ~][PROD]$ /usr/lib64/nagios/plugins/check_fedmsg_consumer_backlog.py fedmsg-hub FedoraBadgesConsumer 10 50
  • Actual output:
    CRITICAL: fedmsg consumer FedoraBadgesConsumer backlog value is 318397
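
For context, here is a minimal sketch of how a Nagios-style check could turn a backlog value into a status line like the one above. It assumes the trailing arguments 10 and 50 are the warning and critical thresholds. classify_backlog is a hypothetical helper for illustration, not the code of check_fedmsg_consumer_backlog.py.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def classify_backlog(consumer, backlog, warning, critical):
    """Return (exit_code, message) for a consumer backlog value.

    Hypothetical helper: the thresholds are assumed to be
    "warn above `warning`, go critical above `critical`".
    """
    if backlog > critical:
        level, code = "CRITICAL", CRITICAL
    elif backlog > warning:
        level, code = "WARNING", WARNING
    else:
        level, code = "OK", OK
    message = "%s: fedmsg consumer %s backlog value is %d" % (
        level, consumer, backlog)
    return code, message


if __name__ == "__main__":
    # Values taken from the reproducer in this ticket.
    code, message = classify_backlog("FedoraBadgesConsumer", 318397, 10, 50)
    print(message)
```

With thresholds that low, a backlog of 318397 is far past critical, so the alert alone does not say whether the consumer is slow or completely stuck.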

When do you need this? (YYYY/MM/DD)


If we cannot complete your request, what is the impact?

If messages are lost without being processed, people may not receive expected badges.

I have removed the helping-hand.yml badge from /usr/share/badges/rules, since this badge tries to connect to pkgdb and keeps crashing (this still needs to be done properly in the fedora-badges git repo and then synced to ansible).

I restarted fedmsg-hub and let it work through 3371 of the 3385 pages of datagrepper. At that point the VM had no free memory or swap left and fedmsg-hub was stuck. To move forward, I deleted the entry in /var/run/fedmsg/status/fedmsg-hub/FedoraBadgesConsumer so that we skip the backlog (sorry for the unawarded badges).

I then restarted fedmsg-hub again, and it now seems to be processing badges.
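
For anyone who has to repeat this, the backlog-skipping step could be sketched roughly as below. This is only an illustration under assumptions: it treats the consumer's status entry as a single regular file, skip_backlog is a hypothetical helper, and it defaults to a dry run so nothing is removed by accident. The service should be stopped before the file is removed and restarted afterwards.

```python
import os


def skip_backlog(status_path, dry_run=True):
    """Drop a consumer's saved status so the backlog is not replayed.

    Assumption: the status entry is one regular file at status_path.
    Returns True if the file existed (it is removed unless dry_run is
    set), False if there was nothing to remove.
    """
    if not os.path.isfile(status_path):
        return False
    if not dry_run:
        os.remove(status_path)
    return True


if __name__ == "__main__":
    # Path taken from this ticket; dry run only reports what it finds.
    path = "/var/run/fedmsg/status/fedmsg-hub/FedoraBadgesConsumer"
    print("status file present:", skip_backlog(path))
```

Skipping the backlog this way means messages received during the outage are never evaluated, so badges that would have been awarded for them are lost.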

Metadata Update from @cverna:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

6 months ago
