#7818 greenwave in staging is piling messages
Closed: Fixed 10 months ago by gnaponie. Opened 10 months ago by pingou.

@kevin reported on #7817 that greenwave in staging has 155k+ messages in its queue, so the consumer may have an issue.

Source: https://pagure.io/fedora-infrastructure/issue/7817#comment-571406


The fedmsg consumer pod is crashing. Is it expected to still run this with the move to fedora-messaging?

Anyhow, it's crashing with:

[greenwave.consumers.resultsdb INFO] Greenwave resultsdb handler listening on: ['org.fedoraproject.staging.taskotron.result.new']
[greenwave.consumers.fedora_messaging_consumer INFO] Sending message received to: ResultsDBHandler
[greenwave.consumers.resultsdb DEBUG] Processing message "{'msg': {'task': {'type': 'bodhi_update', 'item': 'FEDORA-2019-9244c8b209', 'name': 'update.advisory_boot'}, 'result': {'id': 23523568, 'submit_time': '2019-04-24 13:06:12 UTC', 'prev_outcome': None, 'outcome': 'PASSED', 'log_url': 'https://openqa.stg.fedoraproject.org/tests/528801' }}}"
[fedora_messaging.twisted.protocol ERROR] Received unexpected exception from consumer Consumer(queue=greenwave.stg, callback=<function fedora_messaging_callback at 0x7fbeb1c099d8>)
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/fedora_messaging/twisted/protocol.py", line 239, in _read
    yield threads.deferToThread(consumer.callback, message)
  File "/usr/lib64/python3.7/site-packages/twisted/python/threadpool.py", line 250, in inContext
    result = inContext.theWork()
  File "/usr/lib64/python3.7/site-packages/twisted/python/threadpool.py", line 266, in <lambda>
    inContext.theWork = lambda: context.call(ctx, func, *args, **kw)
  File "/usr/lib64/python3.7/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/lib64/python3.7/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
  File "/usr/local/lib/python3.7/site-packages/greenwave/consumers/fedora_messaging_consumer.py", line 61, in fedora_messaging_callback
    handler.consume(msg)
  File "/usr/local/lib/python3.7/site-packages/greenwave/consumers/resultsdb.py", line 247, in consume
    submit_time = message['msg']['submit_time']
KeyError: 'submit_time'
[fedora_messaging.cli ERROR] Unexpected error occurred in consumer Consumer(queue=greenwave.stg, callback=<function fedora_messaging_callback at 0x7fbeb1c099d8>): <twisted.python.failure.Failure builtins.KeyError: 'submit_time'>
[fedora_messaging.twisted.protocol INFO] Waiting for 0 consumer(s) to finish processing before halting
[fedora_messaging.twisted.protocol INFO] Finished canceling 0 consumers
[fedora_messaging.twisted.protocol INFO] Disconnect requested, but AMQP connection already gone

Working on this. Any idea why is the submit_time format different from what is in resultsdb [1]?

Currently, format %Y-%m-%dT%H:%M:%S.%f is hard-coded.

[1] https://taskotron.stg.fedoraproject.org/resultsdb_api/api/v2.0/results/23523568

Working on this. Any idea why is the submit_time format different from what is in resultsdb [1]?

Looks like, it's just resultsdb differently serializes data/time for fedora-messaging. We need to handle this case in waiverdb.

The patch helped but it seems the consumer is now blocked by dogpile (maybe memcached pod problem).

[greenwave.consumers.resultsdb INFO] Greenwave resultsdb handler listening on: ['org.fedoraproject.staging.taskotron.result.new']
[greenwave.consumers.fedora_messaging_consumer INFO] Sending message received to: ResultsDBHandler
[greenwave.consumers.resultsdb DEBUG] Processing message "{'msg': {'task': {'type': 'koji_build', 'item': 'graphviz-2.40.1-39.fc29', 'name': 'dist.rpmgrill.elf-checks'}, 'result': {'id': 23526664, 'submit_time': '2019-04-24 14:40:06 UTC', 'prev_outcome': None, 'outcome': 'FAILED', 'log_url': 'https://taskotron.stg.fedoraproject.org/artifacts/all/748d0752-669e-11e9-8059-5254003dbbef/tests.yml/rpmgrill.json' }}}"
[greenwave.consumers.resultsdb DEBUG] Considering subject koji_build: 'graphviz-2.40.1-39.fc29'
[greenwave.consumers.resultsdb DEBUG] Invalidating cache for 'greenwave.resources:CachedResults|koji_build graphviz-2.40.1-39.fc29 dist.rpmgrill.elf-checks'
[greenwave.consumers.resultsdb DEBUG] Invalidating cache for 'greenwave.resources:CachedResults|koji_build graphviz-2.40.1-39.fc29 None'
[dogpile.lock DEBUG] NeedRegenerationException
[dogpile.lock DEBUG] no value, waiting for create lock

Would be nice to also have health check endpoint for the consumers (for livenessProbe in OpenShift).

Login to comment on this ticket.

Metadata
Related Pull Requests
  • #438 Merged 10 months ago