#492 Zuul is not triggered on Fedora PRs
Closed 5 months ago by fbo. Opened 5 months ago by ksurma.

Looking at the builds page (https://fedora.softwarefactory-project.io/zuul/builds), it seems the last builds were triggered on 2024-10-01 21:25:51 (Python 3.12), and nothing has run since. I tried to run Zuul again on https://src.fedoraproject.org/rpms/python3.13/pull-request/114 just now and nothing happens: the job is not added to the queue even after ~10 minutes.


Hi,

The issue was that fm-gateway (the component that connects to fedmsg and relays messages to Zuul) was unable to "connect" to the message bus:

Unhandled error in Deferred:                                                                              

Traceback (most recent call last):                                                                        
  File "/usr/lib64/python3.9/site-packages/twisted/internet/defer.py", line 1475, in gotResult
    _inlineCallbacks(r, g, status)
  File "/usr/lib64/python3.9/site-packages/twisted/internet/defer.py", line 1464, in _inlineCallbacks
    status.deferred.errback()
  File "/usr/lib64/python3.9/site-packages/twisted/internet/defer.py", line 501, in errback
    self._startRunCallbacks(fail)
  File "/usr/lib64/python3.9/site-packages/twisted/internet/defer.py", line 568, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/usr/lib64/python3.9/site-packages/twisted/internet/defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/lib/python3.9/site-packages/fedora_messaging/twisted/factory.py", line 329, in on_ready_connection_errback
    r = failure.trap(
  File "/usr/lib64/python3.9/site-packages/twisted/python/failure.py", line 460, in trap
    self.raiseException()
  File "/usr/lib64/python3.9/site-packages/twisted/python/failure.py", line 488, in raiseException
    raise self.value.with_traceback(self.tb)
  File "/usr/lib64/python3.9/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/usr/lib64/python3.9/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/usr/lib/python3.9/site-packages/fedora_messaging/twisted/factory.py", line 323, in on_ready
    yield client.declare_queues([queue])
  File "/usr/lib64/python3.9/site-packages/twisted/internet/defer.py", line 1416, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/usr/lib64/python3.9/site-packages/twisted/python/failure.py", line 512, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/usr/lib/python3.9/site-packages/fedora_messaging/twisted/protocol.py", line 553, in declare_queues
    raise BadDeclaration("queue", args, e)
fedora_messaging.exceptions.BadDeclaration: Unable to declare the queue object ({'queue': '8ccf2c4f-fc20-4785-af05-e695ad8665df', 'durable': True, 'auto_delete': False, 'exclusive': False, 'arguments': {}, 'passive': False}) because (404, "NOT_FOUND - home node 'rabbit@rabbitmq02.iad2.fedoraproject.org' of durable queue '8ccf2c4f-fc20-4785-af05-e695ad8665df' in vhost '/public_pubsub' is down or inaccessible")

I've restarted the service and now messages are forwarded as expected and jobs are triggered.

This is not the first time this issue has happened: after the service has been running for a few weeks, the event loop seems to get stuck. Any ideas or improvements to avoid this issue would be welcome: https://pagure.io/fm-gateway/tree/master :)

Perhaps the first step here is to prepare a new container with up-to-date dependencies.
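
One possible mitigation for the stuck event loop described above is a staleness watchdog: record when the last message was relayed, and let a health check fail once the gateway has been silent for too long, so that whatever supervises the container can restart it. The following is only a minimal sketch of that idea. `StalenessWatchdog`, `touch` and `is_stale` are hypothetical names that do not exist in fm-gateway or fedora-messaging, and the 600-second threshold used in the note below is an arbitrary assumption.

```python
import threading
import time


class StalenessWatchdog:
    """Hypothetical helper: remembers when the last message was seen."""

    def __init__(self, max_idle_seconds, clock=time.monotonic):
        # max_idle_seconds: how long without messages before we call it stale.
        # clock: injectable so the logic can be tested without waiting.
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._last_seen = clock()

    def touch(self):
        """Call this every time a message is successfully relayed."""
        with self._lock:
            self._last_seen = self._clock()

    def idle_seconds(self):
        """Seconds elapsed since the last call to touch()."""
        with self._lock:
            return self._clock() - self._last_seen

    def is_stale(self):
        """True when no message has been seen for longer than the limit."""
        return self.idle_seconds() > self._max_idle
```

As a usage idea, the message callback would call `touch()` on something like `StalenessWatchdog(600)`, and a periodic check would exit the process with a non-zero status when `is_stale()` returns true. The threshold would have to be chosen with real traffic in mind, since a quiet bus is not necessarily a broken one.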

Thanks for the report.

Metadata Update from @fbo:
- Issue status updated to: Closed (was: Open)

5 months ago
