#12373 fedora-messaging times out when pushing to a fork on src.f.o
Closed: Fixed 16 days ago by dreua. Opened 17 days ago by dreua.

Describe what you would like us to do:

Hopefully someone here has an idea why this happens and how to fix it :wink:

I tried this on two different projects. The push succeeds eventually, but I think this may have some consequences: it looks to me like the CI on a PR does not rerun for new commits.

Steps to reproduce

  1. Use any project with a fork on your user account
  2. Create a new branch or add commits to an existing one
  3. Push that branch to the fork

(I have not tested pushing to the main repository. That may be affected as well; I just can't test it right now.)

[da@David-UB pdfarranger (f41)]$ git switch -c testbranch
Switched to a new branch 'testbranch'
[da@David-UB pdfarranger (testbranch)]$ git push fork 
Enumerating objects: 125, done.
Counting objects: 100% (125/125), done.
Delta compression using up to 8 threads
Compressing objects: 100% (121/121), done.
Writing objects: 100% (121/121), 16.58 KiB | 808.00 KiB/s, done.
Total 121 (delta 65), reused 0 (delta 0), pack-reused 0 (from 0)
remote: Resolving deltas: 100% (65/65), completed with 1 local object.
remote:   - to fedora-message
remote: 2025-01-23 15:52:23,927 [WARNING] pagure.lib.notify: pagure is about to send a message that has no schemas: pagure.git.branch.creation
remote: 2025-01-23 15:52:53,932 [ERROR] pagure.lib.notify: Error sending fedora-messaging message
remote: Traceback (most recent call last):
remote:   File "/usr/lib/python3.6/site-packages/fedora_messaging/api.py", line 316, in publish
remote:     eventual_result.wait(timeout=timeout)
remote:   File "/usr/lib/python3.6/site-packages/crochet/_eventloop.py", line 239, in wait
remote:     result = self._result(timeout)
remote:   File "/usr/lib/python3.6/site-packages/crochet/_eventloop.py", line 201, in _result
remote:     raise TimeoutError()
remote: crochet._eventloop.TimeoutError
remote: 
remote: During handling of the above exception, another exception occurred:
remote: 
remote: Traceback (most recent call last):
remote:   File "/usr/lib/python3.6/site-packages/pagure/lib/notify.py", line 99, in fedora_messaging_publish
remote:     fedora_messaging.api.publish(msg)
remote:   File "/usr/lib/python3.6/site-packages/fedora_messaging/api.py", line 324, in publish
remote:     raise wrapper
remote: fedora_messaging.exceptions.PublishTimeout: Publishing timed out after waiting 30 seconds.
remote: Sending to redis to log activity and send commit notification emails
remote: * Publishing information for 37 commits
remote:   - to fedora-message
remote: 2025-01-23 15:52:55,060 [WARNING] pagure.lib.notify: pagure is about to send a message that has no schemas: pagure.git.receive
remote: 2025-01-23 15:53:25,061 [ERROR] pagure.lib.notify: Error sending fedora-messaging message
remote: Traceback (most recent call last):
remote:   File "/usr/lib/python3.6/site-packages/fedora_messaging/api.py", line 316, in publish
remote:     eventual_result.wait(timeout=timeout)
remote:   File "/usr/lib/python3.6/site-packages/crochet/_eventloop.py", line 239, in wait
remote:     result = self._result(timeout)
remote:   File "/usr/lib/python3.6/site-packages/crochet/_eventloop.py", line 201, in _result
remote:     raise TimeoutError()
remote: crochet._eventloop.TimeoutError
remote: 
remote: During handling of the above exception, another exception occurred:
remote: 
remote: Traceback (most recent call last):
remote:   File "/usr/lib/python3.6/site-packages/pagure/lib/notify.py", line 99, in fedora_messaging_publish
remote:     fedora_messaging.api.publish(msg)
remote:   File "/usr/lib/python3.6/site-packages/fedora_messaging/api.py", line 324, in publish
remote:     raise wrapper
remote: fedora_messaging.exceptions.PublishTimeout: Publishing timed out after waiting 30 seconds.
remote: 
remote: Create a pull-request for testbranch
remote:    https://src.fedoraproject.org/fork/dreua/rpms/pdfarranger/diff/rawhide..testbranch
remote: 
To ssh://pkgs.fedoraproject.org/forks/dreua/rpms/pdfarranger.git
 * [new branch]      testbranch -> testbranch
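
The output above suggests the hook treats publishing as best effort: the error is logged, the message is dropped, and the push still completes. The following is only a minimal sketch of that pattern, not Pagure's actual code. `publish_best_effort` and `always_times_out` are hypothetical names, and the built-in `TimeoutError` stands in for `fedora_messaging.exceptions.PublishTimeout`.

```python
import logging

log = logging.getLogger("notify_sketch")


def publish_best_effort(publish, message):
    """Call publish(message); log and swallow any failure.

    Returns True if the message went out, False otherwise, so the
    caller (here: the push) can carry on either way.
    """
    try:
        publish(message)
    except Exception:
        # Same wording as the log line in the push output above.
        log.exception("Error sending fedora-messaging message")
        return False
    return True


def always_times_out(message):
    # Hypothetical stand-in for a publisher whose broker is unreachable.
    raise TimeoutError("Publishing timed out after waiting 30 seconds.")


sent = publish_best_effort(always_times_out, {"topic": "pagure.git.receive"})
print(sent)  # False: the message was never delivered
```

If the real hook behaves like this, a consumer waiting for the `pagure.git.receive` message would never see it, which would be consistent with CI not rerunning for the new commits.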

Here is another PR where I saw this first: https://src.fedoraproject.org/rpms/osm-gps-map/pull-request/2

Thanks in advance :)


(Also, the web interfaces of pagure and src.f.o seem a bit slow to me, but that may be a totally different issue. Currently the status page is all green, so I hope I didn't just miss an announced outage.)

(Just got a 500 error and now the web interface performance seems to be back to normal. The messaging timeout is still there unfortunately.)

COPR is also affected by not being able to send fedora-messaging messages:

[2025-01-23 16:18:42,209][  INFO][PID:310246] Sending fedora-messaging bus message in build.start
[2025-01-23 16:19:12,682][ ERROR][PID:310246] Attempt 1 to publish a message failed (in /usr/lib/python3.13/site-packages/fedora_messaging/api.py:321)
[2025-01-23 16:19:42,688][ ERROR][PID:310246] Attempt 2 to publish a message failed (in /usr/lib/python3.13/site-packages/fedora_messaging/api.py:321)
[2025-01-23 16:20:12,692][ ERROR][PID:310246] Attempt 3 to publish a message failed (in /usr/lib/python3.13/site-packages/fedora_messaging/api.py:321)
[2025-01-23 16:20:42,695][ ERROR][PID:310246] Attempt 4 to publish a message failed (in /usr/lib/python3.13/site-packages/fedora_messaging/api.py:321)
[2025-01-23 16:21:12,702][ ERROR][PID:310246] Attempt 5 to publish a message failed (in /usr/lib/python3.13/site-packages/fedora_messaging/api.py:321)
[2025-01-23 16:21:12,705][  INFO][PID:310246] Sending fedora-messaging bus message in chroot.start
[2025-01-23 16:21:42,710][ ERROR][PID:310246] Attempt 1 to publish a message failed (in /usr/lib/python3.13/site-packages/fedora_messaging/api.py:321)
[2025-01-23 16:22:12,713][ ERROR][PID:310246] Attempt 2 to publish a message failed (in /usr/lib/python3.13/site-packages/fedora_messaging/api.py:321)
[2025-01-23 16:22:42,716][ ERROR][PID:310246] Attempt 3 to publish a message failed (in /usr/lib/python3.13/site-packages/fedora_messaging/api.py:321)
[2025-01-23 16:23:12,719][ ERROR][PID:310246] Attempt 4 to publish a message failed (in /usr/lib/python3.13/site-packages/fedora_messaging/api.py:321)
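
The spacing of the COPR log lines above can be checked with a small script. This is only a sketch: the helper names `parse_timestamp` and `gaps_in_seconds` are made up for illustration, and the sample lines are copied from the log. Each failed attempt appears to take about 30 seconds, which matches the 30-second publish timeout reported in the push output.

```python
from datetime import datetime

# Sample lines copied from the COPR log above.
LOG = """\
[2025-01-23 16:18:42,209][  INFO][PID:310246] Sending fedora-messaging bus message in build.start
[2025-01-23 16:19:12,682][ ERROR][PID:310246] Attempt 1 to publish a message failed (in /usr/lib/python3.13/site-packages/fedora_messaging/api.py:321)
[2025-01-23 16:19:42,688][ ERROR][PID:310246] Attempt 2 to publish a message failed (in /usr/lib/python3.13/site-packages/fedora_messaging/api.py:321)
"""


def parse_timestamp(line):
    """Extract the leading '[YYYY-MM-DD HH:MM:SS,mmm]' timestamp of a log line."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")


def gaps_in_seconds(log_text):
    """Return the time between consecutive log lines, in seconds."""
    times = [parse_timestamp(l) for l in log_text.splitlines() if l.startswith("[")]
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


print([round(gap) for gap in gaps_in_seconds(LOG)])  # [30, 30]
```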

Yes, there was an outage this morning. A bunch of proxies dropped off the network and back on, and after that rabbitmq was unhappy.

Everything should be back to normal now. Can you confirm?

Metadata Update from @phsmoura:
- Issue assigned to kevin
- Issue priority set to: Waiting on Reporter (was: Needs Review)
- Issue tagged with: high-gain, high-trouble, ops

16 days ago

Metadata Update from @dreua:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

16 days ago

