#11023 fedmsg-gateway-3 broken after upgrade to fedora 37
Closed: Fixed with Explanation a year ago by kevin. Opened a year ago by kevin.

The proxies were upgraded to fedora-37 and fedmsg-gateway is broken.

Dec 01 00:30:03 proxy01.iad2.fedoraproject.org systemd[1]: Started fedmsg-gateway-3.service - Outbound fedmsg gateway.
Dec 01 00:30:05 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: Cannot find qpid python module. Make sure you have python-qpid installed.
Dec 01 00:30:06 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:06][moksha.hub    INFO] Loading the Moksha Hub
Dec 01 00:30:06 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:06][moksha.hub WARNING] No 'zmq_publish_endpoints' set.  Are you sure?
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:07][moksha.hub    INFO] Loading Consumers
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:07][fedmsg.consumers    INFO]   enabled by config  - fedmsg.consumers.gateway:GatewayConsumer
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:07][moksha.hub    INFO] Blocking mode false for <fedmsg.consumers.gateway.GatewayConsumer object at 0x7f740bdfa410>.  Messages to be queued and distributed to 1 threads.
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:07][fedmsg.consumers    INFO] No backlog handling.  status: None, url: None
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:07][fedmsg.consumers    INFO] Setting up special gateway socket on port 9942
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:07][moksha.hub   ERROR] Failed to init <class 'fedmsg.consumers.gateway.GatewayConsumer'> consumer.
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: Traceback (most recent call last):
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]:   File "/usr/lib/python3.11/site-packages/moksha/hub/hub.py", line 410, in __init_consumers
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]:     c = c_class(self)
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]:         ^^^^^^^^^^^^^
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]:   File "/usr/lib/python3.11/site-packages/fedmsg/consumers/gateway.py", line 45, in __init__
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]:     self._setup_special_gateway_socket()
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]:   File "/usr/lib/python3.11/site-packages/fedmsg/consumers/gateway.py", line 62, in _setup_special_gateway_socket
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]:     self.gateway_socket.setsockopt(zmq.HWM, hwm)
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]:   File "zmq/backend/cython/socket.pyx", line 447, in zmq.backend.cython.socket.Socket.set
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]:   File "zmq/backend/cython/socket.pyx", line 279, in zmq.backend.cython.socket._setsockopt
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]:   File "zmq/backend/cython/checkrc.pxd", line 28, in zmq.backend.cython.checkrc._check_rc
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: zmq.error.ZMQError: Invalid argument
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:07][moksha.hub    INFO] Loading Producers
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:07][moksha.hub    INFO] Initializing MonitoringProducer producer
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:07][moksha.hub.monitoring    INFO] Establishing monitor sock at 'ipc:///var/run/fedmsg/monitoring-fedmsg-gateway-.socket'
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:07][moksha.hub    INFO] Running the MokshaHub reactor
Dec 01 00:30:07 proxy01.iad2.fedoraproject.org fedmsg-gateway-3[847]: [2022-12-01 00:30:07][moksha.hub    INFO] Suggesting threadpool size at 2

I found a workaround. I downgraded all the proxies to python3-zmq-22.3.0-3.fc37.x86_64 from 23.2.0-1.fc37 and that got them all working.

So, it's likely some change in zmq between 22.x and 23.x that we need to adjust for?

Metadata Update from @phsmoura:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: medium-gain, medium-trouble, ops

a year ago

I found this upstream:

https://github.com/zeromq/pyzmq/issues/1139

Not sure it's related. The error above is in setsockopt(zmq.HWM, hwm), whereas the upstream issue is about DISH async sockets. Still, the point where it fails is almost identical:

server_1  |   File "zmq/backend/cython/socket.pyx", line 264, in zmq.backend.cython.socket._getsockopt
server_1  |   File "zmq/backend/cython/checkrc.pxd", line 25, in zmq.backend.cython.checkrc._check_rc
server_1  | zmq.error.ZMQError: Invalid argument

Might be worthwhile reporting it upstream. I could also dig into the code to see if that provides more context.

I was able to create a minimal reproducer for this issue:

import zmq

context = zmq.Context(1)
gateway_socket = context.socket(zmq.PUB)
hwm = 160000

# Fails with "zmq.error.ZMQError: Invalid argument" on python3-zmq 23.2.0
gateway_socket.setsockopt(zmq.HWM, hwm)

I tried changing the high_water_mark value to different numbers, and it fails regardless of what you set. It is definitely an issue in the zmq library. I will report it upstream.

After being corrected upstream, I found out that the root of this issue is on this line.

It seems that hasattr(zmq, 'HWM') still returns True, even though the option is not usable anymore.
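For illustration, here is a minimal sketch of one way to avoid relying on hasattr(zmq, 'HWM'). It is not the actual upstream patch; set_high_water_mark is a hypothetical helper name, and the sketch assumes a pyzmq build that exposes zmq.SNDHWM and zmq.RCVHWM (the options that replaced HWM in libzmq 3 and later). It checks for the newer options first and only falls back to the legacy HWM option when they are missing.

```python
import zmq


def set_high_water_mark(sock, hwm):
    """Hypothetical helper: set the high-water mark on a socket.

    Prefers the separate send/receive options and only falls back to
    the legacy zmq.HWM option when those are not available.
    """
    if hasattr(zmq, "SNDHWM") and hasattr(zmq, "RCVHWM"):
        sock.setsockopt(zmq.SNDHWM, hwm)
        sock.setsockopt(zmq.RCVHWM, hwm)
    else:
        # Assumption: only very old libzmq 2.x builds reach this branch.
        sock.setsockopt(zmq.HWM, hwm)


context = zmq.Context()
gateway_socket = context.socket(zmq.PUB)

set_high_water_mark(gateway_socket, 160000)

# Read the values back to confirm they were applied.
sndhwm = gateway_socket.getsockopt(zmq.SNDHWM)
rcvhwm = gateway_socket.getsockopt(zmq.RCVHWM)
print(sndhwm, rcvhwm)

gateway_socket.close()
context.term()
```

With the same socket type and value as the reproducer above, this should not raise the "Invalid argument" error, because the legacy option is never touched when the newer ones exist.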

And here is a PR to fix this issue upstream.

This should be fixed in version 1.1.6, which is now in testing.

I can confirm this is fixed in staging and should be fixed when we update proxies in the next few days.

Many thanks!

Metadata Update from @kevin:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)

a year ago

