#11339 resultsdb-ci-listener is down
Closed: Fixed 2 years ago by kevin. Opened 2 years ago by lholecek.

Our team has recently been receiving a lot of PodCrashLoop alerts about resultsdb-ci-listener, though we have not maintained this service before (we maintain resultsdb).

Logs:

[fedora_messaging.cli INFO] Starting consumer with resultsdb_listener.consumer:Consumer callback
[fedora_messaging.twisted.service INFO] Authenticating with server using x509 (certfile: /etc/pki/rabbitmq/crt/resultsdb-ci-listener.crt, keyfile: /etc/pki/rabbitmq/key/resultsdb-ci-listener.key)
[fedora_messaging.cli ERROR] Unable to declare the binding object on the AMQP broker. The broker responded with (403, "ACCESS_REFUSED - access to queue 'resultsdb_ci_listener' in vhost '/pubsub' refused for user 'resultsdb'"). Check permissions for your user.
[fedora_messaging.twisted.protocol INFO] Waiting for 0 consumer(s) to finish processing before halting
[fedora_messaging.twisted.protocol INFO] Finished canceling 0 consumers
[fedora_messaging.twisted.protocol INFO] Disconnect requested, but AMQP connection already gone

Is this service still used/needed? AFAIK it submits test result messages to resultsdb: https://pagure.io/ci-resultsdb-listener


My first thought was to check whether the certificate is valid, but I found that there isn't any resultsdb-ci-listener.crt in ansible-private.

@kevin Do you know where the certificate is?

Metadata Update from @zlopez:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: Needs investigation, rabbitmq

2 years ago

Yes, it should be the same cert.

The username/rabbitmq perms might be... not right.

Yeah, so I fixed the permissions manually, but we need to see why it's not right in ansible. ;(

I ran:

`rabbitmqctl set_permissions resultsdb --vhost /pubsub "^$" "^(amq\.topic)|(resultsdb_ci_listener.*)$" " ^(zmq\.topic)|^(amq\.topic)|(resultsdb_ci_listener.*)$"`
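For anyone reading the three patterns later: `rabbitmqctl set_permissions` takes them in the order configure, write, read. The rough Python sketch below checks which resource names each pattern accepts. It assumes the broker does an unanchored regex search on the resource name, which `re.search` approximates for patterns this simple; the helper names `allowed` and `report` are made up for illustration, and the patterns are copied exactly as pasted above, including the leading space in the read pattern.

```python
import re

# Patterns copied verbatim from the command above, in rabbitmqctl's
# argument order: configure, write, read.
CONFIGURE = r"^$"
WRITE = r"^(amq\.topic)|(resultsdb_ci_listener.*)$"
READ = r" ^(zmq\.topic)|^(amq\.topic)|(resultsdb_ci_listener.*)$"


def allowed(pattern: str, resource: str) -> bool:
    """Return True if the pattern matches anywhere in the resource name.

    Assumption: the broker performs an unanchored regex search, which
    re.search approximates for these simple patterns.
    """
    return re.search(pattern, resource) is not None


def report(resources):
    """Build {resource: (configure, write, read)} for the given names."""
    return {
        name: (allowed(CONFIGURE, name), allowed(WRITE, name), allowed(READ, name))
        for name in resources
    }


if __name__ == "__main__":
    names = ["resultsdb_ci_listener", "amq.topic", "zmq.topic"]
    for name, perms in report(names).items():
        print(name, dict(zip(("configure", "write", "read"), perms)))
```

Binding a queue to an exchange needs write on the queue and read on the exchange, which is why the 403 in the log named the queue. In this sketch the read pattern does not accept `zmq.topic`, because a literal space before `^` can never match; if that space was really part of the command and not a paste artifact, the `zmq.topic` alternative has no effect.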

Thanks, the resultsdb-ci-listener service runs fine now.

Isn't there a conflict between how the permissions are set for resultsdb and for resultsdb-ci-listener?

Not sure if the previously set permissions would be revoked if there is a second command: rabbitmqctl set_permissions resultsdb ....

Resultsdb mainly needs to publish on the org.fedoraproject.*.resultsdb.result.new topic (and consumes from none), whereas resultsdb-ci-listener does not send to any topic (I guess that is the "^$" in the rabbitmqctl set_permissions command) but needs to consume from org.centos.prod.ci.koji-build.test.complete (among others).
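On the conflict question, a toy model may help. As far as I understand it, RabbitMQ keeps a single (configure, write, read) triple per user and vhost, so a second `set_permissions` call for the same user replaces the first instead of adding to it. The sketch below models only that behaviour and does not talk to a broker. `PermissionTable`, `merge`, and the `PUBLISHER` triple are hypothetical and not taken from ansible; `LISTENER` is the triple from the manual fix in this ticket.

```python
import re
from typing import Dict, Tuple

Permissions = Tuple[str, str, str]  # (configure, write, read)


class PermissionTable:
    """Toy model of broker permissions: one triple per (user, vhost)."""

    def __init__(self) -> None:
        self._table: Dict[Tuple[str, str], Permissions] = {}

    def set_permissions(self, user: str, vhost: str, perms: Permissions) -> None:
        # A later call for the same (user, vhost) replaces the earlier triple.
        self._table[(user, vhost)] = perms

    def get(self, user: str, vhost: str) -> Permissions:
        return self._table[(user, vhost)]


def merge(a: Permissions, b: Permissions) -> Permissions:
    """Combine two triples so each pattern accepts what either one accepted."""
    conf, write, read = (f"(?:{x})|(?:{y})" for x, y in zip(a, b))
    return (conf, write, read)


# Hypothetical publisher-only triple for resultsdb (an assumption, not from ansible).
PUBLISHER: Permissions = (r"^$", r"^amq\.topic$", r"^$")
# Listener triple from the manual fix in this ticket, copied as pasted.
LISTENER: Permissions = (
    r"^$",
    r"^(amq\.topic)|(resultsdb_ci_listener.*)$",
    r" ^(zmq\.topic)|^(amq\.topic)|(resultsdb_ci_listener.*)$",
)

table = PermissionTable()
table.set_permissions("resultsdb", "/pubsub", PUBLISHER)
table.set_permissions("resultsdb", "/pubsub", LISTENER)
# Only the triple from the second call is left.
print(table.get("resultsdb", "/pubsub") == LISTENER)

# A merged triple keeps both services working under the shared user.
combined = merge(PUBLISHER, LISTENER)
print(re.search(combined[2], "resultsdb_ci_listener") is not None)
```

If that model is right, two services sharing the `resultsdb` user need one combined triple, or separate users, since whichever command runs last wins.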

Actually, I still see the problem in the staging environment (we still receive alerts).

I fixed staging also this weekend.

We need to see why these perms aren't getting set right by ansible.

This is still a problem in staging. The pod is currently in CrashLoopBackOff state.

Not sure what the magic behind generating the key/crt for messaging is, but maybe this helps: https://pagure.io/fedora-infra/ansible/pull-request/1493

Metadata Update from @kevin:
- Issue assigned to kevin

2 years ago

I think this is solved now.

Please re-open if there's anything further to do.

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

2 years ago
