#8846 FedoraMessaging is missing messages
Closed: Fixed a year ago by pingou. Opened 2 years ago by bgoncalv.

We recently moved the triggers of our pipeline to listen on FedoraMessaging instead of FedMsg, but it looks like some messages are missing, as reported on https://pagure.io/fedora-ci/general/issue/107

An example of a message that did not trigger a job seems to be https://apps.fedoraproject.org/datagrepper/id?id=2020-5ae30007-38af-43a4-ba70-1f1c0ae1b49b&is_raw=true&size=extra-large

I couldn't find it in our pipeline job: https://jenkins-continuous-infra.apps.ci.centos.org/view/Fedora%20All%20Packages%20Pipeline/job/fedora-pr-comment-trigger/


Could I have some more context please?

  • Who is sending which messages, and on which network?
  • Who is subscribed to which messages, and on which networks?
  • Is it on staging or on production?
  • What does your fedora-messaging configuration file look like?

Thanks.

Could I have some more context please?

Who is sending which messages, and on which network?

In this case we are listening on the org.fedoraproject.prod.pagure.pull-request.comment.added topic, which is sent by pagure.

Who is subscribed to which messages, and on which networks?

Our Jenkins is https://jenkins-continuous-infra.apps.ci.centos.org

topic name: org.fedoraproject.prod.pagure.pull-request.comment.added
queue: ''
Message checks:
$.pullrequest.project.namespace => rpms|tests
$.pullrequest.status => Open

Timeout 60

Is it on staging or on production?

production

What does your fedora-messaging configuration file look like?

Hostname: rabbitmq.fedoraproject.org
Port number: 5671
Virtual Host: /pubsub
Topic: org.centos.ci
Exchange: amq.topic
Queue: centos-ci

Thanks.

OK, I see that the org.centos.ci queue is indeed subscribed to the org.fedoraproject.prod.pagure.pull-request.comment.added topic, so the queue should get all messages published on that topic. The queue is currently empty so there is no backlog. There's a spike of traffic around 09:20 UTC, maybe there was something blocked then?

I also see 8 consumers for that queue, is it normal? When a queue has multiple consumers, messages are distributed in a round robin fashion, could it be that one of those consumers has an issue or does not know how to handle this topic?
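To illustrate how round-robin delivery could explain the missing messages, here is a minimal simulation in plain Python. It does not talk to RabbitMQ; the consumer names and message labels are made up for the example, and the assumption that only one consumer handles the topic is hypothetical.

```python
from itertools import cycle


def distribute_round_robin(messages, consumers):
    """Hand each message to exactly one consumer, in rotation."""
    delivered = {name: [] for name in consumers}
    for message, name in zip(messages, cycle(consumers)):
        delivered[name].append(message)
    return delivered


# Hypothetical names: eight trigger jobs consuming from the same queue.
consumers = [f"trigger-{i}" for i in range(1, 9)]
messages = [f"comment-{i}" for i in range(1, 17)]

delivered = distribute_round_robin(messages, consumers)

# Assumption for the example: only trigger-1 knows how to handle this topic,
# so everything delivered to the other seven consumers is effectively lost.
handled = delivered["trigger-1"]
print(handled)  # ['comment-1', 'comment-9']
```

With 16 messages and 8 consumers, the one consumer that cares about the topic only ever sees 2 of them.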

I also see 8 consumers for that queue, is it normal? When a queue has multiple consumers, messages are distributed in a round robin fashion, could it be that one of those consumers has an issue or does not know how to handle this topic?

This could be the problem; it was supposed to have just 1 consumer. How can we change the queue? Do we need to set the queue parameter on the trigger? If so, what format does it expect?

Hmm I don't know much about the Java client that Jenkins uses, maybe @zlopez would know better?

@bgoncalv You should have the option to override the queue for each consumer using overrides.

@abompard it seems we need to set the queue for each trigger we have (https://github.com/CentOS-PaaS-SIG/upstream-fedora-pipeline/blob/master/JenkinsfilePRCommentTrigger#L31). Do you know what queue values we can use?

I would suggest something starting with centos-ci-, and it needs to be declared and deployed from Ansible (in roles/rabbitmq_cluster/tasks/apps.yml).

So we do have 8 trigger jobs on FedoraMessaging prod. Does that mean we need to request 8 new queues, plus 3 more queues for our 3 triggers on stage FedoraMessaging?

Metadata Update from @cverna:
- Issue priority set to: Waiting on Assignee (was: Needs Review)

2 years ago

@abompard I assigned this ticket to you; that helps us track what is in progress versus what is not.

Feel free to remove yourself if you are not working on this any more :smile:

Metadata Update from @cverna:
- Issue assigned to abompard

2 years ago

@bgoncalv I'm not sure what you're trying to do and how the jenkins plugin works. What we usually do is have one queue subscribed to multiple topics, and the consumer of that queue does a big if on the incoming message's topic to dispatch to the proper handler function.

It would also work to have multiple queues subscribed to one topic each, and each consumer of these queues does only one thing.
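The first approach (one queue bound to several topics, with the consumer dispatching on the topic) could be sketched roughly as follows. This is plain Python and not the Jenkins plugin's code; the handler names, the return values, and the second topic are hypothetical, and a dictionary lookup stands in for the "big if". The namespace and status checks mirror the message checks listed earlier in this ticket.

```python
PR_COMMENT_TOPIC = "org.fedoraproject.prod.pagure.pull-request.comment.added"
OTHER_TOPIC = "org.example.some.other.topic"  # hypothetical second topic


def handle_pr_comment(body):
    """Trigger only for open pull requests in the rpms or tests namespaces."""
    pull_request = body["pullrequest"]
    if pull_request["project"]["namespace"] not in ("rpms", "tests"):
        return "ignored"
    if pull_request["status"] != "Open":
        return "ignored"
    return "trigger"


def handle_other(body):
    """Placeholder handler for the hypothetical second topic."""
    return "other"


# One entry per topic the shared queue is bound to.
HANDLERS = {
    PR_COMMENT_TOPIC: handle_pr_comment,
    OTHER_TOPIC: handle_other,
}


def dispatch(topic, body):
    """Route a message from the shared queue to the handler for its topic."""
    handler = HANDLERS.get(topic)
    if handler is None:
        return "unhandled"
    return handler(body)


message = {"pullrequest": {"project": {"namespace": "rpms"}, "status": "Open"}}
print(dispatch(PR_COMMENT_TOPIC, message))  # trigger
```

The second approach trades this dispatch table for one queue per handler, so each consumer only ever sees the topic it cares about.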

@abompard thanks, having 1 queue for each trigger job would make our maintenance more complicated. @zlopez as we have multiple triggers in our Jenkins, do we need a different queue for each of them for the plugin to work well?

1 topic (producer) can send to many queues (consumers).
All queues (consumers) must receive message from producer.

In AMQ world we can create multiple queues by changing $CLIENT_NAME for each queue:
Consumer.$CLIENT_NAME.$CONSUMER_NAME.VirtualTopic.$HIERARCHY
We use the same credential (cert/key) when connecting to AMQ from multiple places.

Now in rabbitmq:
We have 1 credential (cert/key) with the name: org.centos.ci.
This 1 credential we want to use on different VMs/containers/scripts at the same time.

Is this possible? If yes, how? Any link to document/example would be useful.
Thank you.

Question: how can we do in the RabbitMQ world what we do in the AMQ world?

After digging into the source code of the Jenkins jms-messaging-plugin, it seems there are two places to declare queue names:

  • in global configuration, single queue for the whole provider
    • this is where we have the default centos-ci queue configured now
  • in individual jobs, via queue override option

If queue name is empty in both places, the plugin will attempt to create a new queue with a random name in the broker, and bind it to the exchange configured in the global configuration (for that specific provider).

The gotcha part is that if specific queue names are declared in either global config, or as a queue override in jobs, then the plugin assumes that the queue already exists, and it won't attempt to create it.

And of course, if multiple jobs rely on the globally configured queue name, the plugin does the round robin dispatching as @abompard correctly pointed out in his comment above.
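The behaviour described above can be summarised in a small model. This is not the plugin's code, only a sketch of the rules as described in this comment; the function name, the return values, and the example queue name are made up, and it assumes that a per-job override takes precedence over the global setting. The random name is generated locally here as a stand-in for the name the broker would pick.

```python
import uuid


def resolve_queue(global_queue, job_override, existing_queues):
    """Return (queue name, outcome) following the rules described above."""
    # Assumption: the job-level override wins over the global configuration.
    name = job_override or global_queue
    if not name:
        # Empty in both places: a new queue with a random name is requested.
        return f"random-{uuid.uuid4().hex[:8]}", "created"
    if name not in existing_queues:
        # A named queue is assumed to exist already; it is never created.
        return name, "missing"
    return name, "reused"


existing = {"centos-ci"}

# Every job without an override lands on the same global queue (round robin).
print(resolve_queue("centos-ci", "", existing))  # ('centos-ci', 'reused')

# An override naming a queue that was never declared on the broker.
print(resolve_queue("centos-ci", "centos-ci-example", existing))
# ('centos-ci-example', 'missing')
```

The "created" branch only helps if the broker lets the client declare queues, which is what option 1 below depends on.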

So I see 2 options here:

  1. giving us permissions to create queues with random names (technically, this means letting the broker generate queues with random names for us, which is how it should behave if one asks for a queue with an empty name). Maybe this is already possible (?), we just haven't tried

  2. we will give you a list of queues that we need, you'd create them for us, and the certificate given to us would have permissions to access all those queues (so we can have just a single provider configured in Jenkins)

I guess that from infra perspective, the second option would be preferable?

Yes, I think 2 would be better for us. ;)

@abompard any other input?

Another friendly ping =)

Let's go with 2. Can you attach the list of queues? Or submit a PR against https://pagure.io/fedora-infra/ansible and roles/rabbitmq_cluster/tasks/apps.yml if you prefer.

Ah sorry I missed that thread. Yeah clients are not allowed to create queues in our setup, so option 2 is better.

I don't know how AMQ works, but the AMQP world seems to have more entities:
- producers send messages to an exchange with a topic (called routing_key in AMQP)
- queues can subscribe to topics (wildcards possible) by binding to an exchange (the topic / routing_key is a parameter of that binding). Queues can subscribe to multiple topics, and topics can be subscribed to by multiple queues.
- consumers get messages from queues over the network. If multiple consumers are consuming from the same queue, they get messages in a round-robin fashion. If you want to have multiple consumers getting the same messages, you have to use multiple queues.
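The three entities above can be modelled with a toy in-memory topic exchange. This is a sketch in plain Python, not a RabbitMQ client; the class and the queue names are hypothetical. The matcher follows the AMQP topic rules, where `*` matches exactly one word and `#` matches zero or more words.

```python
def topic_matches(pattern, routing_key):
    """AMQP-style topic match: '*' is one word, '#' is zero or more words."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # Try letting '#' absorb 0, 1, 2, ... words of the key.
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return p[0] in ("*", k[0]) and match(p[1:], k[1:])
    return match(pattern.split("."), routing_key.split("."))


class TopicExchange:
    """Toy exchange: copies each message to every queue with a matching binding."""

    def __init__(self):
        self.bindings = []  # list of (pattern, queue name)
        self.queues = {}    # queue name -> list of (routing_key, body)

    def bind(self, queue, pattern):
        self.queues.setdefault(queue, [])
        self.bindings.append((pattern, queue))

    def publish(self, routing_key, body):
        # A set, so a queue bound through two matching patterns gets one copy.
        targets = {q for pattern, q in self.bindings
                   if topic_matches(pattern, routing_key)}
        for queue in targets:
            self.queues[queue].append((routing_key, body))


exchange = TopicExchange()
topic = "org.fedoraproject.prod.pagure.pull-request.comment.added"
# Hypothetical queue names, one per consumer that must see every message.
exchange.bind("centos-ci-pr-comment", topic)
exchange.bind("centos-ci-all-pagure", "org.fedoraproject.prod.pagure.#")

exchange.publish(topic, {"id": 1})
print(len(exchange.queues["centos-ci-pr-comment"]))  # 1
print(len(exchange.queues["centos-ci-all-pagure"]))  # 1
```

Both queues receive their own copy of the message, which is what several independent consumers need; sharing one queue would split the messages between them instead.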

I hope that's clear, the RabbitMQ docs are pretty good I think, and feel free to ask me if you have questions on what you are specifically trying to accomplish.

Metadata Update from @cverna:
- Assignee reset

2 years ago

Metadata Update from @pingou:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

a year ago
