#8908 sync of ansible.git from pagure.io to batcave doesn't always work
Closed: Fixed 3 years ago by pingou. Opened 3 years ago by praiskup.


Hm this is "amusing":

on batcave01:

$ journalctl -lru mirror_pagure_ansible --since=2020-05-12 |grep /srv/web/infra/ansible
May 12 04:44:53 batcave01.phx2.fedoraproject.org
May 12 04:19:58 batcave01.phx2.fedoraproject.org
May 12 04:04:01 batcave01.phx2.fedoraproject.org
May 12 00:00:36 batcave01.phx2.fedoraproject.org

on batcave13:

# journalctl -lru mirror_pagure_ansible --since=2020-05-12 |grep /srv/web/infra/ansible
May 12 04:44:49 batcave13.rdu2.fedoraproject.org
May 12 04:32:31 batcave13.rdu2.fedoraproject.org
May 12 04:19:54 batcave13.rdu2.fedoraproject.org
May 12 04:14:14 batcave13.rdu2.fedoraproject.org
May 12 04:03:59 batcave13.rdu2.fedoraproject.org
May 12 00:00:35 batcave13.rdu2.fedoraproject.org

So batcave01 "missed" 2 messages that batcave13 saw. I'm wondering if there is something up with the queue that batcave01 uses since it's the one that was used when setting things up.
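To make the comparison above reproducible, here is a minimal Python sketch that pairs up the two journal listings and reports the entries batcave13 logged with no batcave01 entry close by. The 10-second tolerance and the helper names (`parse`, `missed`) are assumptions made for illustration; only the timestamps come from the output above.

```python
from datetime import datetime, timedelta

def parse(stamps):
    """Turn HH:MM:SS strings into datetime objects (the date part is irrelevant here)."""
    return [datetime.strptime(s, "%H:%M:%S") for s in stamps]

def missed(reference, candidate, tolerance=timedelta(seconds=10)):
    """Return the reference times that have no candidate time within the tolerance."""
    return [r for r in reference
            if not any(abs(r - c) <= tolerance for c in candidate)]

# Timestamps copied from the journalctl output on each host.
batcave01 = parse(["04:44:53", "04:19:58", "04:04:01", "00:00:36"])
batcave13 = parse(["04:44:49", "04:32:31", "04:19:54",
                   "04:14:14", "04:03:59", "00:00:35"])

# Entries seen on batcave13 that have no counterpart on batcave01.
missing = [t.strftime("%H:%M:%S") for t in missed(batcave13, batcave01)]
print(missing)
```

With the timestamps above, this reports the 04:32:31 and 04:14:14 runs as the two that batcave01 did not see.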

Some more debugging info:

on rabbitmq01:

# rabbitmqctl list_consumers -p /pubsub |grep mirror
mirror_pagure_ansible_13    <rabbit@rabbitmq02.phx2.fedoraproject.org.2.28264.3077> 6c08520f-8634-4154-9871-55aa1b228054    true    0   []
mirror_pagure_ansible   <rabbit@rabbitmq03.phx2.fedoraproject.org.2.27068.5779> c29fbadb-8ee9-498b-84da-d92de20785ef    true    0   []
mirror_pagure_ansible   <rabbit@rabbitmq03.phx2.fedoraproject.org.2.22890.6304> 139c705a-fe5c-4004-96fb-137485f98d77    true    0   []

So it looks like we have two consumers on the queue that batcave01 uses (in which case messages are sent to the consumers in a round-robin fashion).
This explains the behaviour we're seeing but I'm confused as to how it got this way :(
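As a rough illustration of why a second consumer on the same queue makes each consumer miss messages, the following Python sketch models round-robin dispatch. It is a deliberately simplified model, not RabbitMQ itself: real delivery also depends on prefetch settings, acknowledgements, and when each consumer connected. The message and consumer names are hypothetical.

```python
from itertools import cycle

def round_robin(messages, consumers):
    """Hand each message to exactly one consumer, rotating through the consumers."""
    delivered = {name: [] for name in consumers}
    for message, name in zip(messages, cycle(consumers)):
        delivered[name].append(message)
    return delivered

# Six hypothetical push notifications.
messages = [f"push-{n}" for n in range(1, 7)]

# One queue shared by two consumers: each one sees only part of the traffic.
shared = round_robin(messages, ["consumer_a", "consumer_b"])

# One queue per consumer: the single consumer on that queue sees everything.
dedicated = round_robin(messages, ["consumer_a"])

print(shared)
print(dedicated)
```

In this model no single consumer on the shared queue receives every message, which matches the symptom of one host skipping some sync runs.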

@abompard and I have scheduled some time to dive a little more into this, as currently I don't understand what's going on.

Metadata Update from @pingou:
- Issue assigned to pingou
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: rabbitmq

3 years ago

ahhhh. I bet I know. :)

I stood up a batcave01.iad2.fedoraproject.org. I wonder if this one connected to the same queue as batcave01.phx2?

\ó/

So I've adjusted ansible so that the 3 batcaves have 3 different queues, and this is the current outcome:

# rabbitmqctl list_consumers -p /pubsub |grep mirror
mirror_pagure_ansible_13    <rabbit@rabbitmq02.phx2.fedoraproject.org.2.28264.3077> 6c08520f-8634-4154-9871-55aa1b228054    true    0   []
mirror_pagure_ansible   <rabbit@rabbitmq03.phx2.fedoraproject.org.2.7017.6347>  bc047838-138d-41e1-87d7-9fa0ccd2981f    true    0   []
mirror_pagure_ansible_iad2  <rabbit@rabbitmq02.phx2.fedoraproject.org.2.5191.3578>  67db4678-8e91-4bdb-81f2-6cce227135f1    true    0   []

So for me, this should be fixed.
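For anyone wanting to re-check this later, here is a small Python sketch that counts consumers per queue in pasted `rabbitmqctl list_consumers` output and flags any queue with more than one. It assumes the queue name is the first whitespace-separated field, as in the listings in this ticket; the function names are made up for the example.

```python
from collections import Counter

def consumers_per_queue(listing):
    """Count consumers per queue, taking the first field of each line as the queue name."""
    counts = Counter()
    for line in listing.splitlines():
        fields = line.split()
        if fields:
            counts[fields[0]] += 1
    return counts

def shared_queues(listing):
    """Return the names of queues that have more than one consumer attached."""
    return sorted(q for q, n in consumers_per_queue(listing).items() if n > 1)

# Listing from before the fix, copied from this ticket.
before = """\
mirror_pagure_ansible_13    <rabbit@rabbitmq02.phx2.fedoraproject.org.2.28264.3077> 6c08520f-8634-4154-9871-55aa1b228054    true    0   []
mirror_pagure_ansible   <rabbit@rabbitmq03.phx2.fedoraproject.org.2.27068.5779> c29fbadb-8ee9-498b-84da-d92de20785ef    true    0   []
mirror_pagure_ansible   <rabbit@rabbitmq03.phx2.fedoraproject.org.2.22890.6304> 139c705a-fe5c-4004-96fb-137485f98d77    true    0   []
"""

# Listing from after the fix, copied from this ticket.
after = """\
mirror_pagure_ansible_13    <rabbit@rabbitmq02.phx2.fedoraproject.org.2.28264.3077> 6c08520f-8634-4154-9871-55aa1b228054    true    0   []
mirror_pagure_ansible   <rabbit@rabbitmq03.phx2.fedoraproject.org.2.7017.6347>  bc047838-138d-41e1-87d7-9fa0ccd2981f    true    0   []
mirror_pagure_ansible_iad2  <rabbit@rabbitmq02.phx2.fedoraproject.org.2.5191.3578>  67db4678-8e91-4bdb-81f2-6cce227135f1    true    0   []
"""

print(shared_queues(before))
print(shared_queues(after))
```

The first listing flags `mirror_pagure_ansible` as shared, while the second flags nothing, consistent with each batcave now having its own queue.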

From what I can see, the last two commits pushed to the ansible repo made it onto all three machines.

Let's close this and re-open if needed.

Thanks for raising this @praiskup !

Metadata Update from @pingou:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago
