I'm trying to run Bodhi's openshift playbook, and it fails with this error:
TASK [rabbit/user : Create the user in RabbitMQ] ********************************************************
Tuesday 21 May 2019 14:39:37 +0000 (0:00:00.064) 0:00:00.164 ***********
fatal: [os-master01.stg.phx2.fedoraproject.org]: FAILED! => {"changed": false, "cmd": "/usr/sbin/rabbitmqctl -q -n rabbit list_users", "msg": "Error:********@rabbitmq01.stg.phx2.fedoraproject.org'\n- home dir: /var/lib/rabbitmq\n- cookie hash: FxutaR0KvbmZYeAyJEBcKA==", "rc": 69, "stderr": "Error: unable to connect to node 'rabbit@rabbitmq01.stg.phx2.fedoraproject.org': nodedown\n\nDIAGNOSTICS\n===========\n\nattempted to contact: ['rabbit@rabbitmq01.stg.phx2.fedoraproject.org']\n\nrabbit@rabbitmq01.stg.phx2.fedoraproject.org:\n * connected to epmd (port 4369) on rabbitmq01.stg.phx2.fedoraproject.org\n * epmd reports: node 'rabbit' not running at all\n no other nodes on rabbitmq01.stg.phx2.fedoraproject.org\n * suggestion: start the node\n\ncurrent node details:\n- node name: 'rabbitmq-cli-36@rabbitmq01.stg.phx2.fedoraproject.org'\n- home dir: /var/lib/rabbitmq\n- cookie hash: FxutaR0KvbmZYeAyJEBcKA==\n\n", "stderr_lines": ["Error: unable to connect to node 'rabbit@rabbitmq01.stg.phx2.fedoraproject.org': nodedown", "", "DIAGNOSTICS", "===========", "", "attempted to contact: ['rabbit@rabbitmq01.stg.phx2.fedoraproject.org']", "", "rabbit@rabbitmq01.stg.phx2.fedoraproject.org:", " * connected to epmd (port 4369) on rabbitmq01.stg.phx2.fedoraproject.org", " * epmd reports: node 'rabbit' not running at all", " no other nodes on rabbitmq01.stg.phx2.fedoraproject.org", " * suggestion: start the node", "", "current node details:", "- node name: 'rabbitmq-cli-36@rabbitmq01.stg.phx2.fedoraproject.org'", "- home dir: /var/lib/rabbitmq", "- cookie hash: FxutaR0KvbmZYeAyJEBcKA==", ""], "stdout": "", "stdout_lines": []}
It is using the wrong host, rabbit@rabbitmq01.stg.phx2.fedoraproject.org. It should be using rabbitmq.stg.fedoraproject.org.
Other apps have been adjusted: https://infrastructure.fedoraproject.org/cgit/ansible.git/commit/?id=1a09cff25c3afbabf9b4d2b7856827b5f334c1b1
This is the rabbitmq role that's failing; Bodhi's config file is not involved.
Should be fixed. It looks like it might have been an ordering issue in our reboots: 01 came up and didn't find the others? Will have to investigate...
[root@rabbitmq01 log][STG]# rabbitmqctl list_queues --online
Listing queues
test 0
Should there be more queues?
You probably want:
rabbitmqctl list_queues -p /pubsub --online
Listing queues
amqp_to_zmq 0
faf 0
the-new-hotness.stg 0
bodhi.stg_composer 0
bodhi.stg 2
federation: zmq.topic -> rabbit@rabbitmq01.stg.phx2.fedoraproject.org:/public_pubsub:zmq.topic 0
greenwave.stg 155175
amqp_bridge_verify_missing 0
federation: amq.topic -> rabbit@rabbitmq01.stg.phx2.fedoraproject.org:/public_pubsub:amq.topic 0
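For anyone checking this kind of listing after a reboot, here is a rough Python sketch that parses `rabbitmqctl list_queues` output and flags queues with a large backlog. The helper names and the threshold of 1000 messages are hypothetical choices for illustration and not part of any Fedora playbook. The sample text is the output pasted above.

```python
def parse_list_queues(output):
    """Parse `rabbitmqctl list_queues` text into (name, message_count) pairs."""
    queues = []
    for line in output.splitlines():
        # Queue names may contain spaces (e.g. federation queues), so split
        # only once, on the last run of whitespace.
        parts = line.strip().rsplit(None, 1)
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip blank lines and headers such as "Listing queues"
        queues.append((parts[0], int(parts[1])))
    return queues


def backlogged(queues, threshold=1000):
    """Return queues holding more than `threshold` messages (hypothetical cutoff)."""
    return [(name, count) for name, count in queues if count > threshold]


# Sample copied from the listing in this ticket.
SAMPLE = """Listing queues
amqp_to_zmq 0
faf 0
the-new-hotness.stg 0
bodhi.stg_composer 0
bodhi.stg 2
federation: zmq.topic -> rabbit@rabbitmq01.stg.phx2.fedoraproject.org:/public_pubsub:zmq.topic 0
greenwave.stg 155175
amqp_bridge_verify_missing 0
federation: amq.topic -> rabbit@rabbitmq01.stg.phx2.fedoraproject.org:/public_pubsub:amq.topic 0
"""

if __name__ == "__main__":
    for name, count in backlogged(parse_list_queues(SAMPLE)):
        print(f"{name}: {count} messages waiting")
```

On the sample above this flags only greenwave.stg. Whether a backlog of that size is actually a problem depends on the consumer, so treat the cutoff as a starting point.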
Yeah seems to be back now - thanks!
I landed a fix in our reboot playbook to restart these after reboots are done and the hosts are back.
Metadata Update from @kevin: - Issue close_status updated to: Fixed - Issue status updated to: Closed (was: Open)