#7940 Request for ability to publish messages for CoreOS builds via Fedora Messaging
Closed: Fixed 4 years ago by kevin. Opened 4 years ago by dustymabe.

In the Fedora CoreOS working group we'd like to tie in to Fedora infra/releng processes to get artifacts signed and also uploaded to certain locations. In order to do this we'd like to leverage the same messaging bus that most Fedora apps/processes use in order to pass information around.

In order to publish these messages we need TLS certs so we have the proper authentication. Our build processes are currently running in CentOS CI so we'd probably need the files to be passed to us directly rather than being created in the ansible private repo.

We don't quite have the exact details of the messages we'd be sending just yet, but they'd need to contain information such that we could get ostree commits and image artifacts signed (see https://pagure.io/fedora-infrastructure/issue/7884).

A proposal for these messages could be something like:

  • org.fedoraproject.prod.coreos.build.stage.start

    • shares information about the COSA build moving through the pipeline
    • similar to pungi-compose-phase-start
    • should deliver the information in the meta.json from a fedora coreos pipeline run
    • can be used to trigger ostree commit signing
    • can be used to trigger artifact signing
    • can be used to trigger transfer ostree to unified repo
  • org.fedoraproject.prod.coreos.build.status.change

    • shares information about status of the current coreos COSA build in jenkins (i.e. whether it failed or not)
    • similar to pungi-compose-status-change
    • should deliver the information in the meta.json from a fedora coreos pipeline run

where the body of each message has information about the current state of the build process and the relevant information for signing could be gleaned out. These are just a proposal right now. Some more discussion in https://github.com/coreos/fedora-coreos-tracker/issues/198#issuecomment-505534354
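To make the proposal above a bit more concrete, here is a minimal Python sketch of how a pipeline stage might assemble such a message from meta.json. The helper names (`build_stage_message`, `publish`) and the body keys (`stage`, `meta`) are hypothetical, since the schema was still under discussion at this point. Only `fedora_messaging.api.Message` and `fedora_messaging.api.publish` come from the real library.

```python
import json

# Prefix taken from the proposed topic names in this ticket.
TOPIC_PREFIX = "org.fedoraproject.prod"


def build_stage_message(meta_json_text, stage):
    """Return (topic, body) for a hypothetical build.stage.start message."""
    meta = json.loads(meta_json_text)
    topic = f"{TOPIC_PREFIX}.coreos.build.stage.start"
    # "stage" and "meta" are illustrative keys, not an agreed schema.
    body = {"stage": stage, "meta": meta}
    return topic, body


def publish(topic, body):
    """Publish via fedora-messaging (needs the library and valid TLS certs)."""
    # Imported lazily so the rest of the sketch loads without the library.
    from fedora_messaging import api

    # Note: if topic_prefix is set in the fedora-messaging config, the
    # prefix is added for you and should be left out of `topic` here.
    api.publish(api.Message(topic=topic, body=body))
```

A signing consumer could then read `body["stage"]` to decide whether the message is one it cares about.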


@puiterwijk do you know if robosig can inspect information out of a message and ignore it (i.e. we use org.fedoraproject.prod.coreos.build.stage.start messages and robosig ignores if the stage is not the one it cares about) or if it requires a specific topic (i.e. it can't ignore messages in a topic)?

I believe CentOS CI already has a TLS cert for fedora-messaging.

@bstinson @siddharthvipul1 do you need another one? From what I remember, CentOS CI is publishing to a central bridge that is the one signing and sending to fedora-messaging/fedmsg. In which case the existing cert is sufficient, no?

@puiterwijk do you know if robosig can inspect information out of a message and ignore it (i.e. we use org.fedoraproject.prod.coreos.build.stage.start messages and robosig ignores if the stage is not the one it cares about) or if it requires a specific topic (i.e. it can't ignore messages in a topic)?

I don't know the internals of robosig, but as a general rule for message buses, the more precise the topic, the easier it is on the consumer, because filtering can then happen at the broker level rather than in the client (think of it as filtering in a SQL query vs. doing a select * and filtering in code).
Otherwise, the consumer will get all the messages on that topic and will need to check the content of each message to determine whether to act on it or ignore it.
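The difference between the two approaches can be sketched in Python. `topic_matches` mimics how an AMQP topic exchange compares a binding key against a routing key (`*` matches exactly one word, `#` matches zero or more), while `client_side_filter` is the check a consumer has to do itself when the topic is generic. Both function names and the `stage` body key are assumptions for illustration, not part of robosig or fedora-messaging.

```python
def topic_matches(binding_key, routing_key):
    """AMQP topic matching: '*' is exactly one word, '#' is zero or more."""

    def match(pattern, words):
        if not pattern:
            return not words
        head, rest = pattern[0], pattern[1:]
        if head == "#":
            # '#' may swallow zero or more words.
            return any(match(rest, words[i:]) for i in range(len(words) + 1))
        if not words:
            return False
        return (head == "*" or head == words[0]) and match(rest, words[1:])

    return match(binding_key.split("."), routing_key.split("."))


def client_side_filter(message_body, wanted_stage):
    """Fallback check done in the consumer when the topic is too generic."""
    # "stage" is a hypothetical body key, not an agreed schema.
    return message_body.get("stage") == wanted_stage
```

With a precise topic, the broker only delivers what the binding matches. With a shared topic, every consumer receives every message and runs something like `client_side_filter` on each one.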

@pingou, no this is not sufficient. They need their own cert

Thanks @bstinson.

He also clarified in IRC that those certs are for fedora CI pipelines specifically and not for infra/release related things.

Who can set us up a cert for coreos messages to be sent to fedora-messaging?

Metadata Update from @dustymabe:
- Issue priority set to: Next Meeting (was: Needs Review)

4 years ago

Who can set us up a cert for coreos messages to be sent to fedora-messaging?

The person who will take on this ticket :)

I can generate this by tomorrow close of business.

Metadata Update from @kevin:
- Issue assigned to kevin
- Issue priority set to: Waiting on Assignee (was: Next Meeting)

4 years ago

ok. I have created two certs. One for staging and one for production. I called them 'coreos' which I hope is ok.

I have placed them under batcave01:~dustymabe/coreos-fedora-messaging-certs/

Let us know if there's anything further you need or I forgot to issue.

:guardsman:

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

4 years ago

So, trying to use fedora-messaging --consume using the coreos stg creds, I get:

[ERROR fedora_messaging.cli] The TCP connection appears to have started, but the TLS or AMQP handshake with the broker failed; check your connection and authentication parameters and ensure your user has permission to access the vhost

Config:

amqp_url = "amqps://fedora:@rabbitmq.stg.fedoraproject.org/%2Fpublic_pubsub"
callback = "fedora_messaging.example:printer"

[tls]
ca_cert = "/etc/fedora-messaging/stg-cacert.pem"
keyfile = "coreos.stg.key"
certfile = "coreos.stg.crt"

Using the public creds works fine though:

[tls]
ca_cert = "/etc/fedora-messaging/stg-cacert.pem"
keyfile = "/etc/fedora-messaging/fedora.stg-key.pem"
certfile = "/etc/fedora-messaging/fedora.stg-cert.pem"

-->

[INFO fedora_messaging.twisted.protocol] Successfully registered AMQP consumer Consumer(queue=1ff9c37f-21af-4b9c-a2f7-91002f4e945f, callback=<function printer at 0x7ff4a18aad90>)

@jlebon The amqp_url needs to include the username (the value of the CN in the certificate). So if your username is "coreos" you need the URL to be:

"amqps://coreos:@rabbitmq.stg.fedoraproject.org/%2Fpublic_pubsub"

You can dump the cert with "openssl x509 -in coreos.stg.crt -text". That user also needs to exist in RabbitMQ so if it's not been created in the infrastructure ansible that also needs to happen.
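As a rough illustration of the point above, this Python sketch pulls the CN out of a `Subject:` line as printed by `openssl x509 -text` and builds the matching `amqp_url`. The helper names are hypothetical and the sketch uses only the standard library. It assumes the broker user is named exactly after the CN, as described in this comment.

```python
import re
from urllib.parse import quote


def cn_from_subject(subject_line):
    """Extract the CN from a line such as 'Subject: CN = coreos.stg'."""
    m = re.search(r"CN\s*=\s*([^,/\s]+)", subject_line)
    if m is None:
        raise ValueError("no CN found in subject line")
    return m.group(1)


def amqp_url(username, host, vhost):
    """Build an amqps URL with an empty password and an encoded vhost."""
    # The vhost's leading "/" must be percent-encoded as %2F in the URL path.
    return f"amqps://{quote(username, safe='')}:@{host}/{quote(vhost, safe='')}"
```

For example, `amqp_url(cn_from_subject("Subject: CN = coreos.stg"), "rabbitmq.stg.fedoraproject.org", "/public_pubsub")` produces a URL of the same shape as the one quoted above, with the CN as the user name.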

Hmm, so I see:

Subject: CN = coreos.stg

So I tried:

amqp_url = "amqps://coreos.stg:@rabbitmq.stg.fedoraproject.org/%2Fpublic_pubsub"
callback = "fedora_messaging.example:printer"

[tls]
ca_cert = "/etc/fedora-messaging/stg-cacert.pem"
keyfile = "coreos.stg.key"
certfile = "coreos.stg.crt"

But still got the same error (also tried coreos-stg, coreosstg, coreos_stg).

I also tried the prod credentials (which have CN = coreos):

amqp_url = "amqps://coreos:@rabbitmq.fedoraproject.org/%2Fpublic_pubsub"
callback = "fedora_messaging.example:printer"

[tls]
ca_cert = "/etc/fedora-messaging/cacert.pem"
keyfile = "coreos.key"
certfile = "coreos.crt"

And also got the same results.

(Also tried the restricted pubsub URL).

That user also needs to exist in RabbitMQ so if it's not been created in the infrastructure ansible that also needs to happen.

Ahh OK, how do we make this happen? I couldn't find a list of RabbitMQ users in the ansible repo. Or does someone with permissions need to run https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/rabbit/user/tasks/main.yml?

@jlebon I assume you're still having this issue? or did you figure out the issue?

That user also needs to exist in RabbitMQ so if it's not been created in the infrastructure ansible that also needs to happen.

Ahh OK, how do we make this happen? I couldn't find a list of RabbitMQ users in the ansible repo. Or does someone with permissions need to run https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/rabbit/user/tasks/main.yml?

Sorry for not responding, Pagure does this thing where it doesn't subscribe me to tickets I respond on or something.

Infrastructure people need to decide where they want to stash that task in the Ansible repo. I could probably add that in the RabbitMQ role or something short-term.

Since I didn't see the user getting created in ansible I added that. So the coreos.stg user exists on the staging broker and the coreos user exists in prod. However, looking at the role it only exists in the "/pubsub" vhost so connecting to the public one will get a permission denied.

Furthermore, users can't create queues or bindings in the private vhost so the user isn't much use without a queue and bindings getting created in ansible, too. If you want that I can add that, although I've been advocating for restrictions to get lifted, maybe we can get that sorted out at flock later this week.

Since I didn't see the user getting created in ansible I added that.

Awesome, thanks!

However, looking at the role it only exists in the "/pubsub" vhost so connecting to the public one will get a permission denied.

Gotcha, I think that should be fine.

Furthermore, users can't create queues or bindings in the private vhost so the user isn't much use without a queue and bindings getting created in ansible, too.

OK right, I think that's what I'm hitting now:

[ERROR fedora_messaging.cli] Unable to declare the queue object on the AMQP broker. The broker responded with (403, "ACCESS_REFUSED - access to queue '1ff9c37f-21af-4b9c-a2f7-91002f4e945f' in vhost '/pubsub' refused for user 'coreos'"). Check permissions for your user.

If you want that I can add that, although I've been advocating for restrictions to get lifted, maybe we can get that sorted out at flock later this week.

Has this been discussed during Flock? I'm good with whatever is decided, as long as we can send messages. :)

it might be nice for these messages to be standardized under CI messages spec. That's primarily for test system messages, but it does seem that a start has been made on standardized build messages too (see e.g. the product-build.build.* messages), since about five months back.

Ugh, Pagure apparently just ignored the email response I sent.

I asked a bit at flock and people seemed open to it, but unfortunately I
didn't get a chance to sit down and actually just push the change. When
I get back from traveling/PTO on Thursday I can send a patch to change
it and see if there are any last-minute objections, or you can bug
@abompard to take care of it before that if he's available.

it might be nice for these messages to be standardized under CI messages spec.

Ahh, thanks for the link. So, the final topics and messages we agreed on upstream are in https://github.com/coreos/fedora-coreos-tracker/issues/198#issuecomment-513944390. I think we could change the org.fedoraproject.prod.coreos.build.status.change messages to the product-build.build.* standard, though the downside is that they would then differ in topic and body from the other CoreOS-related request messages. I compare our pipeline CI status messages more to Pungi's compose.status.change (or are you planning to move Pungi to product-build.build.* as well?).

I asked a bit at flock and people seemed open to it, but unfortunately I
didn't get a chance to sit down and actually just push the change. When
I get back from traveling/PTO on Thursday I can send a patch to change
it and see if there are any last-minute objections

Heh, no worries. Just got back from PTO myself. Did you get a chance to get this sorted out yet?

@jlebon not sure if anyone's looked at changing Pungi, @mvadkert or @ralph may know. It's trickier with systems that have pre-existing messages because you don't want to break existing consumers, of course, but for new systems it may be worth following a standard.

Our guidance was that newly created systems (or systems being reimplemented) in the release pipeline should ideally start with standard messages. It seems I came too late to the game for CoreOS, though :( As for existing systems already connected to the message bus, we are not planning to migrate them to a standard, as the benefit would be low.
