Issue #7940: Request for ability to publish messages for CoreOS builds via Fedora Messaging - fedora-infrastructure

Closed: Fixed 4 years ago by kevin. Opened 4 years ago by dustymabe.

In the Fedora CoreOS working group we'd like to tie in to Fedora infra/releng processes to get artifacts signed and also uploaded to certain locations. In order to do this we'd like to leverage the same messaging bus that most Fedora apps/processes use in order to pass information around.

In order to publish these messages we need TLS certs so we have the proper authentication. Our build processes are currently running in CentOS CI so we'd probably need the files to be passed to us directly rather than being created in the ansible private repo.

We don't quite have the exact details of the messages we'd be sending just yet, but they'd need to contain information such that we could get ostree commits and image artifacts signed (see https://pagure.io/fedora-infrastructure/issue/7884).

A proposal for these messages could be something like:

org.fedoraproject.prod.coreos.build.stage.start
- shares information about the COSA build moving through the pipeline
- similar to pungi-compose-phase-start
- should deliver the information in the meta.json from a fedora coreos pipeline run
- can be used to trigger ostree commit signing
- can be used to trigger artifact signing
- can be used to trigger transfer ostree to unified repo
org.fedoraproject.prod.coreos.build.status.change
- shares information about status of the current coreos COSA build in jenkins (i.e. whether it failed or not)
- similar to pungi-compose-status-change
- should deliver the information in the meta.json from a fedora coreos pipeline run

where the body of each message has information about the current state of the build process, from which the relevant information for signing could be gleaned. This is just a proposal for now. Some more discussion in https://github.com/coreos/fedora-coreos-tracker/issues/198#issuecomment-505534354
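As a rough illustration of the proposal above, the sketch below wraps meta.json-style content in a status-change message. The body layout, the example field values, and the helper names are hypothetical, since the message contents had not been settled at this point. `fedora_messaging.api.publish` and `fedora_messaging.message.Message` are the library's publishing entry points; the sketch assumes the `org.fedoraproject.prod` prefix is supplied by the `topic_prefix` setting in the client config rather than hard-coded.

```python
import json

# Hypothetical topic suffix; assumes the "org.fedoraproject.prod" prefix is
# supplied by the topic_prefix option in the fedora-messaging config.
STATUS_TOPIC = "coreos.build.status.change"


def build_status_body(meta: dict, status: str) -> dict:
    """Wrap meta.json content in a message body (layout is illustrative)."""
    body = {"status": status, "meta": meta}
    # Round-trip through JSON to fail early on values that cannot be serialized.
    return json.loads(json.dumps(body))


def publish_status(meta: dict, status: str) -> None:
    """Publish the body; needs fedora-messaging installed plus valid TLS certs."""
    # Imported lazily so build_status_body works without the library.
    from fedora_messaging import api, message

    msg = message.Message(topic=STATUS_TOPIC, body=build_status_body(meta, status))
    api.publish(msg)


if __name__ == "__main__":
    # Made-up meta.json excerpt, for illustration only.
    example_meta = {"build_id": "example-build", "stream": "testing"}
    print(build_status_body(example_meta, "FINISHED"))
```

Only `build_status_body` runs without a broker; `publish_status` would need the certs requested in this ticket and a user that exists on the broker.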

dustymabe commented 4 years ago

@puiterwijk do you know if robosig can inspect information out of a message and ignore it (i.e. we use org.fedoraproject.prod.coreos.build.stage.start messages and robosig ignores if the stage is not the one it cares about) or if it requires a specific topic (i.e. it can't ignore messages in a topic).

pingou commented 4 years ago

I believe CentOS CI already has a TLS cert for fedora-messaging.

@bstinson @siddharthvipul1 do you need another one? From what I remember, CentOS CI is publishing to a central bridge that is the one signing and sending to fedora-messaging/fedmsg. In which case the existing cert is sufficient, no?

pingou commented 4 years ago

@puiterwijk do you know if robosig can inspect information out of a message and ignore it (i.e. we use org.fedoraproject.prod.coreos.build.stage.start messages and robosig ignores if the stage is not the one it cares about) or if it requires a specific topic (i.e. it can't ignore messages in a topic).

I don't know the internals of robosig, but as a general rule with a message bus, the more precise the topic, the easier it is on the consumer, because the filtering can happen at the broker rather than in the client (think of it as filtering in a SQL query vs. doing a select * and filtering in the code).
Otherwise, the consumer will get all the messages on that topic and will need to check the content of each message to decide whether to act on it or ignore it.
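To make the broker-versus-client distinction concrete, here is a small sketch of AMQP topic-exchange matching, where `*` stands for exactly one dot-separated word and `#` for zero or more. The function names and the `stage` body field are hypothetical; they only illustrate the two options being discussed, not how robosig actually works.

```python
def topic_matches(binding_key: str, topic: str) -> bool:
    """AMQP topic matching: '*' is exactly one word, '#' is zero or more words."""
    pattern = binding_key.split(".")
    words = topic.split(".")

    def match(i: int, j: int) -> bool:
        if i == len(pattern):
            return j == len(words)
        if pattern[i] == "#":
            # Let '#' absorb zero, one, two, ... of the remaining words.
            return any(match(i + 1, k) for k in range(j, len(words) + 1))
        if j == len(words):
            return False
        return pattern[i] in ("*", words[j]) and match(i + 1, j + 1)

    return match(0, 0)


def consumer_cares(body: dict, wanted_stage: str) -> bool:
    """Client-side filter: the message was already delivered, now inspect it."""
    return body.get("stage") == wanted_stage


if __name__ == "__main__":
    topic = "org.fedoraproject.prod.coreos.build.stage.start"
    # Broker-side: a binding on the status topic never receives stage messages.
    print(topic_matches("org.fedoraproject.prod.coreos.build.status.*", topic))
    # Client-side: a broad binding receives everything and must check the body.
    print(topic_matches("org.fedoraproject.prod.coreos.build.#", topic))
    print(consumer_cares({"stage": "upload"}, "sign"))
```

With a single generic topic, every consumer ends up on the second path; with one topic per stage, the binding key alone does the filtering.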

bstinson commented 4 years ago

@pingou, no this is not sufficient. They need their own cert

dustymabe commented 4 years ago

Thanks @bstinson.

He also clarified in IRC that those certs are for fedora CI pipelines specifically and not for infra/release related things.

Who can set us up a cert for coreos messages to be sent to fedora-messaging?

Metadata Update from @dustymabe:
- Issue priority set to: Next Meeting (was: Needs Review)

4 years ago

pingou commented 4 years ago

Who can set us up a cert for coreos messages to be sent to fedora-messaging?

The person who will take on this ticket :)

kevin commented 4 years ago

I can generate this by tomorrow close of business.

Metadata Update from @kevin:
- Issue assigned to kevin
- Issue priority set to: Waiting on Assignee (was: Next Meeting)

4 years ago

kevin commented 4 years ago

ok. I have created two certs. One for staging and one for production. I called them 'coreos' which I hope is ok.

I have placed them under batcave01:~dustymabe/coreos-fedora-messaging-certs/

Let us know if there's anything further you need or I forgot to issue.

:guardsman:

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

4 years ago

dustymabe commented 4 years ago

thanks @kevin

jlebon commented 4 years ago

So, trying to use fedora-messaging consume with the coreos stg creds, I get:

[ERROR fedora_messaging.cli] The TCP connection appears to have started, but the TLS or AMQP handshake with the broker failed; check your connection and authentication parameters and ensure your user has permission to access the vhost

Config:

amqp_url = "amqps://fedora:@rabbitmq.stg.fedoraproject.org/%2Fpublic_pubsub"
callback = "fedora_messaging.example:printer"

[tls]
ca_cert = "/etc/fedora-messaging/stg-cacert.pem"
keyfile = "coreos.stg.key"
certfile = "coreos.stg.crt"

Using the public creds works fine though:

[tls]
ca_cert = "/etc/fedora-messaging/stg-cacert.pem"
keyfile = "/etc/fedora-messaging/fedora.stg-key.pem"
certfile = "/etc/fedora-messaging/fedora.stg-cert.pem"

-->

[INFO fedora_messaging.twisted.protocol] Successfully registered AMQP consumer Consumer(queue=1ff9c37f-21af-4b9c-a2f7-91002f4e945f, callback=<function printer at 0x7ff4a18aad90>)

jcline commented 4 years ago

@jlebon The amqp_url needs to include the username (the value of the CN in the certificate). So if your username is "coreos" you need the URL to be:

"amqps://coreos:@rabbitmq.stg.fedoraproject.org/%2Fpublic_pubsub"

You can dump the cert with "openssl x509 -in coreos.stg.crt -text". That user also needs to exist in RabbitMQ so if it's not been created in the infrastructure ansible that also needs to happen.
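A minimal sketch of that rule, assuming the subject line has the shape that `openssl x509 -text` prints (for example `Subject: CN = coreos`). The helper names are made up for illustration; the host and vhost are the ones from the config above.

```python
import re
from urllib.parse import quote


def common_name(subject_line: str) -> str:
    """Extract the CN from a line such as 'Subject: CN = coreos'."""
    found = re.search(r"CN\s*=\s*([^,/\s]+)", subject_line)
    if found is None:
        raise ValueError(f"no CN in {subject_line!r}")
    return found.group(1)


def amqp_url(username: str, host: str, vhost: str) -> str:
    """Build an amqps URL with an empty password; the vhost is percent-encoded."""
    return f"amqps://{username}:@{host}/{quote(vhost, safe='')}"


if __name__ == "__main__":
    user = common_name("Subject: CN = coreos")
    print(amqp_url(user, "rabbitmq.stg.fedoraproject.org", "/public_pubsub"))
```

The password stays empty because authentication comes from the client certificate; the username in the URL only has to agree with the certificate's CN.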

jlebon commented 4 years ago

Hmm, so I see:

Subject: CN = coreos.stg

So I tried:

amqp_url = "amqps://coreos.stg:@rabbitmq.stg.fedoraproject.org/%2Fpublic_pubsub"
callback = "fedora_messaging.example:printer"

[tls]
ca_cert = "/etc/fedora-messaging/stg-cacert.pem"
keyfile = "coreos.stg.key"
certfile = "coreos.stg.crt"

But still got the same error (also tried coreos-stg, coreosstg, coreos_stg).

I also tried the prod credentials (which have CN = coreos):

amqp_url = "amqps://coreos:@rabbitmq.fedoraproject.org/%2Fpublic_pubsub"
callback = "fedora_messaging.example:printer"

[tls]
ca_cert = "/etc/fedora-messaging/cacert.pem"
keyfile = "coreos.key"
certfile = "coreos.crt"

And also got the same results.

(Also tried the restricted pubsub URL).

That user also needs to exist in RabbitMQ so if it's not been created in the infrastructure ansible that also needs to happen.

Ahh OK, how do we make this happen? I couldn't find a list of RabbitMQ users in the ansible repo. Or does someone with permissions need to run https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/rabbit/user/tasks/main.yml?

dustymabe commented 4 years ago

@jlebon I assume you're still having this issue? or did you figure out the issue?

jcline commented 4 years ago

That user also needs to exist in RabbitMQ so if it's not been created in the infrastructure ansible that also needs to happen.

Ahh OK, how do we make this happen? I couldn't find a list of RabbitMQ users in the ansible repo. Or does someone with permissions need to run https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/rabbit/user/tasks/main.yml?

Sorry for not responding, Pagure does this thing where it doesn't subscribe me to tickets I respond on or something.

Infrastructure people need to decide where they want to stash that task in the Ansible repo. I could probably add that to the RabbitMQ role or something short-term.

jcline commented 4 years ago

Since I didn't see the user getting created in ansible I added that. So the coreos.stg user exists on the staging broker and the coreos user exists in prod. However, looking at the role it only exists in the "/pubsub" vhost so connecting to the public one will get a permission denied.

Furthermore, users can't create queues or bindings in the private vhost so the user isn't much use without a queue and bindings getting created in ansible, too. If you want that I can add that, although I've been advocating for restrictions to get lifted, maybe we can get that sorted out at flock later this week.
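The restriction can be pictured with RabbitMQ's permission model, in which each user has three regular expressions per vhost (configure, write, read) that are matched against resource names. The patterns and the queue name `coreos_example` below are invented for illustration and are not the actual Fedora settings; the point is only that declaring a new, randomly named queue needs a matching configure permission.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Permissions:
    """Per-vhost permission triple; each field is a regex over resource names."""
    configure: str
    write: str
    read: str


def allowed(pattern: str, resource: str) -> bool:
    """True when the resource name matches the permission regex."""
    return re.search(pattern, resource) is not None


# Hypothetical restrictive setup: nothing may be created ("^$" matches only the
# empty name), publishing is limited to one exchange, reading to one queue.
coreos = Permissions(configure="^$", write=r"^amq\.topic$", read="^coreos_example$")

# A client-generated queue name, like the UUID-style names the consumer uses.
random_queue = "1ff9c37f-21af-4b9c-a2f7-91002f4e945f"

can_declare_random_queue = allowed(coreos.configure, random_queue)
can_publish = allowed(coreos.write, "amq.topic")
can_consume_precreated = allowed(coreos.read, "coreos_example")
```

Under such a setup the user can publish and can read from a queue that was created for it, but cannot declare or bind queues itself, which is why the queue and bindings would also have to be created in ansible.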

jlebon commented 4 years ago

Since I didn't see the user getting created in ansible I added that.

Awesome, thanks!

However, looking at the role it only exists in the "/pubsub" vhost so connecting to the public one will get a permission denied.

Gotcha, I think that should be fine.

Furthermore, users can't create queues or bindings in the private vhost so the user isn't much use without a queue and bindings getting created in ansible, too.

OK right, I think that's what I'm hitting now:

[ERROR fedora_messaging.cli] Unable to declare the queue object on the AMQP broker. The broker responded with (403, "ACCESS_REFUSED - access to queue '1ff9c37f-21af-4b9c-a2f7-91002f4e945f' in vhost '/pubsub' refused for user 'coreos'"). Check permissions for your user.

If you want that I can add that, although I've been advocating for restrictions to get lifted, maybe we can get that sorted out at flock later this week.

Has this been discussed during Flock? I'm good with whatever is decided, as long as we can send messages. :)

adamwill commented 4 years ago

it might be nice for these messages to be standardized under CI messages spec. That's primarily for test system messages, but it does seem that a start has been made on standardized build messages too (see e.g. the product-build.build.* messages), since about five months back.

jcline commented 4 years ago

Ugh, Pagure apparently just ignored the email response I sent.

I asked a bit at flock and people seemed open to it, but unfortunately I didn't get a chance to sit down and actually just push the change. When I get back from traveling/PTO on Thursday I can send a patch to change it and see if there are any last-minute objections, or you can bug @abompard to take care of it before that if he's available.

jlebon commented 4 years ago

it might be nice for these messages to be standardized under CI messages spec.

Ahh, thanks for the link. So, the final topics and messages we agreed on upstream are in https://github.com/coreos/fedora-coreos-tracker/issues/198#issuecomment-513944390. I think we could change the org.fedoraproject.prod.coreos.build.status.change messages to the product-build.build.* standard, though the downside is that they would then differ in topic and body from the other CoreOS-related request messages. I compare our pipeline CI status messages more to Pungi's compose.status.change (or are you planning to move Pungi to product-build.build.* as well?).

jlebon commented 4 years ago

I asked a bit at flock and people seemed open to it, but unfortunately I didn't get a chance to sit down and actually just push the change. When I get back from traveling/PTO on Thursday I can send a patch to change it and see if there are any last-minute objections

Heh, no worries. Just got back from PTO myself. Did you get a chance to get this sorted out yet?

adamwill commented 4 years ago

@jlebon not sure if anyone's looked at changing Pungi, @mvadkert or @ralph may know. It's trickier with systems that have pre-existing messages because you don't want to break existing consumers, of course, but for new systems it may be worth following a standard.

mvadkert commented 4 years ago

Our guidance was that newly created systems (or systems being reimplemented) in the release pipeline should ideally start with standard messages. Seems I came too late to the game for coreos though :( As for existing systems already connected to the message bus, we are not considering migrating them to a standard, as the benefit would be low.

jlebon commented 4 years ago

Follow-up to this in #8227.
