#7825 RFR: OpenShift tenant in cluster for Fedora CoreOS release/update pipeline
Closed: Fixed 2 years ago by kevin. Opened 2 years ago by sanja.

The CoreOS team needs access to an OpenShift tenant in the Fedora cluster, including a monitoring system that pings us in case something fails, for the FCOS release and update pipeline. For that, we understand we have to have:

  • a FAS group
  • an Ansible playbook

The timeline would be by 10th of June or before, if possible. I'll let others fill in the rest via comments (feel free to edit the description as well).

Hey @sanja

What do you mean by "OpenShift cluster"? An actual cluster with a master and multiple nodes, or something else?

If it is an actual cluster you want, that is a pretty big request, and I don't think the 10th of June is a realistic expectation considering the other initiatives we are currently busy with.

Metadata Update from @cverna:
- Issue priority set to: Waiting on Reporter (was: Needs Review)

2 years ago

Yeah, I think the description just needs a s/OpenShift cluster/OpenShift tenant in the existing Fedora OpenShift cluster/.

Updated title and description to reflect that.

I'll sponsor here and shepherd this along.

We are going to re-use the sysadmin-coreos group. I'll populate it with new folks later today.

Metadata Update from @kevin:
- Issue assigned to kevin
- Issue priority set to: Waiting on Assignee (was: Waiting on Reporter)
- Issue tagged with: OpenShift, request-for-resources

2 years ago

Ok, so there is a template for RFRs which provides most of the info we need and also gives you a checklist of what you need to provide --> https://fedora-infra-docs.readthedocs.io/en/latest/sysadmin-guide/sops/requestforresources.html#ticket-comment-template

Our OpenShift apps are currently managed in Ansible; you can look here for examples --> https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/playbooks/openshift-apps

and here for the roles --> https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/openshift-apps

Without knowing what type of application you want to run it is a bit difficult to point you to a similar example :-)

Metadata Update from @cverna:
- Assignee reset
- Issue untagged with: OpenShift, request-for-resources
- Issue priority set to: Waiting on Reporter (was: Waiting on Assignee)

2 years ago

Thanks, we'll update this as soon as we get the Ansible playbook together.

I'm adding here a few more details (from template phase II) on what we are going to deploy.

As there is a bootstrapping problem between publishing the initial Fedora CoreOS release and deploying/implementing this service, we plan to approach this by first deploying a stub service (i.e. the one I'm using for testing client logic, https://github.com/lucab/exp-dumnati) and later replacing it with the final Cincinnati code.

I've assembled an Ansible playbook to build and deploy it on top of OpenShift: https://github.com/lucab/fedora-infra-ansible/pull/1/files

I'll be happy to have auth and RBAC in place, in order to provision that and close this ticket.

Metadata Update from @puiterwijk:
- Issue assigned to puiterwijk

2 years ago

Security-related follow-ups, after yesterday's discussion:
* amended container image to build via system (F30) Rust toolchain
* egress policy: 443/tcp to Internet (HTTPS to builds.coreos.fedoraproject.org)
* data policy: this service neither stores nor handles user data. No action required on GDPR requests.
* I have working SSH access to bastion+batcave, and enrolled 2FA/TOTP on my FAS account

Metadata Update from @mizdebsk:
- Issue tagged with: request-for-resources

2 years ago

I think the last piece missing here is registering/whitelisting the public endpoints for the route objects:
- updates.coreos.stg.fedoraproject.org
- updates.coreos.fedoraproject.org

The project is currently deployed (via the playbook above) to staging. Once the routes are reachable I'll finish kicking the tires there and deploy to production too.
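
For reference, a reachability check for the two route hostnames listed above could be sketched roughly as follows. This is only an illustration, not something taken from the playbook: the `https://` scheme, the `/` path, and the helper names `probe`, `is_reachable` and `check_all` are all assumptions.

```python
import urllib.error
import urllib.request

# Hostnames quoted from the comment above. The "https://" scheme and the "/"
# path used below are assumptions made for illustration.
ROUTE_HOSTS = [
    "updates.coreos.stg.fedoraproject.org",
    "updates.coreos.fedoraproject.org",
]


def probe(host, opener=urllib.request.urlopen, timeout=10):
    """Return (host, status): an HTTP status code, or an error string."""
    url = f"https://{host}/"
    try:
        with opener(url, timeout=timeout) as resp:
            return host, resp.status
    except urllib.error.HTTPError as err:
        # The route answered, just not with a 2xx; it is still reachable.
        return host, err.code
    except (urllib.error.URLError, OSError) as err:
        # DNS failure, refused connection, timeout, TLS error, and so on.
        return host, f"unreachable: {err}"


def is_reachable(status):
    """Any integer HTTP status means something answered on the route."""
    return isinstance(status, int)


def check_all(hosts=ROUTE_HOSTS, opener=urllib.request.urlopen):
    """Probe every host and map each hostname to True/False."""
    return {host: is_reachable(probe(host, opener=opener)[1]) for host in hosts}
```

Calling `check_all()` returns a dict mapping each hostname to whether it answered. The `opener` parameter is there only so the check can be exercised without network access.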

@lucab how are things going here? Let us know if you need anything further from us...

All the pending work to deploy to staging is done, thanks @puiterwijk and @kevin!
At this point we are iterating over other pieces of fedora-coreos using the staging endpoint. I think we can close this ticket, and just open a new one when we are ready to move to production.

For reference, these are my upcoming tasks on this topic:
* finalize the backend logic and test it on next fedora-coreos release preview
* write a SOP for the service
* figure out metrics/alerts story
* deploy to production

Cool. Let us know when you are ready for next steps...

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

2 years ago
