#10512 Outage: copr backend being switched to a Let's Encrypt certificate
Closed: Fixed 2 years ago by praiskup. Opened 2 years ago by praiskup.

I promised @kevin I'd take a look at automatically transferring the old
certificate from the old backend server to the new one. But this is
complicated to automate.

We cannot Ansible-synchronize (rsync) from one remote host to another
unless an ssh connection is configured between those remote hosts (which
is not desirable).

We could "proxy" through batcave (backup: backend => batcave => backup,
restore: backup => batcave => backend). But that would require some
secure, perhaps temporary, location on the batcave server where we could
store the certificates... is there a location like this?
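
A minimal sketch of the two-hop idea, run from the intermediate host so the two remote hosts never need ssh access to each other. The function names, the `root@` login, the `DRY_RUN` switch, and the staging path argument are all hypothetical; `/etc/letsencrypt` is assumed only because it is certbot's default directory. By default the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only: names, hosts, and paths below are assumptions for illustration.
CERT_DIR=${CERT_DIR:-/etc/letsencrypt}   # assumed certbot default location

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# backup: backend => intermediate host
# usage: backup_certs BACKEND_HOST STAGING_DIR
backup_certs() {
    run rsync -a "root@$1:$CERT_DIR/" "$2/$1/"
}

# restore: intermediate host => backend
# usage: restore_certs BACKEND_HOST STAGING_DIR
restore_certs() {
    run rsync -a "$2/$1/" "root@$1:$CERT_DIR/"
}
```

Calling `backup_certs` against the old backend and then `restore_certs` against the new one gives the backup/restore pair described above; setting `DRY_RUN=0` would actually run rsync.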


Metadata Update from @zlopez:
- Issue tagged with: low-gain, low-trouble, ops

2 years ago

We could store these in the ansible-private repo on batcave

What we are trying to do is automatically back up the short-lived Let's Encrypt
certificates from our playbook (whenever they get updated).

Metadata Update from @mohanboddu:
- Issue priority set to: Waiting on Assignee (was: Needs Review)

2 years ago

We could easily create a dir on batcave01 that's on the netapp or otherwise backed up.

Something like /srv/copr-certs/? Just create it in the copr playbook and copy things to it?

That would be nice! I mean, as long as it is not considered too insecure...
I will pay attention to sub-directory ownership and permissions, so only
root can step into them.

Something like /srv/copr-certs/

I would prefer a structure like /srv/certbot-certs/copr/{{ main_cert_hostname }}.
Thinking about this... I just need your ACK here, not an action, because
I can create the directory myself (using the ansible-playbook), right?
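
As a rough illustration of that layout with owner-only permissions, here is a shell sketch. The function name and its arguments are hypothetical, and the real setup is meant to be done from the playbook rather than by hand. When run as root, the resulting directories are owned by root, so only root can step into them.

```shell
#!/bin/sh
# Sketch only: make_cert_backup_dir and its arguments are illustrative names.
# Creates ROOT/SERVICE/CERT_HOST with every level restricted to its owner.
# usage: make_cert_backup_dir ROOT SERVICE CERT_HOST
make_cert_backup_dir() {
    (
        # Subshell so the tightened umask does not leak into the caller.
        umask 077
        mkdir -p "$1/$2/$3"
        # Also tighten levels that may have existed before with wider modes.
        chmod 700 "$1" "$1/$2" "$1/$2/$3"
    )
}
```

For example, `make_cert_backup_dir /srv/certbot-certs copr "$main_cert_hostname"` would produce the structure proposed above, with `main_cert_hostname` supplied by the caller.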

Yes, and yes. :)

I don't see any problem with that directory.

The backup/restore process (through the Batcave server) seems to work fine
(tested on staging). I'd like to update production tomorrow:

$ date --date "Thu 2022-02-04 09:00:00 UTC"

I assume the first playbook run (after migrating from the current certificate to
the automated Let's Encrypt solution) will not work on the first attempt, and
that we could face a short outage of the HTTPD server on copr-backend.
So I am planning for an outage window of about 15 minutes.

Affected services:
Copr backend (package build results, repositories): https://copr-be.cloud.fedoraproject.org/
CDN that mirrors this: https://download.copr.fedorainfracloud.org/

Contact persons:
@praiskup

Can't make it, sorry, moving to later today:

$ date --date "Thu 2022-02-04 14:30:00 UTC"

Metadata Update from @praiskup:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

2 years ago

Done. The certs should be backed up and usable for our next F35 => F36 (or F37) update.

