#5854 Problems with Koji staging sync
Closed: Fixed 6 years ago Opened 7 years ago by mizdebsk.

Yesterday I was fixing staging Koji so that incomplete builds
which had since been fixed in production (after the sync) could
be re-imported into staging.

Below I describe some problems I found and propose solutions.
I think the sync script should be improved to avoid such problems
in the future - I can provide a patch if my solutions seem acceptable.

Problem nr 1: incomplete builds are on prod volume

Incomplete (failed, cancelled or deleted) builds are still on the prod
volume. When trying to rebuild or import them, Koji will try to write
them to the prod volume (and obviously fail, because that filesystem
is read-only in staging).

Proposed solution: don't set the volume to prod for incomplete builds;
leave them on the default (staging) volume, with no files on the filesystem.
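
A minimal sketch of this rule, run against a toy SQLite table rather than the real Koji PostgreSQL schema. The state numbers follow koji.BUILD_STATES; the table layout, the prod volume id and the function names are assumptions made for illustration, not taken from the actual sync script.

```python
import sqlite3

# Build states as numbered in koji.BUILD_STATES.
BUILDING, COMPLETE, DELETED, FAILED, CANCELED = range(5)

DEFAULT_VOLUME_ID = 0  # assumed id of the default (staging) volume
PROD_VOLUME_ID = 1     # hypothetical id of the read-only prod volume


def assign_volumes(conn):
    """Point only complete builds at the prod volume; keep the rest on default."""
    conn.execute(
        "UPDATE build SET volume_id = ? WHERE state = ?",
        (PROD_VOLUME_ID, COMPLETE),
    )
    conn.execute(
        "UPDATE build SET volume_id = ? WHERE state <> ?",
        (DEFAULT_VOLUME_ID, COMPLETE),
    )
    conn.commit()


def demo():
    """Build a toy table, apply the rule, and return {build_id: volume_id}."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE build (id INTEGER PRIMARY KEY, state INTEGER, volume_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO build VALUES (?, ?, ?)",
        [(1, COMPLETE, 0), (2, FAILED, 0), (3, CANCELED, 0), (4, DELETED, 0)],
    )
    assign_volumes(conn)
    return dict(conn.execute("SELECT id, volume_id FROM build ORDER BY id"))
```

With this rule, a later import of build 2, 3 or 4 would target the writable default volume instead of the read-only prod one.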

Problem nr 2: incomplete builds have rpms in DB

Incomplete builds can have RPMs in the DB. This can happen when the
production pg_dump is taken while a build is about to complete - it
already has some RPMs attached, but is still building. The sync script
marks such builds as failed, but it doesn't remove their RPMs from the DB.
Trying to import such builds later fails with constraint violations
(duplicate RPM NVRA in the DB).

Proposed solution: delete the RPMs of incomplete builds from the DB - they
shouldn't be referenced anywhere (e.g. not used in buildroots), so this
should be safe.
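
A rough sketch of that cleanup, again on a toy SQLite schema. The build and rpminfo table names and the unique NVRA constraint mirror what the ticket describes, but the column layout is simplified and the function names are hypothetical; the real Koji DB has further tables that may reference RPM rows, which are not modeled here.

```python
import sqlite3

COMPLETE = 1  # koji.BUILD_STATES['COMPLETE']
FAILED = 3    # koji.BUILD_STATES['FAILED']


def delete_rpms_of_incomplete_builds(conn):
    """Delete RPM rows whose build is not complete; return the number removed."""
    cur = conn.execute(
        "DELETE FROM rpminfo WHERE build_id IN "
        "(SELECT id FROM build WHERE state <> ?)",
        (COMPLETE,),
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Show that a re-import succeeds once the stale RPM row is gone."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE build (id INTEGER PRIMARY KEY, state INTEGER)")
    conn.execute(
        "CREATE TABLE rpminfo (build_id INTEGER, name TEXT, version TEXT, "
        "release TEXT, arch TEXT, UNIQUE (name, version, release, arch))"
    )
    conn.executemany("INSERT INTO build VALUES (?, ?)", [(1, COMPLETE), (2, FAILED)])
    conn.executemany(
        "INSERT INTO rpminfo VALUES (?, ?, ?, ?, ?)",
        [(1, "foo", "1.0", "1", "noarch"), (2, "bar", "2.0", "1", "noarch")],
    )
    deleted = delete_rpms_of_incomplete_builds(conn)
    # Re-importing the same NVRA no longer violates the unique constraint.
    conn.execute("INSERT INTO rpminfo VALUES (2, 'bar', '2.0', '1', 'noarch')")
    names = [row[0] for row in conn.execute("SELECT name FROM rpminfo ORDER BY name")]
    return deleted, names
```

Without the delete step, the final INSERT in demo() would raise sqlite3.IntegrityError, which is the toy equivalent of the duplicate NVRA failure described above.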

Problem nr 3: stale files in /mnt/fedora_koji

This is quite simple IMHO. All old files in /mnt/fedora_koji must
go away during sync. Otherwise problems will be inevitable.


Yep. I agree with your issue descriptions and the proposed fixes.

+1 to push your changes.

For problem 3, I think we should just have the playbook nuke all the content in there.
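
As an illustration only, here is what "nuke all the content" could look like as a small Python helper. The actual playbook task is not shown in this ticket, so the function name and approach are assumptions; the helper empties a directory but keeps the directory itself, which matters if it is a mount point.

```python
import shutil
from pathlib import Path


def clear_directory(root):
    """Remove everything inside root, keep root itself; return entries removed."""
    removed = 0
    for entry in Path(root).iterdir():
        if entry.is_dir() and not entry.is_symlink():
            # Real subdirectory: remove it recursively.
            shutil.rmtree(entry)
        else:
            # Regular file or symlink (including a symlink to a directory).
            entry.unlink()
        removed += 1
    return removed
```

For the staging sync this would be called on /mnt/fedora_koji before the fresh data is put in place, so no files from the previous sync survive.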

Also, we ran into problems with BDR the last time we ran the sync. It wasn't able to handle the full Koji DB load (one side ran out of memory). So I think the playbook needs to stop postgres on both replicas, remove the data dir, run initdb and load the dump on 01, then enable replication again and have 02 sync.

We are about to do another sync very soon... did these changes get pushed?

If not, can you do so now?

I went ahead and applied this. We will see how the next sync goes.


Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

6 years ago
