#813 optimize createrepo() function WRT quota
Closed: Fixed 3 years ago by praiskup. Opened 4 years ago by praiskup.

At this point we do this in Worker:

1) build
2) wait for lock
3) createrepo_c

But for large projects like @python/python3.8, even though the builds are pretty fast, several Worker()s end up standing in the queue for step 2).

First, if len(queue) >= 1 already, there's no point in planning another createrepo_c run for another build. If two createrepo_c tasks are in the queue, both RPM builds have already finished, so the first createrepo run covers both builds and the second is entirely redundant.

The second, related issue is that we count Worker() instances against the user's quota, not VMs. That means that even though a Worker() is only waiting for (or running) createrepo_c, and its VM can be (or already is) released, we are not allowed to start another Worker (and actually re-use the allocated VM).
We should really release the VM ASAP and let the user acquire a VM even before the previous build has fully finished.
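
A minimal sketch of the accounting change proposed above, assuming the quota is charged per held VM and that the createrepo_c step does not need the VM. `VmQuota` and `run_worker` are hypothetical names for illustration, not the actual copr-backend code:

```python
import threading


class VmQuota:
    """Hypothetical per-user quota that counts held VMs, not Worker objects."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)

    def acquire(self):
        # Non-blocking: returns False when the user already holds `limit` VMs.
        return self._slots.acquire(blocking=False)

    def release(self):
        self._slots.release()


def run_worker(quota, repo_lock, build, createrepo):
    """Hypothetical Worker body: the VM slot is returned right after the build."""
    if not quota.acquire():
        return False      # over quota, the build stays queued
    try:
        build()           # step 1: the only step that needs the VM
    finally:
        quota.release()   # the slot is free before we queue for the lock
    with repo_lock:       # step 2: wait for the per-repo lock
        createrepo()      # step 3: assumed to run without the VM
    return True
```

With this ordering, a Worker that is stuck waiting for the repo lock no longer blocks the same user from starting the next build.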


A builder should skip step 3 if there are more builds in the queue for the same repo. In other words, I would say that, for a given batch of builds in a repo, only the last one should trigger createrepo.
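
The "only the last build triggers createrepo" idea could be sketched roughly as below. `RepoBatch` is a hypothetical helper, not existing Copr code, and the sketch is only valid under the assumption that a single createrepo run picks up every finished build in the repo (the thread later notes this stopped being the case once runs became incremental):

```python
from collections import Counter


class RepoBatch:
    """Hypothetical tracker of builds still queued per repository."""

    def __init__(self):
        self._pending = Counter()

    def enqueue(self, repo):
        # Called when a build for `repo` is scheduled.
        self._pending[repo] += 1

    def finish(self, repo):
        """Mark one build done; return True if it should run createrepo."""
        self._pending[repo] -= 1
        # Only the build that empties the queue triggers createrepo.
        return self._pending[repo] == 0
```

A real implementation would also need locking around the counter, since several Worker()s finish concurrently.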

After yesterday's release the situation has changed, mostly because of
commit 4969908 and several follow-up fixes.

.. for large projects like @python/python3.8 is, even though builds are
pretty fast -> there are several Worker()s standing in queue for step
2).

This shouldn't happen now. Regenerating the repository should be pretty fast because
we have fixed createrepo_c (optimized caches), and we now only add the recently built
packages on top of the old metadata. As a result, createrepo takes a few seconds (10-20s)
even for very large repositories like iucar/cran (this time is mostly spent processing/creating
the xml files, and we might see a speedup after #1171).

First, if the len(queue) >= 1 already -> there's no point to plan
another createrepo_c run for another build again; if two createrepo_c
tasks are in queue, it means that both the RPM builds are finished, so
the first createrepo run covers both builds, and the second is entirely
redundant.

This is no longer true. Each createrepo_c run is mandatory, because
each run only adds a small set of new RPMs for a particular build.
Skipping a createrepo run would then leave RPMs missing from the metadata.
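
To illustrate why a skipped run loses packages, here is a toy model that treats the repository metadata as a set of file names. The function names and RPM names are made up for the example and do not reflect how createrepo_c is actually invoked:

```python
def incremental_createrepo(metadata, new_rpms):
    """Toy model: extend the metadata with the RPMs of one build only."""
    return metadata | set(new_rpms)


def publish(builds, skipped=()):
    """Run one incremental step per build, except for the skipped ones."""
    metadata = set()
    for name, rpms in builds.items():
        if name in skipped:
            continue  # nobody else will add these RPMs later
        metadata = incremental_createrepo(metadata, rpms)
    return metadata


# Hypothetical builds, for illustration only.
builds = {
    "build-1": ["foo-1.0-1.rpm"],
    "build-2": ["bar-2.0-1.rpm"],
}
```

Here `publish(builds)` contains both RPMs, while `publish(builds, skipped={"build-2"})` never lists `bar-2.0-1.rpm`, because no later run rescans the whole directory.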

The second related issue is that we account to user Worker() instances,
not VMs. That means that even though Worker() is waiting for/doing
createrepo_c, and VM can be (or is) released -- we are not allowed to
start another Worker (and actually re-use allocated VM). We should
really release the VM ASAP, and allow the user acquiring the VM even
before the previous build finished.

This is still an issue. We still should release the VM as soon as
possible. That's why I'm not closing this.

I still have to test fedora-rawhide, but it indeed looks promising.

It seems that builds in fedora-30-x86_64 take substantially and consistently longer than in fedora-31-x86_64 for iucar/cran. A couple of recent examples: https://copr.fedorainfracloud.org/coprs/iucar/cran/build/1153388/, https://copr.fedorainfracloud.org/coprs/iucar/cran/build/1153384/

Any idea why?

After examining more builds, it is not so consistent after all. It happened at the beginning of the current batch, but now all builds are fast (< 10 min).

You are right, createrepo still runs longer than I expected (I expected several seconds based on performance testing, but it actually takes about a minute and a half...). From a quick check it looks like the CPU is the bottleneck.

All the CPUs are utilized ... the load is 1.5x the number of CPUs. So maybe it is just fine.... I'd wait for #1171 before starting further optimization attempts.


Related Pull Requests
  • #1416 Merged 3 years ago