#7383 Rewrite Koji toplink URLs to improve cache hit ratio
Closed: Fixed 2 years ago Opened 2 years ago by mizdebsk.

Every Koji package is available under multiple URLs. In particular, every build repo has packages under different URLs, like:

Moreover, different tags can share the same package too, for example:

These URLs point to the same file because "toplink" is a symlink to the top Koji directory and the part after /toplink/ is identical. Yet Varnish treats them as completely different objects. This leads to cache pollution: the same file can be duplicated in multiple cache objects.

I propose normalizing URLs containing toplink by rewriting them with a pattern like the one below (copied from the Nginx config of a Koji deployment outside of Fedora infrastructure):

            # normalize toplink for better cacheability
            rewrite ^/repos/[^/]+/[^/]+/[^/]+/toplink/(.*) /$1;

The rewrite should happen before the request is proxy-passed to Varnish, or be done by Varnish itself.
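To illustrate what the rewrite rule does to a request path, here is a minimal Python sketch that applies the same regular expression. The function name normalize_toplink and both sample paths are hypothetical and made up for this illustration; they are not taken from a real Koji instance. It only models the path rewriting, not Nginx or Varnish behaviour.

```python
import re

# Same regular expression as in the Nginx rewrite rule quoted above.
TOPLINK_RE = re.compile(r"^/repos/[^/]+/[^/]+/[^/]+/toplink/(.*)")


def normalize_toplink(path: str) -> str:
    """Return the path with the /repos/.../toplink/ prefix stripped.

    Paths that do not match the pattern are returned unchanged.
    """
    match = TOPLINK_RE.match(path)
    if match is None:
        return path
    return "/" + match.group(1)


# Hypothetical paths, invented for illustration only.
path_a = "/repos/example-build/1001/x86_64/toplink/packages/foo/1.0/1/noarch/foo-1.0-1.noarch.rpm"
path_b = "/repos/other-build/2002/s390x/toplink/packages/foo/1.0/1/noarch/foo-1.0-1.noarch.rpm"

# Both paths collapse to the same normalized path, so a cache keyed on
# the normalized path would store a single object for them.
normalized_a = normalize_toplink(path_a)
normalized_b = normalize_toplink(path_b)
print(normalized_a)
print(normalized_a == normalized_b)
```

With the rewrite in place, both sample paths map to one path, which is the property that lets the cache keep a single copy of the file.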

Expected advantages of toplink normalization:

  • Varnish can cache more distinct files (improved hit ratio) -> faster builds
  • reduced phx2-to-rdu2 inter-site traffic (for s390x builds)
  • reduced load on NFS -> faster composes

Sounds fine to me, but perhaps we should ask upstream about changing this to better handle things by default instead of having to re-write?

Metadata Update from @kevin:
- Issue priority set to: Waiting on Assignee (was: Needs Review)

2 years ago

Metadata Update from @mizdebsk:
- Issue assigned to mizdebsk

2 years ago

Implemented in commit 33241e7. This change is now deployed in production.

Metadata Update from @mizdebsk:
- Issue close_status updated to: Fixed
- Issue priority set to: None (was: Waiting on Assignee)
- Issue status updated to: Closed (was: Open)

2 years ago
