#8899 investigate registry traffic / usage
Closed: Fixed 2 months ago by kevin. Opened 3 months ago by kevin.

The cdn provider we are using for registry.fedoraproject.org says we have used 12TB so far this month, and we were only donated 1.5TB.

So, we need to look at what all the traffic is and if we can reduce it.

And/or perhaps we should just look at moving this traffic off the current cdn.

Metadata Update from @smooge:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: groomed, high-gain, low-trouble

3 months ago

So without looking at it in detail, I suspect 2 things:

1 - podman now defaults to pulling images from registry.fp.o, so while podman pull fedora used to take the image from Docker Hub, it now pulls it from registry.fp.o

2 - flatpaks are quite big so that might also be the case for this.

I'll take a quick look at the registry logs to see if there is anything suspicious.

I would suspect that most of this is from flatpaks; the data usage sounds just about what I'd expect here.

@kevin do you think we could move that to the AWS CDN ?

I would like to work on this.
Is it just a case of putting up a Cloudfront distribution for a server in a Red Hat datacentre, or are the servers already in AWS?
Could we try to leverage S3 for storage, if it's not already there, for such large files?

The problem we have found with S3 is that we have several sets of files with + in their names, which S3 interprets as something else. This means that anything with C++ or similar in its name can't be downloaded. Changing the names would require retooling other parts that get mirrored to file systems which are fine with + in a file name. [I think there were also speed problems with pushing into S3.]
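For reference, a minimal Python sketch of how a + in a name can get lost in URL handling. The ticket does not confirm this is the exact mechanism behind the S3 problem; it is an assumption that form-style decoding (where + means a space) is involved somewhere along the path, and the file name is made up.

```python
from urllib.parse import quote, unquote, unquote_plus

# Hypothetical file name, chosen only because it contains '+'.
name = "libstdc++-devel.rpm"

# Form-style decoding turns each '+' into a space, so the name no longer matches.
decoded_as_form = unquote_plus(name)

# Plain path decoding leaves '+' alone.
decoded_as_path = unquote(name)

# Percent-encoding '+' as %2B survives either kind of decoding.
encoded = quote(name, safe="")

print(repr(decoded_as_form))
print(repr(decoded_as_path))
print(encoded)
```

If that is what is happening, requesting the percent-encoded form would be the usual workaround, without renaming anything on disk.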

I think we instead moved to just using cloudfront as a forward proxy of the download servers where it grabs the file and caches it which got rid of the S3 problem (I think).

Ah ok, I've never come across that + in the filename issue. Good one to note for the future.

So I guess for this then we just need to set up a cloudfront distribution with the current server as the origin and switch the DNS?

We don't have any roles set up to access cloudfront; it needs the master account. That should be easy to set up, but we will need to adjust our proxies' config. Currently they let any authed connections through (so we can upload images) and only send the rest to the cdn.

@mobrien would you like to do a PR for the changes there? It should be in roles/httpd/reverseproxy/templates/reversepassproxy.registry-generic.conf

@kevin just to make sure I understand correctly. You want everything to go to the cdn now, even the authed connections?

No, just keep the same setup we have now with cloudfront instead of cdn77... ie, authenticated stuff still goes to our registry, but everything else goes to the CDN.

It should just be a matter of changing where it detects cdn77 to detect cloudfront instead. Then I can make the cloudfront end... unless I should do that before?
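The split described here can be sketched as a small decision function. This is only an illustration in Python, not the actual httpd config in reversepassproxy.registry-generic.conf; the function and constant names are hypothetical, and treating "has an Authorization header" as "authenticated" is an assumption.

```python
# Hypothetical names; the real logic lives in the httpd reverse proxy config.
ORIGIN = "registry.fedoraproject.org"
CDN = "cdn.registry.fedoraproject.org"


def choose_backend(headers: dict) -> str:
    """Return the host that should serve a request with these headers."""
    # Assumption: a request counts as authenticated if it carries an
    # Authorization header (header names are case-insensitive).
    names = {key.lower() for key in headers}
    if "authorization" in names:
        return ORIGIN  # uploads and other authed traffic stay on our registry
    return CDN  # anonymous pulls are sent to the CDN


print(choose_backend({"Authorization": "Basic placeholder"}))
print(choose_backend({"User-Agent": "podman"}))
```

Swapping cdn77 for cloudfront would then only change what sits behind the CDN name, not the rule itself.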

So, I looked quickly at this again.

I have created a cloudfront dist for this, but it doesn't seem to be working right.

% podman pull d18n9n1e9qt6fn.cloudfront.net/0ad
Trying to pull d18n9n1e9qt6fn.cloudfront.net/0ad...
manifest unknown: OCI index found, but accept header does not support OCI indexes
Error: unable to pull d18n9n1e9qt6fn.cloudfront.net/0ad: Error initializing source docker://d18n9n1e9qt6fn.cloudfront.net/0ad:latest: Error reading manifest latest in d18n9n1e9qt6fn.cloudfront.net/0ad: manifest unknown: OCI index found, but accept header does not support OCI indexes

Do I need to add something for headers on cloudfront? Or any ideas?

@mobrien @cverna ?

0ad is a flatpak so I don't think we can pull it using podman.
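For context, the error above is about content negotiation: the registry found an OCI index for that tag, but the request's Accept header did not list that media type. A rough Python sketch of a manifest request that does advertise it is below. It only builds the request and sends nothing; the helper name is hypothetical, and whether the cloudfront distribution forwards the Accept header to the origin is not something this ticket establishes.

```python
from urllib.request import Request

# Media types from the OCI image spec and the Docker manifest v2 schema.
OCI_INDEX = "application/vnd.oci.image.index.v1+json"
OCI_MANIFEST = "application/vnd.oci.image.manifest.v1+json"
DOCKER_MANIFEST = "application/vnd.docker.distribution.manifest.v2+json"


def manifest_request(host: str, repo: str, tag: str = "latest") -> Request:
    """Build (but do not send) a registry v2 manifest request."""
    url = f"https://{host}/v2/{repo}/manifests/{tag}"
    accept = ", ".join([OCI_INDEX, OCI_MANIFEST, DOCKER_MANIFEST])
    return Request(url, headers={"Accept": accept})


req = manifest_request("d18n9n1e9qt6fn.cloudfront.net", "0ad")
print(req.full_url)
print(req.get_header("Accept"))
```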

$ podman pull d18n9n1e9qt6fn.cloudfront.net/fedora
Trying to pull d18n9n1e9qt6fn.cloudfront.net/fedora...
Getting image source signatures
Copying blob 1657ffead824 done  
Writing manifest to image destination
Storing signatures

So that seems to work fine.

Ah, is there a way to test the flatpaks?

Anyhow, I guess we just need to change this in dns now...

flatpak install fedora com.play0ad.zeroad

@kalev but will that hit this new cloudfront? or I would have to remote-add it first?

I think you need to remote-add the new url you are testing, yep.

So, I moved cdn.registry.fedoraproject.org to cloudfront. It seems to be working, but I can't see in podman or flatpak that it's actually getting redirected to it or using it.

I'm sure it's something I missed, but if others could confirm that would be great...

No one has screamed at me that it's down... podman --log-level=debug pull doesn't seem to show it ever hitting the cdn, but I do see traffic on the cdn, so...

I guess it's working? Feel free to re-open if it's not or add a comment here on how I can tell it's hitting the cdn...
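One possible way to tell, offered as a sketch rather than a verified recipe for this setup: CloudFront normally adds response headers such as X-Cache, Via and X-Amz-Cf-Id, so checking the headers of a blob response should show whether it came through the CDN. The helper name and the sample headers below are made up for illustration.

```python
def served_by_cloudfront(headers: dict) -> bool:
    """Guess from response headers whether CloudFront handled the response."""
    lowered = {key.lower(): value.lower() for key, value in headers.items()}
    return (
        "x-amz-cf-id" in lowered
        or "cloudfront" in lowered.get("x-cache", "")
        or "cloudfront" in lowered.get("via", "")
    )


# Made-up sample headers, for illustration only.
sample_cdn = {
    "X-Cache": "Miss from cloudfront",
    "Via": "1.1 example.cloudfront.net (CloudFront)",
}
sample_origin = {"Server": "Apache"}

print(served_by_cloudfront(sample_cdn))
print(served_by_cloudfront(sample_origin))
```

The headers could come from any HTTP client that prints response headers while following the redirect from the registry to cdn.registry.fedoraproject.org.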

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

2 months ago
