#7275 provide casync index and object storage files for images on getfedora.org
Closed: Can't Fix 4 years ago by syeghiay. Opened 6 years ago by sedrubal.

casync is like a new and better rsync for directory trees, binary files and system images. It can speed up downloads if the user already has a similar image on their local machine.

For example, if a user already has Fedora 26 and wants to download Fedora 27, they can use Fedora 26 as a local seed and download only the differences using casync. The same applies if they have Fedora Workstation and want to download Fedora Server.
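To illustrate the seed idea, here is a minimal sketch of content-defined chunking. It is a toy, not casync's actual chunker: the rolling byte sum, the cut rule, the size limits and the use of SHA-256 as chunk ID are all assumptions chosen for brevity. It only shows why chunks that are unchanged between two images do not need to be downloaded again.

```python
import hashlib


def split_chunks(data, window=16, mask=0xFF, min_size=64, max_size=4096):
    """Toy content-defined chunker: cut where a rolling byte sum matches a bit pattern."""
    chunks = []
    start = 0    # index where the current chunk begins
    rolling = 0  # sum of the last `window` bytes of the current chunk
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= window:
            rolling -= data[i - window]  # drop the byte that left the window
        size = i - start + 1
        at_boundary = size >= min_size and (rolling & mask) == mask
        if at_boundary or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            rolling = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def reusable_fraction(seed, target):
    """Fraction of the target's bytes that sit in chunks already present in the seed."""
    if not target:
        return 0.0
    seed_ids = {hashlib.sha256(c).hexdigest() for c in split_chunks(seed)}
    reused = sum(len(c) for c in split_chunks(target)
                 if hashlib.sha256(c).hexdigest() in seed_ids)
    return reused / len(target)
```

Calling `reusable_fraction(old_image_bytes, new_image_bytes)` on two similar blobs gives a rough feel for how much a seed could save. Real images and the real casync chunker will behave differently.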

It would be great if you could provide a .caibx index file for each image and a (global) .castr object storage on the Fedora FTP mirrors.


This would definitely be nice!

@lsedlar @adrian @kevin @puiterwijk we should look at how we would integrate this into the compose and mirror pushing processes. I am guessing we would want to make the .caibx files when doing a compose, but we would likely want to update/generate the .castr file when updating the filelist files. We probably also want to look at whether we would have one in each rsync module with its contents. I think it is worth investigating how we could best do it.

I'm not sure this will really be a win for us... it's going to mean a ton of small files on mirrors, which they aren't too happy with. Perhaps we should have someone interested do a proof of concept for 26->27 and show how many files and how much improvement there is?

You can configure the minimum, average and maximum chunk size that should be generated by casync.

I played around a bit with casync (but only with the default settings). Here are my results:

During generation of index files and chunk stores, casync utilizes 1 CPU core @ 100%, but it needs almost no RAM. Tests were made on an Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz.

Input: 1,6G Fedora-KDE-Live-x86_64-27-1.6.iso
Time:
real 0m24,140s
user 0m20,94s
system 0m2,80s
Output:
1,0M Fedora-KDE-Live-x86_64-27-1.6.iso.caibx
1,6G Fedora-KDE-Live-x86_64-27-1.6.iso.castr/
- 24589 Files

Input: 1,5G Fedora-Workstation-Live-x86_64-26-1.5.iso
Time:
real 0m23,061s
user 0m20,226s
sys 0m2,515s
Output:
1,2M Fedora-Server-dvd-x86_64-26-1.5.iso.caibx
1,5G Fedora-Server-dvd-x86_64-26-1.5.iso.castr/
- 23285 Files

Input: 1,6G Fedora-Workstation-Live-x86_64-27-1.6.iso
Time:
real 0m25,082s
user 0m21,465s
sys 0m2,735s
Output:
964K Fedora-Workstation-Live-x86_64-27-1.6.iso.caibx
1,6G Fedora-Workstation-Live-x86_64-27-1.6.iso.castr/
- 24165 Files

Input: 2,3G Fedora-Server-dvd-x86_64-26-1.5.iso
Time:
real 0m35,790s
user 0m31,331s
sys 0m3,366s
Output:
926K Fedora-Workstation-Live-x86_64-26-1.5.iso.caibx
1,7G Fedora-Workstation-Live-x86_64-26-1.5.iso.castr/
- 26995 Files

Input: 2,4G Fedora-Server-dvd-x86_64-27-1.6.iso
Time:
real 0m38,876s
user 0m32,772s
sys 0m4,443s
Output:
1,5M Fedora-Server-dvd-x86_64-27-1.6.iso.caibx
2,3G Fedora-Server-dvd-x86_64-27-1.6.iso.castr/
- 36560 Files

Input: 3,0G Fedora-Workstation-Live-x86_64-{26,27}-*.iso
Time:
real 0m46,051s
user 0m40,34s
system 0m5,04s
Output:
1,9M workstations.caidx
2,9G workstations.castr/
- 44976 Files

Input: 3,1G Fedora-{Workstation,KDE}-Live-x86_64-27-1.6.iso
Time:
real 0m48,530s
user 0m41,82s
system 0m5,44s
Output:
2,0M workstations-27.caidx
2,9G workstations-27.castr/
- 44697 Files

Input: 4,1G Fedora-Server-dvd-x86_64-{26,27}-*.iso
Time:
user 1m3,82s
system 0m7,34s
real 1m12,64s
Output:
2,7M server.caidx
3,8G server.castr/
- 60892 Files

Input: 3,2G Fedora-{Server-dvd,Workstation-Live}-x86_64-26-1.5.iso
Time:
real 0m58,791s
user 0m51,19s
system 0m6,20s
Output:
2,1M 26.caidx
3,2G 26.castr/
- 50208 Files

Input: 3,9G Fedora-{Server-dvd,Workstation-Live}-x86_64-27-1.6.iso
Time:
real 1m01,41s
user 0m53,57s
system 0m6,94s
Output:
2,5M 27.caidx
3,7G 27.castr/
- 58849 Files

Input: 7,0G Fedora-{Server-dvd,Workstation-Live}-x86_64-{26,27}-*.iso
Time:
user 1m36,03s
system 0m9,13s
real 1m59,49s
Output:
4,6M all.caidx
6,5G all.castr/
- 103878 Files
- PDF: https://imgur.com/a/MKWSp
- Average: 65KB
- Median: 50KB
- Maximum file size: 262KB
- Minimum file size: 34B
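As a rough cross-check of the numbers for all.castr, the file count is approximately the store size divided by the average chunk file size. The sketch below assumes that "G" and "KB" in the listing mean GiB and KiB, and that the store size stays the same when the chunk size changes (it will not exactly, since deduplication changes too). The alternative chunk sizes are hypothetical values, not measurements.

```python
GIB = 1024 ** 3
KIB = 1024


def estimated_chunk_files(store_bytes, avg_chunk_bytes):
    """Approximate number of chunk files for a store of the given size."""
    return round(store_bytes / avg_chunk_bytes)


# Measured above: 6,5G in 103878 files, i.e. roughly 65-66 KiB per file on average.
measured_avg_kib = 6.5 * GIB / 103878 / KIB

# Hypothetical average chunk sizes, applied to the same 6,5G store.
for avg_kib in (16, 64, 256):
    files = estimated_chunk_files(6.5 * GIB, avg_kib * KIB)
    print(f"average {avg_kib:>3} KiB -> about {files} files")
```

So, under these assumptions, quadrupling the average chunk size cuts the number of files on the mirrors to about a quarter, at the cost of coarser deduplication.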

After reviewing the results, I think few meaningful conclusions can be drawn from this experiment.
I'll try to run a new experiment with different chunk sizes and then count the chunks that have to be transmitted if the client has already downloaded, for example, the Fedora 26 Server image and wants to download the Fedora 27 image...
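A possible way to do that counting is sketched below. It assumes that both chunk stores were generated with the same settings and that chunk files are named after a digest of their content, so that the same relative path in two stores means the same chunk. The function names are made up for this example.

```python
from pathlib import Path


def chunk_files(store):
    """Map each file's path relative to the store root to its size in bytes."""
    root = Path(store)
    return {p.relative_to(root).as_posix(): p.stat().st_size
            for p in root.rglob("*") if p.is_file()}


def missing_chunks(seed_store, target_store):
    """Return (count, bytes) of target chunk files that the seed store lacks."""
    seed = chunk_files(seed_store)
    target = chunk_files(target_store)
    missing = [size for name, size in target.items() if name not in seed]
    return len(missing), sum(missing)
```

For example, `missing_chunks("26.castr", "27.castr")` would report how many chunk files, and how many bytes on disk, a client seeded with the first store would still have to fetch from the second.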

I'm sorry. When I played with the chunk size, I chose one that was too small, which resulted in numerous tiny files. Unfortunately this killed my BTRFS and I spent some time fixing it, and since then I haven't spent time on further testing.

But as @kevin said, I can imagine that this may be a problem:

"I'm not sure this will really be a win for us... it's going to mean a ton of small files on mirrors, which they aren't too happy with."

I think the releng team has to decide whether it is reasonable. I'd still be happy to have it, but I can't judge whether it's practical.

Since we don't want a lot of small files, closing as can't fix.

Metadata Update from @syeghiay:
- Issue close_status updated to: Can't Fix
- Issue status updated to: Closed (was: Open)
