casync is like a new and better rsync for directory trees, binary files and system images. It can speed up downloads if the user already has a similar image on their local machine.
For example, if a user already has Fedora 26 and wants to download Fedora 27, they can use Fedora 26 as a local seed and download only the differences using casync. The same applies if they have Fedora Workstation and want to download Fedora Server.
It would be great if you provided a .caibx index file for each image and a (global) .castr object store on the Fedora FTP mirrors.
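To illustrate the seed idea, here is a minimal sketch of how a client could decide which chunks still need to be fetched. It is not casync's actual code: the function names and the toy chunk contents are made up for illustration, SHA-256 stands in for whatever digest the index really uses, and the images are assumed to be split into chunks already.

```python
import hashlib

def chunk_ids(chunks):
    """Return the SHA-256 hex digest of every chunk, in order."""
    return [hashlib.sha256(c).hexdigest() for c in chunks]

def chunks_to_download(target_index, seed_index):
    """List the chunk IDs of the target that the local seed cannot supply."""
    available = set(seed_index)
    missing = []
    for cid in target_index:
        if cid not in available:
            missing.append(cid)
            available.add(cid)  # a missing chunk is fetched only once
    return missing

# Toy data standing in for two already-chunked images (hypothetical contents).
seed_chunks = [b"kernel", b"glibc-2.25", b"bash", b"docs"]
target_chunks = [b"kernel", b"glibc-2.26", b"bash", b"docs", b"glibc-2.26"]

seed_index = chunk_ids(seed_chunks)      # plays the role of the seed's index
target_index = chunk_ids(target_chunks)  # plays the role of the remote index
missing = chunks_to_download(target_index, seed_index)
print(f"{len(missing)} of {len(set(target_index))} distinct chunks must be downloaded")
```

Every chunk the seed already contains is reused locally, so only chunks whose digest is unknown to the seed would be requested from the mirror's chunk store.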
This would definitely be nice!
@lsedlar @adrian @kevin @puiterwijk we should look at how we would integrate this into the compose and mirror-pushing processes. I am guessing we would want to make the .caibx files when doing a compose, but we likely want to update/generate the .castr file when updating the filelist files. We probably also want to look at whether we have one in each rsync module with its contents. I think it is worth investigating how we could best do it.
I'm not sure this will really be a win for us... it's going to mean a ton of small files on mirrors, which they aren't too happy with. Perhaps we should have someone interested do a proof of concept for 26->27 and show how many files and how much improvement there is?
You can configure the minimum, average and maximum chunk size that should be generated by casync.
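To make the effect of those three sizes concrete, here is a simplified content-defined chunker. It is only a sketch: the rolling hash is a toy gear-style hash rather than the one casync uses, and `cdc_chunks`, `GEAR` and the test data are hypothetical names for illustration. As far as I know, casync takes these values through its `--chunk-size=[MIN:]AVG[:MAX]` option.

```python
import random

# Fixed pseudo-random table so the example is reproducible (illustrative only).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1

def cdc_chunks(data, min_size, avg_size, max_size):
    """Split data at content-defined boundaries.

    A cut happens when the rolling hash hits a target value (on average
    once every avg_size bytes), but never before min_size bytes and
    always once max_size bytes have accumulated.
    """
    if not 0 < min_size <= avg_size <= max_size:
        raise ValueError("need 0 < min_size <= avg_size <= max_size")
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & MASK64  # toy rolling hash
        length = i - start + 1
        if length < min_size:
            continue
        if (h >> 32) % avg_size == 0 or length >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing chunk may be shorter than min_size
    return chunks

data = bytes(random.Random(1).getrandbits(8) for _ in range(100_000))
for sizes in [(4096, 8192, 16384), (256, 512, 1024)]:
    print(sizes, "->", len(cdc_chunks(data, *sizes)), "chunks")
```

Smaller sizes give finer-grained reuse between similar images, but every chunk becomes one file in the store, so the file count on the mirrors grows accordingly.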
I played around a bit with casync (but with the default settings). Here are my results:
During generation of index files and chunk stores, casync utilizes 1 CPU core @ 100%, but it needs almost no RAM. Tests were made on an Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz.
Input: 1,6G Fedora-KDE-Live-x86_64-27-1.6.iso
Time: real 0m24,140 user 0m20,94s system 0m2,80s
Output:
1,0M Fedora-KDE-Live-x86_64-27-1.6.iso.caibx
1,6G Fedora-KDE-Live-x86_64-27-1.6.iso.castr/ - 24589 Files

Input: 1,5G Fedora-Workstation-Live-x86_64-26-1.5.iso
Time: real 0m23,061s user 0m20,226s sys 0m2,515s
Output:
1,2M Fedora-Server-dvd-x86_64-26-1.5.iso.caibx
1,5G Fedora-Server-dvd-x86_64-26-1.5.iso.castr/ - 23285 Files

Input: 1,6G Fedora-Workstation-Live-x86_64-27-1.6.iso
Time: real 0m25,082s user 0m21,465s sys 0m2,735s
Output:
964K Fedora-Workstation-Live-x86_64-27-1.6.iso.caibx
1,6G Fedora-Workstation-Live-x86_64-27-1.6.iso.castr/ - 24165 Files

Input: 2,3G Fedora-Server-dvd-x86_64-26-1.5.iso
Time: real 0m35,790s user 0m31,331s sys 0m3,366s
Output:
926K Fedora-Workstation-Live-x86_64-26-1.5.iso.caibx
1,7G Fedora-Workstation-Live-x86_64-26-1.5.iso.castr/ - 26995 Files

Input: 2,4G Fedora-Server-dvd-x86_64-27-1.6.iso
Time: real 0m38,876s user 0m32,772s sys 0m4,443s
Output:
1,5M Fedora-Server-dvd-x86_64-27-1.6.iso.caibx
2,3G Fedora-Server-dvd-x86_64-27-1.6.iso.castr/ - 36560 Files

Input: 3,0G Fedora-Workstation-Live-x86_64-{26,27}-*.iso
Time: real 0m46,051s user 0m40,34s system 0m5,04s
Output:
1,9M workstations.caidx
2,9G workstations.castr/ - 44976 Files

Input: 3,1G Fedora-{Workstation,KDE}-Live-x86_64-27-1.6.iso
Time: real 0m48,530s user 0m41,82s system 0m5,44s
Output:
2,0M workstations-27.caidx
2,9G workstations-27.castr/ - 44697 Files

Input: 4,1G Fedora-Server-dvd-x86_64-{26,27}-*.iso
Time: user 1m3,82s system 0m7,34s real 1m12,64
Output:
2,7M server.caidx
3,8G server.castr/ - 60892 Files

Input: 3,2G Fedora-{Server-dvd,Workstation-Live}-x86_64-26-1.5.iso
Time: real 0m58,791 user 0m51,19s system 0m6,20s
Output:
2,1M 26.caidx
3,2G 26.castr/ - 50208 Files

Input: 3,9G Fedora-{Server-dvd,Workstation-Live}-x86_64-27-1.6.iso
Time: real 1m01,41s user 0m53,57s system 0m6,94s
Output:
2,5M 27.caidx
3,7G 27.castr/ - 58849 Files

Input: 7,0G Fedora-{Server-dvd,Workstation-Live}-x86_64-{26,27}-*.iso
Time: user 1m36,03s system 0m9,13s real 1m59,49s
Output:
4,6M all.caidx
6,5G all.castr/ - 103878 Files
- PDF: https://imgur.com/a/MKWSp
- Average: 65KB
- Median: 50KB
- Maximum file size: 262KB
- Minimum file size: 34B
After reviewing the results, I think one can draw few meaningful conclusions from the experiment. I'll try to make a new experiment with different chunk sizes and then count the chunks that have to be transmitted if the client has already downloaded for example the Fedora 26 server image and wants to download the Fedora 27 image...
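One rough way to read the numbers above is to compare the file count of each combined chunk store with the sum of the per-image stores; the difference approximates how many chunk files were shared. This is only an estimate under assumptions: the counts are keyed by the Input line of each entry, and the combined runs produced .caidx indexes while the single-image runs produced .caibx indexes, so chunk boundaries may not match exactly. The dictionary keys are shorthand labels, not real file names.

```python
# Chunk-store file counts per single image, keyed by the Input line above.
single = {
    "KDE-27": 24589,
    "Workstation-26": 23285,
    "Workstation-27": 24165,
    "Server-26": 26995,
    "Server-27": 36560,
}

# Combined stores: (images that went in, files in the resulting .castr).
combined = {
    "workstations": (["Workstation-26", "Workstation-27"], 44976),
    "workstations-27": (["Workstation-27", "KDE-27"], 44697),
    "server": (["Server-26", "Server-27"], 60892),
    "26": (["Server-26", "Workstation-26"], 50208),
    "27": (["Server-27", "Workstation-27"], 58849),
    "all": (["Server-26", "Server-27", "Workstation-26", "Workstation-27"], 103878),
}

shared = {}
for name, (images, files) in combined.items():
    separate = sum(single[i] for i in images)
    shared[name] = separate - files  # chunk files saved by merging the stores
    pct = 100 * shared[name] / separate
    print(f"{name:16} separate={separate:6} merged={files:6} "
          f"saved={shared[name]:5} ({pct:.1f}%)")
```

Under these assumptions, only a few percent of the chunk files are shared between any of the tested image pairs at the default chunk size.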
@sedrubal any update?
I'm sorry. When I played with the chunk size, I chose one that was too small, which resulted in numerous tiny files. Unfortunately this killed my BTRFS, and I spent some time fixing it. Since then I haven't spent time on further testing.
But as @kevin said, I can imagine that this may be a problem:
I'm not sure this will really be a win for us... it's going to mean a ton of small files on mirrors, which they aren't too happy with.
I think the releng team has to decide if it is reasonable. I'd still be happy about it, but I can't judge if it's practicable.
Since we don't want a lot of small files, closing as can't fix.
Metadata Update from @syeghiay: - Issue close_status updated to: Can't Fix - Issue status updated to: Closed (was: Open)