#6801 run hardlink on fedora-secondary
Closed: Fixed 3 years ago by kevin. Opened 6 years ago by bellet.

While running a routine hardlink on my mirror, I noticed that many files (more than 100000 so far, and it is still running) were not hardlinked locally, so I guess they are not hardlinked on the master either. For example:

  • in fedora-secondary/development/28/{Everything,Workstation,Server} debug for i386 packages
  • in fedora-secondary/development/rawhide/{Cloud,Everything,Server} debug for s390x packages
  • in fedora-secondary/development/rawhide// for noarch packages

This might be a pungi/releng issue more than infrastructure.

@lsedlar is there any hardlinking missing from pungi here?

Pungi only hardlinks the packages from koji packages volume into the compose. If they get synced to another volume, maybe rsync breaks the hardlinks and makes copies?

If the compose was done on a different volume from the one the packages are on, they would get copied and no hardlink would be created. I'm not sure if that is possible in Fedora infra.
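
To illustrate the point about volumes: a hardlink can only be created within a single filesystem, so a tool that wants to link has to fall back to copying when source and destination are on different devices. The following is a minimal sketch of that behaviour, not pungi's actual code; the helper names `link_or_copy` and `same_inode` are made up for this example.

```python
import errno
import os
import shutil


def link_or_copy(src, dst):
    """Hardlink src to dst; fall back to a copy across filesystems.

    Hypothetical helper for illustration only. Returns "linked" or "copied".
    """
    try:
        os.link(src, dst)
        return "linked"
    except OSError as exc:
        # EXDEV means src and dst live on different filesystems,
        # where a hardlink is impossible. Anything else is a real error.
        if exc.errno != errno.EXDEV:
            raise
        shutil.copy2(src, dst)
        return "copied"


def same_inode(a, b):
    """True if both paths refer to the same inode on the same device."""
    sa, sb = os.stat(a), os.stat(b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)
```

When the result is "copied", the two paths hold identical bytes but occupy separate inodes, which is exactly the situation a later hardlink run would report as linkable.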

In case it helps, here is a log of what hardlink would do today, two days after the complete hardlink run that finished when I created this issue (I interrupted this one before completion, but it gives an idea):

would-hardlink.txt

We can look at this after freeze.

Partly I think this is updates-testing and updates not being hardlinked, but we can investigate more.

Metadata Update from @kevin:
- Issue tagged with: unfreeze

6 years ago

Metadata Update from @kevin:
- Issue assigned to kevin
- Issue priority set to: Waiting on External

6 years ago

So, this came back on my radar as it was marked unfreeze.

I think the problem here is that our new-updates-sync script isn't including all the directories we would like to hardlink when syncing separate repos.

It would be great to have someone look into that and fix the script to hardlink everything it should.

Metadata Update from @kevin:
- Assignee reset
- Issue untagged with: unfreeze
- Issue priority set to: Waiting on Assignee (was: Waiting on External)

4 years ago

In case it helps: after a recent hardlink -vv run, it seems to me that "noarch.rpm" packages need to be hardlinked across all trees for rawhide. For example, package hunspell-te-1.0.0-12.fc31.noarch.rpm from rawhide has been hardlinked 20+10 times, for two binary-different packages. Hardlinking noarch.rpm files accounts for 372504 of the 494361 new links:

[bellet@mandril linux]$ find fedora -name hunspell-te-1.0.0-12.fc31.noarch.rpm -ls
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora/linux/releases/test/31_Beta/Server/armhfp/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora/linux/releases/test/31_Beta/Server/aarch64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora/linux/releases/test/31_Beta/Server/x86_64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora/linux/releases/test/31_Beta/Everything/armhfp/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora/linux/releases/test/31_Beta/Everything/aarch64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora/linux/releases/test/31_Beta/Everything/x86_64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 42108831    432 -rw-r--r--  10  bellet   bellet     440363 Aug 15 23:48 fedora/linux/development/rawhide/Server/armhfp/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 42108831    432 -rw-r--r--  10  bellet   bellet     440363 Aug 15 23:48 fedora/linux/development/rawhide/Server/aarch64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 42108831    432 -rw-r--r--  10  bellet   bellet     440363 Aug 15 23:48 fedora/linux/development/rawhide/Server/x86_64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 42108831    432 -rw-r--r--  10  bellet   bellet     440363 Aug 15 23:48 fedora/linux/development/rawhide/Everything/armhfp/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 42108831    432 -rw-r--r--  10  bellet   bellet     440363 Aug 15 23:48 fedora/linux/development/rawhide/Everything/aarch64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 42108831    432 -rw-r--r--  10  bellet   bellet     440363 Aug 15 23:48 fedora/linux/development/rawhide/Everything/x86_64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora/linux/development/31/Server/armhfp/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora/linux/development/31/Server/aarch64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora/linux/development/31/Server/x86_64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora/linux/development/31/Everything/armhfp/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora/linux/development/31/Everything/aarch64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora/linux/development/31/Everything/x86_64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
[bellet@mandril linux]$ find fedora-secondary -name hunspell-te-1.0.0-12.fc31.noarch.rpm -ls
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora-secondary/releases/test/31_Beta/Server/ppc64le/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora-secondary/releases/test/31_Beta/Server/s390x/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora-secondary/releases/test/31_Beta/Everything/ppc64le/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora-secondary/releases/test/31_Beta/Everything/s390x/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 42108831    432 -rw-r--r--  10  bellet   bellet     440363 Aug 15 23:48 fedora-secondary/development/rawhide/Server/ppc64le/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 42108831    432 -rw-r--r--  10  bellet   bellet     440363 Aug 15 23:48 fedora-secondary/development/rawhide/Server/s390x/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 42108831    432 -rw-r--r--  10  bellet   bellet     440363 Aug 15 23:48 fedora-secondary/development/rawhide/Everything/ppc64le/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 42108831    432 -rw-r--r--  10  bellet   bellet     440363 Aug 15 23:48 fedora-secondary/development/rawhide/Everything/s390x/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora-secondary/development/31/Server/ppc64le/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora-secondary/development/31/Server/s390x/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora-secondary/development/31/Everything/ppc64le/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
 35393745    432 -rw-r--r--  20  bellet   bellet     440363 Jul 25 18:37 fedora-secondary/development/31/Everything/s390x/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
[bellet@mandril linux]$ md5sum fedora/linux/development/rawhide/Everything/x86_64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm fedora/linux/development/31/Server/armhfp/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm 
3826e3fbba6f92e09ac3e23b59513255  fedora/linux/development/rawhide/Everything/x86_64/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
94c05ab715129ca681f86f8954437dcc  fedora/linux/development/31/Server/armhfp/os/Packages/h/hunspell-te-1.0.0-12.fc31.noarch.rpm
[bellet@mandril linux]$ wc -l /tmp/hardlink.log 
494361 /tmp/hardlink.log
[bellet@mandril linux]$ grep noarch.rpm /tmp/hardlink.log | wc -l
372504

Hardlinking ppc64le.rpm is the second best target (51020 links), and s390x.rpm is the third (50045 links).
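
For anyone who wants to reproduce this kind of count without running hardlink itself, here is a rough sketch that walks a tree and reports file contents stored under more than one inode. It is an approximation under stated assumptions: it compares content only (via SHA-256), whereas the real hardlink tool may also take attributes such as mode, owner and timestamps into account depending on its options. The function names are hypothetical.

```python
import hashlib
import os
import stat
from collections import defaultdict


def file_digest(path, chunk=1 << 20):
    """SHA-256 of a file, read in chunks to keep memory use flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def would_link(root):
    """Map digest -> set of (st_dev, st_ino) for contents held by 2+ inodes."""
    inodes_by_digest = defaultdict(set)
    digest_by_inode = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks and special files
            key = (st.st_dev, st.st_ino)
            if key not in digest_by_inode:
                # Hash each inode once, however many names point at it.
                digest_by_inode[key] = file_digest(path)
            inodes_by_digest[digest_by_inode[key]].add(key)
    return {d: k for d, k in inodes_by_digest.items() if len(k) > 1}
```

Because grouping is by content rather than by file name, two same-named packages with different checksums (as in the md5sum output above) are never merged; only byte-identical copies show up. `sum(len(k) - 1 for k in would_link(root).values())` gives the number of redundant inodes.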

Metadata Update from @cverna:
- Issue tagged with: high-gain, medium-trouble

4 years ago

In researching this ticket, it was found that the packages are not hardlinked due to the different landing locations and how composes are made. [Packages are put into the trees for composing and then rsync'd to the final location.]

One alternative we looked at was running hardlink regularly, but because each compose is a 'new' set of files, all the hardlinks would be broken again. Running hardlink after each compose would also increase the compose time by several hours and hit the storage system hard.

We are going to try this anyway and see if it is tolerable. If not, this will go to CANTFIX with current infrastructure.

We will look at mucking with the pungi-fedora/nightly.sh script to do hardlinks after each copy after F33 is released.

@smooge do you still have this on your radar? Could I help you with it?

I failed to update this after the last time we discussed it.

The thing is that simply running hardlink over things is useless (in general) because we sync stuff to it all the time... so we need to fix the sync scripts to hardlink things when they sync them.

The updates / secondary updates are mostly fine, but aarch64 isn't hardlinked with the rest because we make two calls, since we put some aarch64 stuff in one place and the rest in other places. ;(
The only fix I could think of is to run hardlink after the updates sync... but that could be pretty expensive/slow.
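
As a rough idea of what a post-sync relinking step does per file pair, here is a sketch, not the actual sync script or the hardlink tool: it replaces one path with a hardlink to another only when both are on the same filesystem and byte-identical. The function name `relink` and the `.relink-tmp` suffix are made up for this example. The full byte comparison is also why such a pass is expensive on a large tree.

```python
import filecmp
import os


def relink(keep, dup):
    """Replace dup with a hardlink to keep if the contents are identical.

    Hypothetical helper for illustration. Returns True if dup was replaced.
    """
    sk, sd = os.stat(keep), os.stat(dup)
    if (sk.st_dev, sk.st_ino) == (sd.st_dev, sd.st_ino):
        return False  # already the same inode, nothing to do
    if sk.st_dev != sd.st_dev:
        return False  # different filesystems: hardlink is impossible
    if not filecmp.cmp(keep, dup, shallow=False):
        return False  # same name is not enough; bytes must match
    tmp = dup + ".relink-tmp"  # assumed-unused temporary name
    os.link(keep, tmp)
    os.replace(tmp, dup)  # atomic rename, so dup never disappears
    return True
```

Linking to a temporary name and renaming over the duplicate means a reader of the mirror never sees the path missing, at the cost of one extra directory entry for a moment.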

rawhide and branched aren't hardlinked, but again it isn't easy to get them that way.

I guess I would say the next step is to look and see if we can do something with updates sync...

Metadata Update from @smooge:
- Issue tagged with: downloads, ops

3 years ago

So, I made some changes to the rawhide/branched sync scripts to hardlink things when they sync out each compose now.

So, looking at pub/fedora/linux/development:

Directories: 1365
Objects: 576850
Regular files: 575485
Comparisons: 30
Would link: 0
Would save: 0

and /pub/fedora-secondary/development:

Directories: 838
Objects: 329505
Regular files: 328667
Comparisons: 1
Would link: 0
Would save: 0

So, I am going to say we have this pretty solved finally. :)

If you see any other places that we should address, can you open a new ticket on them?

Thanks!

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago

Issue status updated to: Open (was: Closed)

3 years ago

Issue status updated to: Closed (was: Open)
Issue close_status updated to: Fixed

3 years ago
