#6200 Cannot mirror Fedora drpms using OpenAFS
Closed: Can't Fix 5 years ago Opened 7 years ago by tc01.

I had (mistakenly) filed this against infrastructure:


To summarize: we'd like to run a Fedora mirror in an OpenAFS cell, but the drpms/ directory is far larger than the 64K slot limit in OpenAFS, so we cannot mirror it.

We are currently just excluding the drpms, which works, but I'd like to be able to mirror them if possible.

A couple of possible fixes are mentioned in the discussion on that ticket: keeping fewer drpms, or organizing the drpms into alphabetical subdirectories instead of one giant drpms/ directory.

How does rel-eng feel about this? Is this something that could be fixed in some way or should we merely continue to exclude the drpms?

Thanks in advance.

Fact finding.
Here are the drpms counts from recent rawhide grouped by the first character:
$ for I in {0..9} {A..Z} {a..z} ;
do N=$(ls -1 ${I}* 2>/dev/null| wc -l) ; printf '%s %6d\n' "$I" "${N:-0}" ;
done
0 2
1 0
2 3
3 20
4 6
5 0
6 1
7 0
8 0
9 3
A 18
B 10
C 77
D 19
E 15
F 26
G 61
H 9
I 39
J 5
K 1
L 34
M 29
N 36
O 85
P 70
Q 15
R 114
S 111
T 11
U 0
V 7
W 12
X 12
Y 2
Z 5
a 1331
b 672
c 1568
d 971
e 815
f 895
g 3605
h 580
i 508
j 921
k 1202
l 4753
m 2174
n 1400
o 1229
p 7103
q 721
r 1590
s 1806
t 948
u 418
v 343
w 397
x 711
y 116
z 197

However the grand total appears to only be:
ls -1 | wc -l

In f22-updates I found only ~6307 drpm files, though f21-updates has ~26653.

Are you using rsync to mirror the files, and if so, do you use the options to delete extraneous files at the destination? Regardless, I feel that grouping the files by first character is an acceptable workaround. There really is no permanent solution for OpenAFS, because someday the package set may grow enough that even the split directories exceed OpenAFS limitations.
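As a rough illustration of the grouping idea, here is a minimal sketch of how a drpm filename could be mapped to a per-character subdirectory. The function names `drpm_bucket` and `drpm_split_path` are hypothetical, and lowercasing the first character is an assumption; this is not how mash or createrepo lay files out today.

```shell
# Print the bucket for a drpm: its first character, lowercased.
# Lowercasing is an assumption for this sketch, not existing tooling behaviour.
drpm_bucket() {
    printf '%s\n' "$1" | cut -c1 | tr '[:upper:]' '[:lower:]'
}

# Print the relative path the drpm would have in a split layout.
drpm_split_path() {
    printf 'drpms/%s/%s\n' "$(drpm_bucket "$1")" "$1"
}

drpm_split_path "kde-runtime-15.04.0-1.fc22_15.04.2-1.fc22.x86_64.drpm"
```

With the rawhide counts above, the largest bucket would be "p" at about 7103 files, so a split like this would probably stay well below the limit for now, though each filename can consume more than one slot.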

It's worth noting that those numbers don't tell the whole story, unfortunately: a "slot" is sixteen characters long, and a filename longer than 16 characters takes up two slots (three if it's longer than 32, and so on).

Selecting a drpm at random, "kde-runtime-15.04.0-1.fc22_15.04.2-1.fc22.x86_64.drpm" is 53 characters long, meaning it takes up 4 slots.
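The arithmetic can be sketched in shell, assuming the simplified rule described here (one slot per 16 characters of filename, rounded up). The function names `afs_slots` and `total_slots` are hypothetical, and the real OpenAFS on-disk directory format may count slightly differently.

```shell
# Slots for one filename under the simplified rule:
# one slot per 16 characters, rounded up.
afs_slots() {
    len=${#1}
    echo $(( (len + 15) / 16 ))
}

# Sum the slots for filenames read from stdin, one per line.
total_slots() {
    total=0
    while IFS= read -r name; do
        total=$(( total + $(afs_slots "$name") ))
    done
    echo "$total"
}

afs_slots "kde-runtime-15.04.0-1.fc22_15.04.2-1.fc22.x86_64.drpm"
```

Running `ls -1 | total_slots` inside a drpms/ directory would give an estimate to compare against the 64K slot limit.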

Yes, this limitation is terrible. :(

It makes sense: we already do this with standard rpms, and since it is already done for standard rpms, it might even be relatively straightforward to do in mash.

Just another thing to be aware of: if you're doing a full mirror, mirroring into AFS is going to consume significantly more space, because (unless things have changed in the decade since I was doing this!) you can't use hardlinks across directories.

Removing the meeting keyword. This will take someone working on it to change the tooling, or to figure out whether we can at all. createrepo is what makes deltarpms, so both createrepo and createrepo_c will need investigation.

I just wanted to note that this has implications beyond AFS. Basically, rsync can in some situations run chewing 100% CPU while transferring no data. This only happens on the "drpms" directories and, as far as I can tell, is due to the large number of files there and the fact that they are small, so that the rsync thread which transfers data ends up stalling while it waits for the main thread to tell it what to transfer. In some cases the connection will stall long enough for the sending side to time out.

For me this significantly increases the amount of time it takes to sync content between my machines. So splitting the "drpms" directories would definitely help. Still doesn't help come up with anyone to actually fix the code, of course.

Metadata Update from @tc01:
- Issue set to the milestone: Fedora 22 Final

5 years ago

We cannot fix the hashing of deltarpms here; it would need to be fixed in createrepo and createrepo_c.

Metadata Update from @ausil:
- Issue close_status updated to: Can't Fix
- Issue status updated to: Closed (was: Open)

5 years ago

So, should an RFE be filed against createrepo_c?
