#227 Calculate %autorelease/%autochangelog from shallow clones
Opened 3 years ago by praiskup. Modified 2 years ago

It is not eco* to require full/deep clones of package sources, and for
packages with frequent updates it wouldn't scale long-term.

It also causes trouble in the Copr build system, see #199.

Reproducer:

$ git clone https://src.fedoraproject.org/rpms/git-cola.git --depth 100
Cloning into 'git-cola'...
remote: Enumerating objects: 384, done.
remote: Counting objects: 100% (384/384), done.
remote: Compressing objects: 100% (378/378), done.
remote: Total 384 (delta 147), reused 13 (delta 4), pack-reused 0
Receiving objects: 100% (384/384), 993.61 KiB | 1.16 MiB/s, done.
Resolving deltas: 100% (147/147), done.
$ cd git-cola/
$ fedpkg srpm 
Downloading git-cola-3.10.1.tar.gz
######################################################################## 100.0%
Could not execute srpm: 'object not found - no match for id (0f7eba21c56ed3a7ba1e22987c3880d312deaf43)'

Note that the --depth 100 should really be enough.. The migration to
%autorelease was done like 5 commits back.


> It is not eco* to require full/deep clones of package sources, and for
> packages with frequent updates it wouldn't scale long-term.

I'm not sure this analysis is complete. Are you concerned about disk usage or commit numbers? In your case, a full-clone comes out at the same size.

dist-git repos typically are small in size, especially the object store. The check-out size (work-tree) is independent of the shallowness.

> Reproducer:
>
> ```
> $ git clone https://src.fedoraproject.org/rpms/git-cola.git --depth 100
> Cloning into 'git-cola'...
> remote: Enumerating objects: 384, done.
> remote: Counting objects: 100% (384/384), done.
> remote: Compressing objects: 100% (378/378), done.
> remote: Total 384 (delta 147), reused 13 (delta 4), pack-reused 0
> Receiving objects: 100% (384/384), 993.61 KiB | 1.16 MiB/s, done.
> Resolving deltas: 100% (147/147), done.
> $ cd git-cola/
> $ fedpkg srpm
> Downloading git-cola-3.10.1.tar.gz
> ################################################################## 100.0%
> Could not execute srpm: 'object not found - no match for id (0f7eba21c56ed3a7ba1e22987c3880d312deaf43)'
> ```

That is bad, of course: either pygit2 or rpmautospec does not cope with shallow repos. In fact, git grafts those "shallow roots" (made parent-less due to the shallow clone) as parent-less root commits, which is why git log works. Apparently, rpmautospec asks pygit2 for the original commit object instead, which has a parent whose object has not been cloned.

> Note that the --depth 100 should really be enough.. The migration to
> %autorelease was done like 5 commits back.

That might be the case here.

rpmautospec always walks back all the way to the root (even "manually", not using git). As far as I understand, %autochangelog can be turned on and off again, so that "all the way back" is the only safe and complete choice.

What autospec could do: Deal with shallow clones and output a warning (maybe even into the generated changelog) that history was truncated at $refname. That would be good enough for copr, too.
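
A rough sketch of the "detect and warn" idea, not rpmautospec's actual code: git records the boundary commits of a shallow clone in the `shallow` file inside the git directory, one commit ID per line, so a tool can read that file up front. The function names `shallow_boundary` and `warn_if_truncated` are made up for illustration, and the sketch assumes a plain `.git` directory (no worktrees or `gitdir:` indirection).

```python
import warnings
from pathlib import Path


def shallow_boundary(repo_path):
    """Return the commit IDs listed in <repo>/.git/shallow, or an empty set.

    Hypothetical helper; assumes .git is a regular directory.
    """
    shallow_file = Path(repo_path) / ".git" / "shallow"
    if not shallow_file.is_file():
        return set()
    lines = shallow_file.read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def warn_if_truncated(repo_path):
    """Warn when the clone is shallow; return True if it is."""
    boundary = shallow_boundary(repo_path)
    if boundary:
        warnings.warn(
            "history truncated at "
            + ", ".join(sorted(boundary))
            + "; release number and changelog may be incomplete"
        )
    return bool(boundary)
```

Calling `warn_if_truncated(".")` in a full clone would return `False` without warning; in a shallow clone it would name the boundary commits, which could also be echoed into the generated changelog.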

> I'm not sure this analysis is complete. Are you concerned about disk
> usage or commit numbers? In your case, a full-clone comes out at the
> same size.

I didn't want to concentrate on that particular case. This was meant to
be a general issue report that needs to be fixed long-term.

Otherwise I care about all aspects (storage, network bandwidth, client
and server CPU, etc.).

> dist-git repos typically are small in size, especially the object store.

They are small till they grow :-) and it very much depends on the definition
of "small".

Turns out there is a years-old issue in libgit2 that keeps pygit2 from dealing with this gracefully:

https://github.com/libgit2/pygit2/issues/993

But rpmautospec could still do better: accessing the parents attribute of the pygit2 object for the relevant commit throws a KeyError. Since there may be good (shallow clone) or bad (corrupt object db) reasons for that, rpmautospec could just abort the walk and emit a warning, rather than bailing out completely. Or check .git/shallow ...

Update, after checking more of libgit2's issues: Heck, they don't even support replace refs (which would be an easy workaround) and refused simple solutions years ago already, still going for the big perfect do-it-all aka never-gonna-happen.

So, don't count on libgit2/pygit2 for this but catch it in rpmautospec's traversal instead.
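
Something along these lines, as a minimal sketch and not rpmautospec's real traversal: it assumes (per the observation above) that a missing parent object surfaces as a `KeyError` when `.parents` is accessed, and it only follows first parents, which ignores merges. `walk_first_parents` is a hypothetical name; the commit objects just need `.id` and `.parents` attributes.

```python
import warnings


def walk_first_parents(head):
    """Yield commits from head back along first parents.

    Stops with a warning, instead of raising, when a commit's
    parents cannot be loaded (assumed to show up as KeyError).
    """
    commit = head
    while commit is not None:
        yield commit
        try:
            parents = commit.parents
        except KeyError:
            warnings.warn(
                f"history truncated at {commit.id}: parent object missing"
            )
            return
        # A real root commit has an empty parent list and ends the walk.
        commit = parents[0] if parents else None
```

The walk cannot tell a shallow boundary from a corrupt object database on its own, so anything computed from a truncated walk should be treated as possibly incomplete.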

How would you "catch" this in rpmautospec? That would result in both the Release number and the changelog being wrong, both of which are undesirable, especially for koji builds (the former will likely result in a failed build due to a duplicate NVR, and the latter in lost changelog messages).


Metadata
Related Pull Requests
  • #283 Closed a year ago