#12210 a few mass rebuild bumps failed to git push - script should retry or error
Opened 2 months ago by petersen. Modified 2 months ago

It looks like bad luck, but a small number of packages failed to git push their bumps during the mass rebuild.
I talked to @jnsamyak on Matrix and he told me the list is just:

ghc-hxt-charproperties
ghc-hxt-unicode
ghc-indexed-traversable
ghc-iproute
ghc-language-c99-simple
ghc-koji

Here is a paste of the log he shared: https://paste.centos.org/view/3f244f96

Anyway, the main point is that the bump/push script should either retry or error out when a git push fails, or at least flag the git failures somehow.
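To make the "retry or error" idea concrete, here is a minimal sketch in Python. It is not taken from the actual mass rebuild script: the function name `push_with_retry` and the `attempts`, `delay` and `sleep` parameters are hypothetical, and it only assumes the push is run through `subprocess`, as the "returned non-zero exit status 128" message in the log suggests.

```python
import subprocess
import sys
import time


def push_with_retry(cmd, cwd=None, attempts=3, delay=5.0, sleep=time.sleep):
    """Run cmd, retrying on a non-zero exit status.

    Hypothetical helper: re-raises CalledProcessError after the last
    attempt, so a failed push stops the caller instead of passing silently.
    """
    for attempt in range(1, attempts + 1):
        try:
            # check=True raises CalledProcessError on a non-zero exit status
            return subprocess.run(cmd, cwd=cwd, check=True)
        except subprocess.CalledProcessError as err:
            if attempt == attempts:
                raise  # out of retries: fail loudly
            print(
                f"attempt {attempt}/{attempts} failed "
                f"(exit status {err.returncode}), retrying",
                file=sys.stderr,
            )
            sleep(delay * attempt)  # simple linear backoff between attempts
```

A call might look like `push_with_retry(["git", "push", "--no-verify"], cwd=checkout_dir)`, where `checkout_dir` is a placeholder for the package checkout. A short backoff should be enough if the cause really was brief infra instability, but that is a guess.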

My guess is there was likely some infra instability during the period these closely ordered packages were being git pushed.


Metadata Update from @kevin:
- Issue tagged with: low-gain, low-trouble, ops

2 months ago

Metadata Update from @jnsamyak:
- Issue assigned to jnsamyak

2 months ago

Hey @petersen,

Let's dig into the logs, taking ghc-language-c99-simple as an example:

logs

ghc-language-c99-simple failed push: Command '['git', 'push', '--no-verify']' returned non-zero exit status 128.
Checking out ghc-language-c99-simple
Bumping /root/massbuild/ghc-language-c99-simple/ghc-language-c99-simple.spec
Committing changes for ghc-language-c99-simple
Pushing changes for ghc-language-c99-simple
GIT_SSH=/usr/local/bin/relengpush git push --no-verify
Enumerating objects: 1, done.
Counting objects: 100% (1/1), done.
Writing objects: 100% (1/1), 230 bytes | 230.00 KiB/s, done.
Total 1 (delta 0), reused 0 (delta 0), pack-reused 0 (from 0)
remote: Emitting a message to the fedora-messaging message bus.
remote: * Publishing information for 1 commits
remote: Sending to redis to log activity and send commit notification emails
remote: * Publishing information for 1 commits
remote:   - to fedora-message
remote: 2024-07-25 11:47:12,734 [WARNING] pagure.lib.notify: pagure is about to send a message that has no schemas: pagure.git.receive
To ssh://pkgs.fedoraproject.org/rpms/ghc-koji
   4dd5732..331ec38  rawhide -> rawhide

I tried doing it manually on two packages and it worked like a charm (it committed, I mean). So it might have been a glitch caused by a timeout or some similar error, but nothing serious, it seems.

Since we already merged the f41-rebuild tag into the main f41, do you want to build these for f41? Or, for cosmetic purposes, would you like me to push the pending commits to the rest of the ghc packages?

I think it is okay: since only a handful of packages were affected, we can probably just close this.
Thanks for looking at it.

Metadata Update from @petersen:
- Issue close_status updated to: It's all good
- Issue status updated to: Closed (was: Open)

2 months ago

(I mean that I can/will build them myself anyway, as part of a Change.)

Well, I guess coming back to the purpose of this ticket:
I am wondering whether failures like this will be caught in the future, though.

(We have seen cases in the past of a larger number of packages not getting bump/pushed & rebuilt, though they do show up in the "needs rebuild" page.)

Metadata Update from @petersen:
- Issue status updated to: Open (was: Closed)

2 months ago

I do think we should have better error handling here. I am not sure how hard it would be to add it... either just retrying on failure or logging the failures so we know to redo them.
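For the "log the failures so we know to redo them" option, a rough sketch follows. It is not the real script: `push_all`, the `push` callable and the `failed-pushes.txt` file name are all made up for illustration. The idea is just to keep going past a failed push, remember the package, and write the list out at the end.

```python
import subprocess
import sys


def push_all(packages, push, failed_log="failed-pushes.txt"):
    """Call push(pkg) for each package and collect the ones that fail.

    Hypothetical helper: `push` is any callable that raises
    subprocess.CalledProcessError when the git push fails.
    """
    failed = []
    for pkg in packages:
        try:
            push(pkg)
        except subprocess.CalledProcessError as err:
            # Same shape as the "failed push" line in the log above
            print(f"{pkg} failed push: {err}", file=sys.stderr)
            failed.append(pkg)
    if failed:
        # One package name per line, so the file can be fed back in for a redo
        with open(failed_log, "w") as fh:
            fh.write("\n".join(failed) + "\n")
    return failed
```

The caller could then exit non-zero when the returned list is non-empty, so the run is visibly marked as incomplete rather than relying on someone reading the log.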

I'm changing this from ops to dev, since it requires us to make code changes.

Metadata Update from @jnsamyak:
- Issue untagged with: low-gain, low-trouble, ops
- Issue tagged with: dev, medium-gain, medium-trouble

2 months ago

Metadata
Boards 1
Dev Status: Backlog