Error that caused the build to fail:
<class 'psycopg2.extensions.TransactionRollbackError'>: deadlock detected DETAIL: Process 8928 waits for ShareLock on transaction 3416696641; blocked by process 8054. Process 8054 waits for ShareLock on transaction 3416696834; blocked by process 8928. HINT: See server log for query details.
This happened during the i686 build of package "jna" for example:
https://koji.fedoraproject.org/koji/taskinfo?taskID=34770460
Might be a mbs issue... adding @jkaluza and factory2
Metadata Update from @kevin: - Issue assigned to jkaluza - Issue priority set to: Waiting on Assignee (was: Needs Review) - Issue tagged with: factory2, mbs
Nothing MBS can do should be able to cause a database-level deadlock. This is definitely a Koji issue.
@kevin Please unassign @jkaluza . @mikem care to take a look?
Done. Sorry about that...
Metadata Update from @kevin: - Assignee reset - Issue untagged with: factory2, mbs - Issue tagged with: koji
So did anyone see the "server log for query details" ? I obviously am unable to view server logs...
The postgres logs have:
ERROR: deadlock detected DETAIL: Process 8928 waits for ShareLock on transaction 3416696641; blocked by process 8054. Process 8054 waits for ShareLock on transaction 3416696834; blocked by process 8928. Process 8928: INSERT INTO rpminfo (arch, build_id, buildroot_id, buildtime, epoch, external_repo_id, id, name, payloadhash, release, size, version) VALUES ('noarch', NULL, NULL, 1548964101, NULL, 33, 17523895, 'findbugs', '5242e8fb6c9cc4e9cb023bd01e2d549d', '17.fc30', 8430603, '3.0.1') Process 8054: INSERT INTO rpminfo (arch, build_id, buildroot_id, buildtime, epoch, external_repo_id, id, name, payloadhash, release, size, version) VALUES ('noarch', NULL, NULL, 1550078283, NULL, 33, 17523897, 'reflections', 'e65c0823223595b7553900b26c4d7b7d', '8.fc30', 152541, '0.9.10')
Perhaps a missing lock and two things finished right at exactly the same second?
Can I get the following?
The buildArch task doesn't import the rpms it generates (that happens in the parent task later), but it can end up adding external rpm entries (to record the contents of the buildroot when external repos are used). I guess there is potential for deadlock there with the implicit locks from dml operations.
How often does this happen?
May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: 2019-05-10 19:03:34,833 [WARNING] {40102} koji .TaskManager:1304 FAULT: May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: Traceback (most recent call last): May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/lib/python2.7/site-packages/koji/daemon.py", line 1295, in runTask May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: response = (handler.run(),) May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/lib/python2.7/site-packages/koji/tasks.py", line 311, in run May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: return koji.util.call_with_argcheck(self.handler, self.params, self.opts) May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/lib/python2.7/site-packages/koji/util.py", line 263, in call_with_argcheck May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: return func(*args, **kwargs) May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/sbin/kojid", line 1309, in handler May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: broot.build(fn,arch) May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/sbin/kojid", line 592, in build May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: self.session.host.updateBuildRootList(self.id,self.getPackageList()) May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/lib/python2.7/site-packages/koji/__init__.py", line 2055, in __call__ May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: return self.__func(self.__name, args, opts) May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/lib/python2.7/site-packages/koji/__init__.py", line 2546, in _callMethod May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: raise err May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: Fault: <Fault 1: "<class 'psycopg2.extensions.TransactionRollbackError'>: deadlock detected May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: DETAIL: Process 8928 waits for ShareLock on transaction 3416696641; blocked by process 8054. May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: Process 8054 waits for ShareLock on transaction 3416696834; blocked by process 8928. May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: HINT: See server log for query details. May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: "> May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: WARNING:koji.TaskManager:FAULT: May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: Traceback (most recent call last): May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/lib/python2.7/site-packages/koji/daemon.py", line 1295, in runTask May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: response = (handler.run(),) May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/lib/python2.7/site-packages/koji/tasks.py", line 311, in run May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: return koji.util.call_with_argcheck(self.handler, self.params, self.opts) May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/lib/python2.7/site-packages/koji/util.py", line 263, in call_with_argcheck May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: return func(*args, **kwargs) May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/sbin/kojid", line 1309, in handler May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: broot.build(fn,arch) May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/sbin/kojid", line 592, in build May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: self.session.host.updateBuildRootList(self.id,self.getPackageList()) May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/lib/python2.7/site-packages/koji/__init__.py", line 2055, in __call__ May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: return self.__func(self.__name, args, opts) May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: File "/usr/lib/python2.7/site-packages/koji/__init__.py", line 2546, in _callMethod May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: raise err May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: Fault: <Fault 1: "<class 'psycopg2.extensions.TransactionRollbackError'>: deadlock detected May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: DETAIL: Process 8928 waits for ShareLock on transaction 3416696641; blocked by process 8054. May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: Process 8054 waits for ShareLock on transaction 3416696834; blocked by process 8928. May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: HINT: See server log for query details. May 10 19:03:34 buildhw-07.phx2.fedoraproject.org kojid[1595]: "> May 10 19:03:43 buildhw-07.phx2.fedoraproject.org kojid[1595]: ERROR:root:https://kojipkgs.fedoraproject.org///repos/module-eclipse-2019-03-2920190510184754-508ead3c-build/1160672/x86_64
I'm afraid the hub logs have rotated off already. :(
This is the only case I am aware of of this...
This is reminiscent of https://pagure.io/koji/issue/788, but we put in what I thought was a pretty solid workaround for that.
..... unless your hub is still 1.15, but I can't imagine it is....
Hmm, I see. The savepoint we added doesn't help here because it re-raises if the entry is still missing, which is likely the case because the other transaction isn't finished yet.
I expect it takes a pretty rare race to trigger this. Can someone file this against Koji if they haven't already? Something like the following may help:
https://github.com/mikem23/koji-playground/commits/br-list-deadlock
I wouldn't rush to apply a fix though, unless you've seen this more than once.
Opened https://pagure.io/koji/issue/1450 upstream to track this.
Thanks!
:koala:
Metadata Update from @kevin: - Issue close_status updated to: Upstream - Issue status updated to: Closed (was: Open)
Login to comment on this ticket.