DB_DEADLOCK can happen quite frequently. This is why transactions are started with DB_TXN_NOWAIT and deadlocks are handled (by retrying) in the backend code.
When there are too many retries (49), the operation fails. The problem is that when the operation fails (because of too many deadlocks) it is difficult to identify the threads that triggered the deadlock and propose workarounds.
For debugging, it could be useful to have a new configuration option 'nsslapd-db-txn-begin-flags' (in 'cn=config,cn=ldbm database,cn=plugins,cn=config') that sets the flags used when calling txn_begin. By default the option would be DB_TXN_NOWAIT.
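A minimal sketch of the retry-on-deadlock pattern described above, written in Python for illustration only (the server itself is C). The names DeadlockError, run_with_retry and make_flaky are hypothetical, and the retry limit simply reuses the number quoted in this ticket; how the real counter is applied is an assumption.

```python
class DeadlockError(Exception):
    """Hypothetical stand-in for a DB_DEADLOCK result from the database layer."""


# Value quoted in the ticket; the exact semantics of the real counter are assumed.
MAX_RETRIES = 49


def run_with_retry(operation, max_retries=MAX_RETRIES):
    """Call operation() until it succeeds or the retry budget is spent.

    Returns a (result, attempts) tuple so the caller can see how many
    attempts were needed.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return operation(), attempt
        except DeadlockError:
            # With a NOWAIT-style transaction the conflict is reported
            # immediately, so the caller aborts and tries again.
            continue
    # The budget is exhausted: the operation fails, and nothing here tells us
    # which other thread held the conflicting lock.
    raise RuntimeError(f"operation failed after {max_retries} deadlock retries")


def make_flaky(failures):
    """Build an operation that deadlocks `failures` times, then succeeds."""
    state = {"left": failures}

    def operation():
        if state["left"] > 0:
            state["left"] -= 1
            raise DeadlockError()
        return "ok"

    return operation


if __name__ == "__main__":
    print(run_with_retry(make_flaky(3)))
```

The final RuntimeError mirrors the problem raised in this ticket: once the retries are used up, the failure carries no information about the thread that caused the conflict.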
Linked to Bugzilla bug: https://bugzilla.redhat.com/show_bug.cgi?id=1417338
Metadata Update from @nhosoi: - Issue set to the milestone: 1.3.6.0
Metadata Update from @tbordaz: - Issue assigned to tbordaz
Here is the patch for this ticket: 0001-Ticket-49076-To-debug-DB_DEADLOCK-condition-allow-to.patch (/389-ds-base/issue/raw/files/a078dfacb0f0c623412d357dd6e29b1c9987845416dd3fd5d1721fa089dd00ba-0001-Ticket-49076-To-debug-DB_DEADLOCK-condition-allow-to.patch)
Metadata Update from @tbordaz: - Issue close_status updated to: None
Metadata Update from @tbordaz: - Custom field reviewstatus adjusted to review
There is a topology.standalone.config.set('attr', 'value') you can use instead of the MOD_REPLACE in your test if that's easier.
Perhaps this feature should only be enabled in DEBUG mode to prevent accidents where an admin may set this value. "nowait" implies "performance", so they may tweak this and break something.
Hi William,
Thanks for reviewing that patch. My understanding of topology.standalone.config.set is that it sets an attribute in the 'cn=config' entry. This testcase updates a database config attribute (in 'cn=config,cn=ldbm database,cn=plugins,cn=config'), so it would not work here.
This "feature" is not for DEBUG builds, it is for normal delivery. If support identifies a flow of DB_RETRY/DB_DEADLOCK, it could be useful to take pstack samples when the deadlock occurs. That way we would know which components are responsible for them and whether we can improve them.
I see William's point about it being misleading. You suggest using this config option to fall back to the default, and the default in BDB is DB_TXN_WAIT, so maybe you should name the config param "nsslapd-db-transaction-nowait" and have it off by default.
sorry, meant nsslapd-db-transaction-wait
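For illustration, a small sketch that builds an LDIF modify record for the attribute name proposed above, targeting the ldbm database config entry rather than 'cn=config'. The helper build_replace_ldif is hypothetical, and the value "on" is an assumption about how the option would be switched on.

```python
# Entry named in this ticket as the home of the new option.
LDBM_CONFIG_DN = "cn=config,cn=ldbm database,cn=plugins,cn=config"


def build_replace_ldif(dn, attr, value):
    """Return an LDIF 'modify/replace' record for a single attribute."""
    lines = [
        f"dn: {dn}",
        "changetype: modify",
        f"replace: {attr}",
        f"{attr}: {value}",
        "-",  # terminates the modification block
        "",
    ]
    return "\n".join(lines)


# The "on" value is assumed; the attribute name is the one suggested in this thread.
ldif = build_replace_ldif(LDBM_CONFIG_DN, "nsslapd-db-transaction-wait", "on")

if __name__ == "__main__":
    print(ldif)
```

The resulting text could be fed to an LDAP modify tool; the same DN and attribute would be used with a MOD_REPLACE in a test.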
Long pending patch, sorry for the delay.
0002-Ticket-49076-To-debug-DB_DEADLOCK-condition-allow-to.patch (/389-ds-base/issue/raw/files/ff9c202330b663f7b8b34de00a9669b9310d628588781901983bfd1f47a63980-0002-Ticket-49076-To-debug-DB_DEADLOCK-condition-allow-to.patch)
I like this better, it certainly has a negative connotation so I think an admin won't play with it.
Metadata Update from @firstyear: - Custom field reviewstatus adjusted to ack (was: review)
git push origin master
Counting objects: 14, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (14/14), done.
Writing objects: 100% (14/14), 2.89 KiB | 0 bytes/s, done.
Total 14 (delta 11), reused 0 (delta 0)
To ssh://git@pagure.io/389-ds-base.git
   b2f76ab..1179c07  master -> master
Metadata Update from @tbordaz: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/2135
If you want to receive further updates on the issue, please navigate to the GitHub issue and click on the subscribe button.
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: fixed)