#8479 DS/BDB: Deadlock during writes
Closed: fixed a year ago by antorres. Opened 3 years ago by fcami.

Some IPA updates are expensive in terms of processing and the number of database pages they touch.
The likelihood of triggering a DS Berkeley DB database deadlock can be high
for some common operations.

Sample error messages:

[25/Aug/2020:05:43:32.917438916 -0400] - ERR - NSMMReplicationPlugin -
changelog program - _cl5WriteOperationTxn - retry (49) the transaction
(csn=5f44dd2b000500070000) failed (rc=-30993 (BDB0068
DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[25/Aug/2020:05:43:32.918968080 -0400] - ERR - NSMMReplicationPlugin -
changelog program - _cl5WriteOperationTxn - Failed to write entry with
csn (5f44dd2b000500070000); db error - -30993 BDB0068
DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[25/Aug/2020:05:43:32.920165783 -0400] - ERR - NSMMReplicationPlugin -
write_changelog_and_ruv - Can't add a change for
cn=ipausers,cn=groups,cn=accounts,dc=ipadomain,dc=test (uniqid:
43d73aaa-e6b111ea-a193a55b-316987f5, optype: 8) to changelog csn
5f44dd2b000500070000
[25/Aug/2020:05:43:32.927292011 -0400] - ERR - NSMMReplicationPlugin -
process_postop - Failed to apply update (5f44dd2b000500070000) error
(1).  Aborting replication session(conn=91 op=66)

Related:
DS upstream ticket: https://pagure.io/389-ds-base/issue/47409
Bz: https://bugzilla.redhat.com/show_bug.cgi?id=979169

When a deadlock is detected, one of the deadlocking threads needs to be rejected to let the other(s) complete.
DB_LOCK_YOUNGEST (9) is the DS default: the most recent operation fails in favor of the older one.
DB_LOCK_MINWRITE (6) means the reader(s) are rejected in favor of the writers, even if the reader(s) are older.
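
To illustrate the difference between the two policies, here is a minimal Python sketch of victim selection. It is a toy model, not the Berkeley DB implementation: the `Locker` class, its fields and `pick_victim` are hypothetical names introduced for illustration, and MINWRITE is modelled as "reject the locker holding the fewest write locks", which is why pure readers lose.

```python
from dataclasses import dataclass

# Policy values as quoted in this ticket.
DB_LOCK_MINWRITE = 6
DB_LOCK_YOUNGEST = 9


@dataclass(frozen=True)
class Locker:
    """Hypothetical stand-in for a thread taking part in a deadlock."""
    name: str
    started_at: int   # logical start time; a larger value means a younger locker
    write_locks: int  # number of write locks currently held


def pick_victim(lockers, policy):
    """Return the locker that is rejected so the others can complete."""
    if policy == DB_LOCK_YOUNGEST:
        # The most recent operation fails in favor of the older ones.
        return max(lockers, key=lambda l: l.started_at)
    if policy == DB_LOCK_MINWRITE:
        # The locker with the fewest write locks fails, so readers lose first.
        return min(lockers, key=lambda l: l.write_locks)
    raise ValueError(f"unsupported policy: {policy}")


deadlocked = [
    Locker("search (reader)", started_at=1, write_locks=0),
    Locker("replication write", started_at=2, write_locks=3),
]

print(pick_victim(deadlocked, DB_LOCK_YOUNGEST).name)
print(pick_victim(deadlocked, DB_LOCK_MINWRITE).name)
```

In this model the younger replication write is rejected under DB_LOCK_YOUNGEST, while the older reader is rejected under DB_LOCK_MINWRITE, which matches the behaviour described above.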

Proposal: switch the default for FreeIPA to DB_LOCK_MINWRITE for new installs and also existing installs at update time.

This change depends on the DS backend redesign (https://pagure.io/389-ds-base/issue/49476) and is therefore valid on 389-DS 1.4.2.3 and higher.
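
For reference, a sketch of what such a setting change could look like as an LDIF modification, rendered here from Python. The entry DN and the attribute name `nsslapd-db-deadlock-policy` are assumptions that do not appear in this ticket and should be checked against the 389-DS version in use. `deadlock_policy_ldif` is a hypothetical helper, and this is not necessarily how the linked PR applies the change.

```python
# Assumed location and name of the setting; verify against your 389-DS version.
BDB_CONFIG_DN = "cn=bdb,cn=config,cn=ldbm database,cn=plugins,cn=config"
DEADLOCK_POLICY_ATTR = "nsslapd-db-deadlock-policy"

# Policy value as quoted in this ticket.
DB_LOCK_MINWRITE = 6


def deadlock_policy_ldif(policy: int) -> str:
    """Render an LDIF 'modify' record that replaces the deadlock policy value."""
    return "\n".join([
        f"dn: {BDB_CONFIG_DN}",
        "changetype: modify",
        f"replace: {DEADLOCK_POLICY_ATTR}",
        f"{DEADLOCK_POLICY_ATTR}: {policy}",
        "",
    ])


print(deadlock_policy_ldif(DB_LOCK_MINWRITE))
```

The rendered text could then be passed to a standard LDAP client such as `ldapmodify`.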

Explanation provided by Thierry Bordaz.


Metadata Update from @fcami:
- Custom field on_review adjusted to https://github.com/freeipa/freeipa/pull/5071

3 years ago

Metadata Update from @fcami:
- Custom field changelog adjusted to When a 389-DS database (BDB) deadlock is detected, one of the deadlocking threads needs to be rejected to let the other(s) complete. DB_LOCK_YOUNGEST (9) is the DS default: the most recent operation fails in favor of the older one. DB_LOCK_MINWRITE (6) means the reader(s) are rejected in favor of the writers, even if the reader(s) are older. Switch the default for FreeIPA to DB_LOCK_MINWRITE to give replication and other writing threads a higher priority.

3 years ago

Closing as linked PR was merged.

Metadata Update from @antorres:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

a year ago
