Some IPA updates are expensive in terms of processing and number of page hits. For some common operations, the likelihood of triggering a deadlock in the DS Berkeley DB database can be high.
Sample error messages:
[25/Aug/2020:05:43:32.917438916 -0400] - ERR - NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn - retry (49) the transaction (csn=5f44dd2b000500070000) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[25/Aug/2020:05:43:32.918968080 -0400] - ERR - NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn - Failed to write entry with csn (5f44dd2b000500070000); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[25/Aug/2020:05:43:32.920165783 -0400] - ERR - NSMMReplicationPlugin - write_changelog_and_ruv - Can't add a change for cn=ipausers,cn=groups,cn=accounts,dc=ipadomain,dc=test (uniqid: 43d73aaa-e6b111ea-a193a55b-316987f5, optype: 8) to changelog csn 5f44dd2b000500070000
[25/Aug/2020:05:43:32.927292011 -0400] - ERR - NSMMReplicationPlugin - process_postop - Failed to apply update (5f44dd2b000500070000) error (1). Aborting replication session(conn=91 op=66)
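To check how often a server hits this, one option is to scan the errors log for DB_LOCK_DEADLOCK and collect the affected CSNs. The following is a minimal sketch, not part of the ticket; the function name deadlocked_csns is hypothetical, and the regular expression only assumes the two CSN spellings visible in the sample messages above.

```python
import re

# Matches the CSN in either form seen in the sample messages:
# "(csn=<hex>)" or "csn (<hex>)". A CSN is assumed to be 20 hex digits.
CSN_RE = re.compile(r"csn(?:=|\s+\()([0-9a-f]{20})")


def deadlocked_csns(log_text):
    """Return distinct CSNs found on lines mentioning DB_LOCK_DEADLOCK.

    The order of first appearance is preserved.
    """
    seen = []
    for line in log_text.splitlines():
        if "DB_LOCK_DEADLOCK" not in line:
            continue
        for csn in CSN_RE.findall(line):
            if csn not in seen:
                seen.append(csn)
    return seen
```

Calling deadlocked_csns() on the text of the errors log returns each deadlocked CSN once, however many retries were logged for it.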
Related:
DS upstream ticket: https://pagure.io/389-ds-base/issue/47409
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=979169
When a deadlock is detected, one of the deadlocking threads needs to be rejected to let the other(s) complete. DB_LOCK_YOUNGEST (9) is the DS default: the most recent operation fails in favor of the oldest one. DB_LOCK_MINWRITE (6) means the reader(s) are rejected in favor of the writers, even if the reader(s) are older.
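The difference between the two policies can be illustrated with a toy model. This is only a sketch and not how Berkeley DB is implemented: the Locker class, its fields and pick_victim are hypothetical names, "youngest" is approximated by start order, and MINWRITE is modeled as "fewest write locks held", which is why readers lose against writers.

```python
from dataclasses import dataclass

# Policy values as quoted in this ticket.
DB_LOCK_MINWRITE = 6
DB_LOCK_YOUNGEST = 9


@dataclass
class Locker:
    name: str
    start_order: int  # lower means the operation started earlier (older)
    write_locks: int  # number of write locks currently held


def pick_victim(lockers, policy):
    """Pick which deadlocked locker is rejected under the given policy."""
    if policy == DB_LOCK_YOUNGEST:
        # The most recently started operation is rejected.
        return max(lockers, key=lambda locker: locker.start_order)
    if policy == DB_LOCK_MINWRITE:
        # The locker holding the fewest write locks is rejected.
        return min(lockers, key=lambda locker: locker.write_locks)
    raise ValueError(f"policy {policy} is not modeled in this sketch")


# An older search (reader) deadlocks with a newer replicated update (writer).
reader = Locker("search", start_order=1, write_locks=0)
writer = Locker("replicated update", start_order=2, write_locks=3)
```

With these two lockers, DB_LOCK_YOUNGEST rejects the replicated update because it started later, while DB_LOCK_MINWRITE rejects the search because it holds no write locks.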
Proposal: switch the default for FreeIPA to DB_LOCK_MINWRITE, both for new installs and for existing installs at update time.
This depends on the DS backend redesign (https://pagure.io/389-ds-base/issue/49476) and therefore applies only to 389-DS 1.4.2.3 and higher.
Explanation provided by Thierry Bordaz.
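For illustration only, the proposed setting could be expressed as an LDIF modification, generated here by a short Python sketch. It is not taken from the linked PR. The attribute name nsslapd-db-deadlock-policy and the DN cn=bdb,cn=config,cn=ldbm database,cn=plugins,cn=config are assumptions about where 389-DS keeps this setting after the backend redesign, and should be checked against the 389-DS documentation. The function names are hypothetical, and parse_version only handles plain dotted numeric versions.

```python
# Minimum 389-DS version named in this ticket.
MIN_DS_VERSION = (1, 4, 2, 3)

# Assumed location and name of the setting (verify before use).
BDB_CONFIG_DN = "cn=bdb,cn=config,cn=ldbm database,cn=plugins,cn=config"
POLICY_ATTR = "nsslapd-db-deadlock-policy"

DB_LOCK_MINWRITE = 6


def parse_version(text):
    """Turn a dotted numeric version such as '1.4.2.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def deadlock_policy_ldif(ds_version, policy=DB_LOCK_MINWRITE):
    """Build an LDIF modify record that sets the deadlock policy."""
    if parse_version(ds_version) < MIN_DS_VERSION:
        raise ValueError("389-DS 1.4.2.3 or higher is required")
    lines = [
        f"dn: {BDB_CONFIG_DN}",
        "changetype: modify",
        f"replace: {POLICY_ATTR}",
        f"{POLICY_ATTR}: {policy}",
    ]
    return "\n".join(lines) + "\n"
```

The returned text could be passed to a tool such as ldapmodify. FreeIPA applies the change through its own install and update code, which this sketch does not reproduce.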
Metadata Update from @fcami: - Custom field on_review adjusted to https://github.com/freeipa/freeipa/pull/5071
Also see https://pagure.io/freeipa/issue/8480
Metadata Update from @fcami: - Custom field changelog adjusted to When a 389-DS database (BDB) deadlock is detected, one deadlocking thread needs to be rejected to let the other(s) complete. DB_LOCK_YOUNGEST (9) is the DS default: it means the most recent operation fails in favor of the oldest one. DB_LOCK_MINWRITE (6) means the reader(s) are rejected in favor of the writers even if the reader(s) are older. Switch the default for FreeIPA to DB_LOCK_MINWRITE to give replication and other writing threads a better priority.
Closing as linked PR was merged.
Metadata Update from @antorres: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)