#47412 Modify RUV should be serialized in ldbm_back_modify/add
Closed: wontfix. Opened 9 years ago by nhosoi.

The error occurs when two separate threads try to update the RUV at the same time. In the attached stack traces, one update comes from replica_replace_ruv_tombstone and the other from the DNA plugin. The problem is that replica_replace_ruv_tombstone modifies the RUV with the operation flags OP_FLAG_REPLICATED | OP_FLAG_REPL_FIXUP | OP_FLAG_REPL_RUV, which causes the backend lock to be skipped. If RUV updates are not allowed to bypass the backend lock, my stress test keeps running without the "Retry count exceeded" errors.


Stack traces showing the RUV being accessed by two threads at the same time
stacktraces.txt

Bug Description: The current ldbm_back_modify allows RUV updates to
proceed without respecting other threads in the backend's critical
section. This lets two threads modify the RUV at the same time in
two different transactions, which causes DB deadlocks.

Fix Description: This patch changes the policy that let RUV updates
skip the backend serial lock, so that RUV updates are serialized
with other backend operations.
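
The policy change described above can be sketched as a pair of predicates on the operation flags. This is only an illustration, not the code from the patch: the flag names come from this ticket, but the bit values are placeholders, the two function names are hypothetical, and the ticket does not say which of the three flags triggers the skip (the sketch assumes it is OP_FLAG_REPL_FIXUP).

```c
#include <stdbool.h>

/* Placeholder bit values for illustration only; the real values are
 * defined in the 389-ds-base headers and may differ. */
#define OP_FLAG_REPLICATED 0x01u
#define OP_FLAG_REPL_FIXUP 0x02u
#define OP_FLAG_REPL_RUV   0x04u

/* Assumed pre-patch policy (hypothetical helper): an operation carrying
 * the fixup flag does not take the backend serial lock. */
static bool takes_backend_lock_before(unsigned int flags)
{
    return (flags & OP_FLAG_REPL_FIXUP) == 0;
}

/* Assumed post-patch policy (hypothetical helper): fixup operations
 * still skip the lock, except when the operation is an RUV update. */
static bool takes_backend_lock_after(unsigned int flags)
{
    bool is_fixup = (flags & OP_FLAG_REPL_FIXUP) != 0;
    bool is_ruv = (flags & OP_FLAG_REPL_RUV) != 0;
    return !is_fixup || is_ruv;
}
```

With the flag combination used by replica_replace_ruv_tombstone, the first predicate returns false (lock skipped) and the second returns true (lock taken), which is the behavior change the fix description asks for.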

Is it only modify ops that need to lock for the RUV? Is it possible that an add operation could have the same problem?

Replying to [comment:3 rmeggins]:

Yeah, I also thought about it, and I was not sure whether the RUV can be added/deleted/renamed (modrdn) on disk under contention... :) I assumed not, but can it?

Replying to [comment:4 nhosoi]:

add: possibly. I'm not sure under what conditions the RUV can be added. delete/modrdn: probably not.

Replying to [comment:6 rmeggins]:

All right. I'm testing "add RUV" with the backend lock enabled...

git patch file (master) -- take 2: add the same change to ldbm_back_add
0001-Ticket-47412-Modify-RUV-should-be-serialized-in-ldbm.2.patch

Thanks to Rich for his comments. I've attached the second patch reflecting his comments.

Reviewed by Rich (Thank you!!)

Pushed to 389-ds-base-1.2.11: commit bc62f82

Metadata Update from @nhosoi:
- Issue set to the milestone: 1.2.11.22

5 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/750

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

2 years ago
