Investigations show that setting nsslapd-db-transaction-batch-val does improve the performance of concurrent modify operations, but its use violates transaction durability: the flush of the txn log is delayed by a few operations or until the log_flush_thread runs. To make this feature usable, the synchronization between the worker threads and the flush thread needs to be improved.
attachment 0001-snchronize-logflush-thread-with-worker-trheads-confi.patch
Just attached the current version I am using for tests.
attachment 0001-Ticket-568-make-usage-of-transaction-batch-flush-dur.patch
Hi Ludwig,
The code looks good to me. I just wonder whether, since last_flush is not initialized in log_flush_threadmain, the interval_flush might not be enforced.
I have a question about why the transaction commit and the flush are not separated. Here, basically, the worker thread commits the txn, waits for the txn flush (or flushes immediately), then releases the backend, then returns the operation result. Couldn't we swap steps 2 and 3? Once the txn is committed, release the backend. We still have durability because the operation result has not been returned yet. Then we flush (with or without batching) and return the result. The gain would be that the backend is released before the txn log I/O is done.
best regards thierry
Yes, last_flush should be initialized, but I think the missing initialization could only delay the first flush until one of the other conditions is met.
I have left the order of txn_commit, flushing, and dblayer_unlock_backend unchanged in this patch, but with ticket 47358 a switch to reverse the order will be provided. I did mention in the patch description that there could be a greater benefit if the flushing were moved to just before send_result, but I have not tested this.
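The two orderings under discussion can be written down as a small trace sketch. The function names and the flush_after_unlock switch are hypothetical and only record the order of the steps; they are not the dblayer API or the switch planned in ticket 47358.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append one step name to a comma-separated, NUL-terminated trace. */
static void step(char *trace, size_t cap, const char *name)
{
    size_t used = strlen(trace);
    if (used < cap)
        snprintf(trace + used, cap - used, "%s%s", used ? "," : "", name);
}

/*
 * Hypothetical end of a modify operation.
 * flush_after_unlock == false: order kept by the patch.
 * flush_after_unlock == true:  backend released before the log I/O.
 * In both cases the flush happens before the result is sent.
 */
static void finish_modify(bool flush_after_unlock, char *trace, size_t cap)
{
    assert(cap > 0);
    trace[0] = '\0';
    step(trace, cap, "commit");
    if (!flush_after_unlock)
        step(trace, cap, "flush");
    step(trace, cap, "unlock");
    if (flush_after_unlock)
        step(trace, cap, "flush");
    step(trace, cap, "send_result");
}
```

With flush_after_unlock set to false the trace is commit,flush,unlock,send_result; with true it is commit,unlock,flush,send_result. Durability holds in both because the client never sees a result before the flush.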
I did test this patch in conjunction with 47358 and saw a deadlock. I had run a previous version of this fix in a large number of performance tests, so maybe I missed something when porting; I will have to investigate.
Looks good to me. Ack'ed.
So, you found an answer to this question? ;)
/* LK this is only needed if online change of
 * of txn config is supported ???
 */
Replying to [comment:8 lkrispen]:
The reason was that in dblayer_txn_abort_ext the txn-in-progress counter was always decremented, independent of use lock. I didn't see this in my previous tests because they also began the txn with DB_TXN_WAIT, so there were almost no retries.
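A minimal sketch of how such an unconditional decrement unbalances the counter. It assumes the begin side only counts a txn when use lock is set, which is not stated in this ticket; txn_env and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical environment; the real counter lives in the dblayer code. */
typedef struct { int in_progress; } txn_env;

/* Assumption: begin only counts the txn when use_lock is set. */
static void txn_begin(txn_env *e, bool use_lock)
{
    if (use_lock)
        e->in_progress++;
}

/* Behaviour as described: decrement regardless of use_lock. */
static void txn_abort_buggy(txn_env *e, bool use_lock)
{
    (void)use_lock;
    e->in_progress--;
}

/* Symmetric with txn_begin: decrement only when the txn was counted. */
static void txn_abort_fixed(txn_env *e, bool use_lock)
{
    if (use_lock)
        e->in_progress--;
}

/* Simulate `retries` aborted attempts made with use_lock off. */
static int counter_after_retries(int retries, bool fixed)
{
    txn_env e = { 0 };
    assert(retries >= 0);
    for (int i = 0; i < retries; i++) {
        txn_begin(&e, false);
        if (fixed)
            txn_abort_fixed(&e, false);
        else
            txn_abort_buggy(&e, false);
    }
    return e.in_progress;
}
```

With the unconditional decrement every retry pushes the counter one further below zero, so a thread that decides when to flush from this counter can wait for a state that never arrives. With few retries, as in the DB_TXN_WAIT tests, the drift is hardly visible.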
checking in original fix:
Updating b4a4ef5..37c531d
Fast-forward
 ldap/servers/slapd/back-ldbm/dblayer.c | 190 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------
 ldap/servers/slapd/back-ldbm/ldbm_config.c | 2 ++
 ldap/servers/slapd/back-ldbm/ldbm_config.h | 2 ++
 ldap/servers/slapd/back-ldbm/proto-back-ldbm.h | 4 ++++
 4 files changed, 171 insertions(+), 27 deletions(-)
$ git push origin master
Counting objects: 19, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (10/10), done.
Writing objects: 100% (10/10), 3.83 KiB, done.
Total 10 (delta 8), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
 b4a4ef5..37c531d master -> master
attachment 0001-fix-hang-related-to-ticket-568.patch
Added a patch for the hang, and committed:
Updating 37c531d..e2a5faf
Fast-forward
 ldap/servers/slapd/back-ldbm/dblayer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
$ git push origin master
Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 632 bytes, done.
Total 7 (delta 5), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
 37c531d..e2a5faf master -> master
Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1044215
Metadata Update from @lkrispen: - Issue assigned to lkrispen - Issue set to the milestone: 1.3.2 - 05/13 (May)
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/568
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: Fixed)