#47381 nsslapd-db-transaction-batch-val turns to -1
Closed: Fixed None Opened 6 years ago by nkinder.

After a specific sequence of changes the nsslapd-db-transaction-batch-val
attribute of cn=config,cn=ldbm database,cn=plugins,cn=config entry gets the
value of -1 and doesn't change to any other value without stopping the service
and modifying the dse.ldif

Steps to Reproduce:
1. Start 389 DS with the value 0 in dse.ldif (the default one)
2. Modify the value of nsslapd-db-transaction-batch-val. For example switch
nsslapd-db-transaction-batch-val from 0 to 10. using ldapmodify and the
following ldif file:
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-db-transaction-batch-val
nsslapd-db-transaction-batch-val: 10
3. Restart the dirsrv
4. Try to set the value back to 0 using ldapmodify (based on the documentation
this should not be allowed but should break it too). I use the following ldif:
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-db-transaction-batch-val
nsslapd-db-transaction-batch-val: 0

Actual results:
nsslapd-db-transaction-batch-val get the value -1 and no further modifications
are done to this value via ldapmodify

Expected results:
I would expect to see either value 0 or 1 (given that this is the closest value
we can get) or get an error on the ldapmodify command and keep the value of 10
and of course to allow other valid modifications.


Although this is not the code modified this time, I'm wondering how this combination among PR_Lock on sync_txn_log_flush, PR_NotifyCondVar on sync_txn_log_do_flush and sync_txn_log_flush_done works... If txn_commit_ext holds PR_Lock, log_flush_threadmain needs to wait for the lock released and cannot do the flush task...? I should be missing something...

{{{
… … int dblayer_txn_commit_ext(struct ldbminfo li, back_txn txn, PRBool use_lock)
3721 PR_Lock(sync_txn_log_flush);
3736 PR_NotifyCondVar(sync_txn_log_do_flush);
3744 PR_WaitCondVar(sync_txn_log_flush_done, PR_INTERVAL_NO_TIMEOUT);
3750 PR_Unlock(sync_txn_log_flush);
… … static int log_flush_threadmain(void *param)
4570 PR_Lock(sync_txn_log_flush);
4596 PR_NotifyAllCondVar(sync_txn_log_flush_done);
4609 PR_WaitCondVar(sync_txn_log_do_flush, interval_wait);
4611 PR_Unlock(sync_txn_log_flush);
}}}
It'd be nice if you could set the LDAP_DEBUG_BACKLDBM log level and attach the error log to this ticket... Thanks!

Replying to [comment:4 nhosoi]:

Although this is not the code modified this time, I'm wondering how this combination among PR_Lock on sync_txn_log_flush, PR_NotifyCondVar on sync_txn_log_do_flush and sync_txn_log_flush_done works... If txn_commit_ext holds PR_Lock, log_flush_threadmain needs to wait for the lock released and cannot do the flush task...?

The lock is released while it's waiting for the condition variable, right? Am I missing something?

I should be missing something...

{{{
… … int dblayer_txn_commit_ext(struct ldbminfo li, back_txn txn, PRBool use_lock)
3721 PR_Lock(sync_txn_log_flush);
3736 PR_NotifyCondVar(sync_txn_log_do_flush);
3744 PR_WaitCondVar(sync_txn_log_flush_done, PR_INTERVAL_NO_TIMEOUT);
3750 PR_Unlock(sync_txn_log_flush);
… … static int log_flush_threadmain(void *param)
4570 PR_Lock(sync_txn_log_flush);
4596 PR_NotifyAllCondVar(sync_txn_log_flush_done);
4609 PR_WaitCondVar(sync_txn_log_do_flush, interval_wait);
4611 PR_Unlock(sync_txn_log_flush);
}}}
It'd be nice if you could set the LDAP_DEBUG_BACKLDBM log level and attach the error log to this ticket... Thanks!

I'm not entirely sure what yolu are looking for but here is the logging(during a series of adds):

[26/Jun/2013:10:17:23 -0400] - log_flush_threadmain (wakeup): batchcount: 0, txn_in_progress: 0
[26/Jun/2013:10:17:23 -0400] - log_flush_threadmain (in loop): batchcount: 0, txn_in_progress: 0
[26/Jun/2013:10:17:23 -0400] - log_flush_threadmain (working): batchcount: 0, txn_in_progress: 0
[26/Jun/2013:10:17:23 -0400] - log_flush_threadmain (before notify): batchcount: 0, txn_in_progress: 0
[26/Jun/2013:10:17:23 -0400] - log_flush_threadmain (wakeup): batchcount: 0, txn_in_progress: 0
[26/Jun/2013:10:17:23 -0400] - txn_begin: batchcount: 0, txn_in_progress: 1, curr_txn: 80000065
[26/Jun/2013:10:17:23 -0400] - log_flush_threadmain (in loop): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:23 -0400] - log_flush_threadmain (working): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:23 -0400] - log_flush_threadmain (before notify): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:23 -0400] - log_flush_threadmain (wakeup): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:23 -0400] - txn_commit (befor notify): batchcount: 1, txn_in_progress: 1, curr_txn: 80000065
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (in loop): batchcount: 1, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (working): batchcount: 1, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (before notify): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - txn_commit (before unlock): batchcount: 0, txn_in_progress: 0, curr_txn 80000065
[26/Jun/2013:10:17:24 -0400] - txn_begin: batchcount: 0, txn_in_progress: 1, curr_txn: 8000006c
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (wakeup): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - txn_commit (befor notify): batchcount: 1, txn_in_progress: 1, curr_txn: 8000006c
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (in loop): batchcount: 1, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (working): batchcount: 1, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (before notify): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - txn_commit (before unlock): batchcount: 0, txn_in_progress: 0, curr_txn 8000006c
[26/Jun/2013:10:17:24 -0400] - txn_begin: batchcount: 0, txn_in_progress: 1, curr_txn: 8000006d
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (wakeup): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - txn_commit (befor notify): batchcount: 1, txn_in_progress: 1, curr_txn: 8000006d
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (in loop): batchcount: 1, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (working): batchcount: 1, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (before notify): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - txn_commit (before unlock): batchcount: 0, txn_in_progress: 0, curr_txn 8000006d
[26/Jun/2013:10:17:24 -0400] - txn_begin: batchcount: 0, txn_in_progress: 1, curr_txn: 8000006e
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (wakeup): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - txn_commit (befor notify): batchcount: 1, txn_in_progress: 1, curr_txn: 8000006e
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (in loop): batchcount: 1, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (working): batchcount: 1, txn_in_progress: 1
[26/Jun/2013:10:17:24 -0400] - log_flush_threadmain (before notify): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:25 -0400] - txn_commit (before unlock): batchcount: 0, txn_in_progress: 0, curr_txn 8000006e
[26/Jun/2013:10:17:25 -0400] - txn_begin: batchcount: 0, txn_in_progress: 1, curr_txn: 8000006f
[26/Jun/2013:10:17:25 -0400] - log_flush_threadmain (wakeup): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:25 -0400] - txn_commit (befor notify): batchcount: 1, txn_in_progress: 1, curr_txn: 8000006f
[26/Jun/2013:10:17:25 -0400] - log_flush_threadmain (in loop): batchcount: 1, txn_in_progress: 1
[26/Jun/2013:10:17:25 -0400] - log_flush_threadmain (working): batchcount: 1, txn_in_progress: 1
[26/Jun/2013:10:17:25 -0400] - log_flush_threadmain (before notify): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:25 -0400] - txn_commit (before unlock): batchcount: 0, txn_in_progress: 0, curr_txn 8000006f
[26/Jun/2013:10:17:25 -0400] - txn_begin: batchcount: 0, txn_in_progress: 1, curr_txn: 80000070
[26/Jun/2013:10:17:25 -0400] - log_flush_threadmain (wakeup): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:25 -0400] - txn_commit (befor notify): batchcount: 1, txn_in_progress: 1, curr_txn: 80000070
[26/Jun/2013:10:17:25 -0400] - log_flush_threadmain (in loop): batchcount: 1, txn_in_progress: 1
[26/Jun/2013:10:17:25 -0400] - log_flush_threadmain (working): batchcount: 1, txn_in_progress: 1
[26/Jun/2013:10:17:25 -0400] - log_flush_threadmain (before notify): batchcount: 0, txn_in_progress: 1
[26/Jun/2013:10:17:25 -0400] - txn_commit (before unlock): batchcount: 0, txn_in_progress: 0, curr_txn 80000070
[26/Jun/2013:10:17:25 -0400] - txn_begin: batchcount: 0, txn_in_progress: 1, curr_txn: 80000071

Replying to [comment:5 mreynolds]:

Replying to [comment:4 nhosoi]:

Although this is not the code modified this time, I'm wondering how this combination among PR_Lock on sync_txn_log_flush, PR_NotifyCondVar on sync_txn_log_do_flush and sync_txn_log_flush_done works... If txn_commit_ext holds PR_Lock, log_flush_threadmain needs to wait for the lock released and cannot do the flush task...?

The lock is released while it's waiting for the condition variable, right?
Sure, you are right. Thanks for the logs.

git merge ticket47381
Updating 6e07f4d..690f177
Fast-forward
ldap/servers/slapd/back-ldbm/dblayer.c | 257 ++++++++++++++++++++------------
1 files changed, 158 insertions(+), 99 deletions(-)

git push origin master
Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 2.45 KiB, done.
Total 7 (delta 5), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
6e07f4d..690f177 master -> master

commit 690f177

Metadata Update from @mreynolds:
- Issue assigned to mreynolds
- Issue set to the milestone: 1.3.2 - 07/13 (July)

2 years ago

Login to comment on this ticket.

Metadata