#509 lock-free access to be->be_suffixlock
Closed: Fixed None Opened 7 years ago by rmeggins.

systemtap analysis shows a lot of contention for this lock - investigate a lock-free way to access this e.g. get the value at the beginning of the operation and store in thread local storage, use hazard pointers/ref counting, others


Since the be_suffix is accessed as part of the plugin API, it is not possible to use local thread storage/hazard pointers. Used read/write lock instead.

The suffixlock is still present in the latest patch. Otherwise, looks good.

Replying to [comment:5 rmeggins]:

The suffixlock is still present in the latest patch. Otherwise, looks good.

Rich, Ludwig pointed out that we still need serialization for adding suffixes - so the lock is still needed for "adds", but adds are extremely rare and I feel this lock is acceptable.

Ludwig's comment:


And I think be_addsuffix is not safe.

If two threads try to add a suffix, both can get the same current count, the both set the new suffix
be->be_suffix[count]= slapi_sdn_dup(suffix);

and then both increment count. the increment is atomic, but the assignment could be done to the wrong index.

Replying to [comment:6 mreynolds]:

Replying to [comment:5 rmeggins]:

The suffixlock is still present in the latest patch. Otherwise, looks good.

Rich, Ludwig pointed out that we still need serialization for adding suffixes - so the lock is still needed for "adds", but adds are extremely rare and I feel this lock is acceptable.

Ludwig's comment:


And I think be_addsuffix is not safe.

If two threads try to add a suffix, both can get the same current count, the both set the new suffix
be->be_suffix[count]= slapi_sdn_dup(suffix);

How can both get the same count? In the patch, suffixlock is acquired before count is assigned or incremented. The mutex prevents another thread from accessing suffixcounter.

and then both increment count. the increment is atomic, but the assignment could be done to the wrong index.

Replying to [comment:7 rmeggins]:

Replying to [comment:6 mreynolds]:

Replying to [comment:5 rmeggins]:

The suffixlock is still present in the latest patch. Otherwise, looks good.

Rich, Ludwig pointed out that we still need serialization for adding suffixes - so the lock is still needed for "adds", but adds are extremely rare and I feel this lock is acceptable.

Ludwig's comment:


And I think be_addsuffix is not safe.

If two threads try to add a suffix, both can get the same current count, the both set the new suffix
be->be_suffix[count]= slapi_sdn_dup(suffix);

How can both get the same count? In the patch, suffixlock is acquired before count is assigned or incremented. The mutex prevents another thread from accessing suffixcounter.
I have since revised the patch, the patch Ludwig reviewed did not have any locking. So I added the locking back to the "add" function.

and then both increment count. the increment is atomic, but the assignment could be done to the wrong index.

git merge ticket509
Merge made by recursive.
ldap/servers/slapd/backend.c | 34 ++++++++++++++++++----------------
ldap/servers/slapd/backend_manager.c | 30 +++++++++++++++---------------
ldap/servers/slapd/slap.h | 2 +-
3 files changed, 34 insertions(+), 32 deletions(-)

git push origin master
Counting objects: 24, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (13/13), done.
Writing objects: 100% (13/13), 1.72 KiB, done.
Total 13 (delta 9), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
4e9aab8..cd6a6dd master -> master

Why was this a merge commit? We should avoid merge commits if at all possible.

Replying to [comment:11 rmeggins]:

Why was this a merge commit? We should avoid merge commits if at all possible.

Sorry I forgot to rebase before I did the merge.

I still think it is not safe. If you don't have the lock in slapi_be_issuffix and traverse be_suffix an other thread in be_addsuffix could have reallocated be_suffix

The last comment also applies to slapi_lookup_instance_name_by_suffix()

There is this comment in backend.c:
{{{
/
* The caller may use the returned pointer without holding the
* be_suffixlock since we never remove suffixes from the array.
* The Slapi_DN pointer will always be valid even though the array
* itself may be changing due to the addition of a suffix.
/
}}}
I think the assumption is that slapi_ch_realloc() in be_addsuffix will alloc be->be_suffix "in place". If this is false, then we must not allow two threads to access be->be_suffix. The traditional way to do this is with a mutex which we already have. If we are trying to get rid of contention on this mutex, we will have to use some lock-free technique like hazard pointers or thread local storage. I think it should be possible to do that, even though the be suffix functions are part of the public API.

I don't know how else we can do this. From what I read, hazard pointers need local thread storage, and since plugins can create their own threads this doesn't seem to be an option.

Should I just revert the fix for now?

I read the comment that caller can use the pointer to slapi_dn without holding the lock since it will never be removed. A realloc will copy the contents of the old alloc to the new, but I think there is no guarantee that the memory can just be extended, so we need the lock when iterating be_suffix.

Replying to [comment:17 mreynolds]:

I don't know how else we can do this. From what I read, hazard pointers need local thread storage, and since plugins can create their own threads this doesn't seem to be an option.

If we need the pointers set up at the time of thread creation, we can force plugin writers to use a slapi thread creation function wrapper, and initialize the thread local data for the hazard pointers in that function. If there are threads not created by this function, then all bets are off.

Or perhaps there is some other lock-free technique for doing this that we have not investigated.

Should I just revert the fix for now?

Or just add back the locking - the slapi counter stuff is ok.

Replying to [comment:18 lkrispen]:

I read the comment that caller can use the pointer to slapi_dn without holding the lock since it will never be removed. A realloc will copy the contents of the old alloc to the new, but I think there is no guarantee that the memory can just be extended, so we need the lock when iterating be_suffix.

Then there is no guarantee that a caller of slapi_be_getsuffix will have it's Slapi_DN changed out from under it, and either the caller will have to hold be->be_suffixlock for the duration of the use of the Slapi_DN* returned by slapi_be_getsuffix, or we will have to find a lock-free way to do this.

Replying to [comment:19 rmeggins]:

Replying to [comment:17 mreynolds]:

I don't know how else we can do this. From what I read, hazard pointers need local thread storage, and since plugins can create their own threads this doesn't seem to be an option.

If we need the pointers set up at the time of thread creation, we can force plugin writers to use a slapi thread creation function wrapper, and initialize the thread local data for the hazard pointers in that function. If there are threads not created by this function, then all bets are off.

Or perhaps there is some other lock-free technique for doing this that we have not investigated.

Should I just revert the fix for now?

Or just add back the locking - the slapi counter stuff is ok.
We don't need the counter is we are using the locks again. I guess I can go back to my original fix of using read/write lockers.

Another option would be to use a linked list for be_suffix instead of reallocating the array after each add.

Replying to [comment:22 mreynolds]:

Another option would be to use a linked list for be_suffix instead of reallocating the array after each add.

Good idea - avoid the realloc

Looks good. Please change roles_cache.c and ldbm_fetch_subtrees() to use slapi_be_getsuffix(be, 0) to get the first element of the suffix list rather than accessing the suffix list directly. We don't want to expose the implementation details outside of backend.c and backend_manager.c

Overall, the patch looks good to me. 3 minor comments...

1) variable i declared at line 180 is not initialized?
{{{
169 173 slapi_be_issuffix( const Slapi_Backend be, const Slapi_DN suffix )
[...]
179 if ( slapi_sdn_compare( be->be_suffix[i], suffix ) == 0)
180 {
181 r= 1;
180 int i, count;
181
182 count = slapi_counter_get_value(be->be_suffixcounter);
183 list = be->be_suffixlist;
184 while(list && i < count){
}}}

2) You may want to update the comment. ;)
{{{
223 /
224 * The caller may use the returned pointer without holding the
225 * be_suffixlock since we never remove suffixes from the array. <==
226 * The Slapi_DN pointer will always be valid even though the array
227 * itself may be changing due to the addition of a suffix.
228
/
229 const Slapi_DN
230 slapi_be_getsuffix(Slapi_Backend
be,int n)
}}}

3) variable sdn is not needed any more?
{{{
229 242 const Slapi_DN *
230 243 slapi_be_getsuffix(Slapi_Backend be,int n)
231 244 {
245 struct suffixlist
list;
232 246 Slapi_DN *sdn = NULL;
}}}

One more... I'm checking the compiler warnings now. Variable "count" at the line 208 is not used any more...

… … be_addsuffix(Slapi_Backend be,const Slapi_DN suffix)
197 204 {
198 205 if (be->be_state != BE_STATE_DELETED)
199 206 {
207 struct suffixlist new_suffix, list;
200 208 int count; <==

git merge ticket509
Updating fc80262..1b6f39c
Fast-forward
ldap/servers/plugins/roles/roles_cache.c | 4 +-
ldap/servers/slapd/back-ldbm/ldif2ldbm.c | 2 +-
ldap/servers/slapd/backend.c | 81 ++++++++++++++++++-----------
ldap/servers/slapd/backend_manager.c | 19 ++++---
ldap/servers/slapd/slap.h | 8 ++-
5 files changed, 71 insertions(+), 43 deletions(-)

[mareynol@localhost ds]$ git push origin master
Counting objects: 25, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (13/13), done.
Writing objects: 100% (13/13), 1.96 KiB, done.
Total 13 (delta 11), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
fc80262..1b6f39c master -> master

Fix Description:
Previous commit 1b6f39c had a malloc
size problem. Instead of the size of pointer, the size of struct
suffixlist has to be allocated.

{{{
Valgrind output:
==1702== Invalid write of size 8
==1702== at 0x4C5A790: be_addsuffix (backend.c:211)
==1702== by 0x4CD6975: init_schema_dse_ext (schema.c:4163)
==1702== by 0x4CD6A76: init_schema_dse (schema.c:4195)
==1702== by 0x41E26B: setup_internal_backends (fedse.c:1770)
==1702== by 0x42029C: main (main.c:834)
==1702== Address 0x812c3c8 is 0 bytes after a block of size 8 alloc'd
==1702== at 0x4A0881C: malloc (vg_replace_malloc.c:270)
==1702== by 0x4C5D9C7: slapi_ch_malloc (ch_malloc.c:155)
==1702== by 0x4C5A774: be_addsuffix (backend.c:209)
==1702== by 0x4CD6975: init_schema_dse_ext (schema.c:4163)
==1702== by 0x4CD6A76: init_schema_dse (schema.c:4195)
==1702== by 0x41E26B: setup_internal_backends (fedse.c:1770)
==1702== by 0x42029C: main (main.c:834)
}}}

Reviewed by Mark (Thank you!!)

Pushed to master: commit 5bd932e

Pushed to 389-ds-base-1.3.0: commit 69d2dd5

Metadata Update from @lkrispen:
- Issue assigned to mreynolds
- Issue set to the milestone: 1.3.0

3 years ago

Login to comment on this ticket.

Metadata