#49305 Improve atomic performance in DS
Closed: fixed 2 years ago Opened 3 years ago by firstyear.

Issue Description

We currently use PR_Atomic operations through out the server. These are equivalent to the SEQ_CST model of atomics in __atomic* ops.

We have already seen a large increase in performance through changing to __atomic types in slapi_counter.

We should make two changes:

First, we should use the correct release/acquire semantics as needed.

Second, we should replace the use of PR_Atomic throughout the server.

This should yield an improvement in thread performance due to reduction in atomic stalls throughout the system.

Package Version and Platform

Steps to reproduce

1.
2.
3.

Actual results

Expected results


Metadata Update from @firstyear:
- Custom field type adjusted to defect
- Issue assigned to firstyear
- Issue priority set to: critical
- Issue set to the milestone: 1.3.7.0
- Issue tagged with: Complex, Performance

3 years ago

Metadata Update from @firstyear:
- Custom field reviewstatus adjusted to review

3 years ago

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to ack (was: review)

2 years ago

Thanks @mreynolds ! Next I want to fix the remaining atomics in CSN and related, but I want to talk to @lkrispen first :)

commit 35af6c6
To ssh://git@pagure.io/389-ds-base.git
3abcd68..35af6c6 master -> master

There are platforms on RHEL that still do not support these atomics, this is breaking builds for 7.5. Working on fix...

Metadata Update from @mreynolds:
- Custom field component adjusted to None
- Custom field origin adjusted to None
- Custom field version adjusted to None

2 years ago

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to review (was: ack)

2 years ago

Patch look good.
Few remarks:

  • why not using slapi_atomic* in nunc-stans code
  • in slapi_atomic* the is "Futur" comment containing code. Why not using a ifdef ?
  • in the same "Futur" code it is looking there is a wrong cut/paste referencing 'val' that does not exist

Patch look good.
Few remarks:

why not using slapi_atomic* in nunc-stans code

Well nunc-stans is kind of a separate entity. It currently does not use anything from slapd (no slap.h or slapi-plugin.h, etc). I did not look too deeply into it as I was afraid it might be overkill or a can of worms.

in slapi_atomic* the is "Futur" comment containing code. Why not using a ifdef ?

I was just trying to avoid too many #ifdefs :-) Makes code hard to read - not that my comment block is much better. Maybe I will just move it into the function's description comment and out of the function's code block.

in the same "Futur" code it is looking there is a wrong cut/paste referencing 'val' that does not exist

Yeah, you're right. I was really just trying to demonstrate that there is a another level of int sizes that we can potentially use (I did not want it to get forgotten). Right now (int128) it's not available on 32bit platforms, and we also have no need for the 16 bit types at this time.

The patch looks good to me. ACK
Before pushing it you need to change in config_set_max_filter_nest_level the format from ATOMIC_LONG to ATOMIC_INT (it is a slapi_int_t).

683 @@ -6054,7 +6058,7 @@ config_set_max_filter_nest_level(const char *attrname, char *value, char *errorb
684          return retVal;
685      }
686  
687 -    __atomic_store_4(&(slapdFrontendConfig->max_filter_nest_level), level, __ATOMIC_RELEASE);
688 +    slapi_atomic_store(&(slapdFrontendConfig->max_filter_nest_level), &level, __ATOMIC_RELEASE, ATOMIC_INT);
689      return retVal;
690  }

Metadata Update from @tbordaz:
- Custom field reviewstatus adjusted to ack (was: review)

2 years ago

Nice catch Thierry, change made and pushed:

83f04fe..af723fd master -> master

6e8ffe0..4edf9f1 389-ds-base-1.3.7 -> 389-ds-base-1.3.7

This was the revised patch:

0001-Ticket-49305-Need-to-wrap-atomic-calls.patch

Yeah, @mreynolds is right not to touch nunc-stans. It's a bit of a crazy piece of code. I think your patch is okay, PR_Atomics are just the same as the others but with a SeqCST rather than acq/rel.

I'm not sure how long this will matter for though. We don't plan to keep hpux support for a long time, and I want to get the ball rolling on dropping 32bit support, so I wonder how long this function wrapper will last? As well, some of the COW codeI want to add should remove this, because let's face it, raw atomics require too much knowledge, and even I mess it up.

Anyway, it's a bit late now, but I would rather have seen this have seperately sized functions for 64 / 32 bit atomics, just to avoid the extra if statement. But it's a bit late now I think. :(

Yeah, @mreynolds is right not to touch nunc-stans. It's a bit of a crazy piece of code. I think your patch is okay, PR_Atomics are just the same as the others but with a SeqCST rather than acq/rel.
I'm not sure how long this will matter for though. We don't plan to keep hpux support for a long time, and I want to get the ball rolling on dropping 32bit support, so I wonder how long this function wrapper will last? As well, some of the COW codeI want to add should remove this, because let's face it, raw atomics require too much knowledge, and even I mess it up.
Anyway, it's a bit late now, but I would rather have seen this have seperately sized functions for 64 / 32 bit atomics, just to avoid the extra if statement. But it's a bit late now I think. :(

Well its committed, but it can always be changed, especially this early into the 7.5 cycle.

Note this is not just hpux, but also RHEL (ppc to be specific).

If this doesn't work on PPC, then that's a bug in glibc. the __atomic wrappers are MEANT to fall back to wrapping a mutex around the value inthe case that it's not able to be supported at a cpu level on the platform. So I think that this is a bug in glibc.

For HPUX this is needed, so I'm okay with it in 7.5/1.3.7, but for 1.4.x I'd love to see us remove raw access to atomics - they exist because we don't have better techniques in place. :)

If this doesn't work on PPC, then that's a bug in glibc. the __atomic wrappers are MEANT to fall back to wrapping a mutex around the value inthe case that it's not able to be supported at a cpu level on the platform. So I think that this is a bug in glibc.

It didn't recognize the __atomic_*_8 functions at build time, but... I guess it did for the 4's. So something is off, and the solution for now was to do a wrapper. I was also hoping the wrappers would make it easier to use/understand - I tried adding decent comments in the code for people who aren't familiar with it yet.

For HPUX this is needed,

People are also still building on Solaris as well - this could be useful for them too. I think I've seen people recently having issues with the built-ins as well.

so I'm okay with it in 7.5/1.3.7, but for 1.4.x I'd love to see us remove raw access to atomics - they exist because we don't have better techniques in place. :)

Another option could be on platforms where we know we don't have these we could provide our own symbols for __atomic_* types instead.

At the least, I would prefer this to have a distinct 32 vs 64 call (and to avoid the use of the term "int and long", long means 32 bit, long long is 64. Let's use 32 vs 64 in the function names to be clear about what we are doing. As trivial as it seems, the if statements are an overhead, especially given that we really do know the sizing of the value at compile time, and those cpu cycles add up.

Another option could be on platforms where we know we don't have these we could provide our own symbols for __atomic_* types instead.
At the least, I would prefer this to have a distinct 32 vs 64 call (and to avoid the use of the term "int and long", long means 32 bit, long long is 64. Let's use 32 vs 64 in the function names to be clear about what we are doing.

Actually this is not correct in regards to the __atomic function ptr sizes.

https://gcc.gnu.org/ml/gcc/2016-11/txt6ZlA_JS27i.txt

A "long" uses the "_8" functions, not the "_4" functions. So it's not as clear as 32 vs 64 bit when it comes to the sizeof the pointer type. I'm still okay breaking it up, but its confusing as it is, so I don't want to make it more confusing. To simplify things we should "enforce/emphasize" that int32_t and uint64_t are the only types to be used, then breaking it up into separate functions should be easier to consume/understand by other developers.

I also would be happy if the result would be something I can understand :-)

So, if you want to limit the use of atomics to int32_t and uint64_t, why not define atomic types: _atomic_32 and _atomic_64 and a corresponding set of functions. Any user of atomics should already when declaring a counter or whatever which should be atomic should use the atomic types and not just some type which would be the same size.

I also would be happy if the result would be something I can understand :-)
So, if you want to limit the use of atomics to int32_t and uint64_t, why not define atomic types: _atomic_32 and _atomic_64 and a corresponding set of functions. Any user of atomics should already when declaring a counter or whatever which should be atomic should use the atomic types and not just some type which would be the same size.

I think this patch addresses most of the concerns, thoughts?

0001-Review-latest-refactoring-of-atomics.patch

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to review (was: ack)

2 years ago

I also would be happy if the result would be something I can understand :-)
So, if you want to limit the use of atomics to int32_t and uint64_t, why not define atomic types: _atomic_32 and _atomic_64 and a corresponding set of functions. Any user of atomics should already when declaring a counter or whatever which should be atomic should use the atomic types and not just some type which would be the same size.

Because this is C, and there is no such thing as type safety. :)

I think this patch addresses most of the concerns, thoughts?

Yes, I'm happy with this in this form :) Thanks for taking my comments on board.

Metadata Update from @firstyear:
- Custom field reviewstatus adjusted to ack (was: review)

2 years ago

1d158cd..93a2958 master -> master

4edf9f1..e22e008 389-ds-base-1.3.7 -> 389-ds-base-1.3.7

Metadata Update from @firstyear:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

2 years ago

Login to comment on this ticket.

Metadata