In the update to NS 0.2.0 the API has changed slightly. We should fix the current NS integration so it can consume NS 0.2.0 and continue to support users who have enabled it.
attachment 0001-Ticket-48996-update-DS-for-ns-0.2.0.patch
Tested with DS and ldclt. Results are:
{{{
Without NS:
ldclt[14959]: Average rate: 567.57/thr (1702.70/sec), total: 17027
ldclt[14959]: Average rate: 413.67/thr (1241.00/sec), total: 12410
ldclt[14959]: Average rate: 116.70/thr ( 350.10/sec), total:  3501
ldclt[14959]: Average rate: 104.90/thr ( 314.70/sec), total:  3147
ldclt[14959]: Average rate: 116.73/thr ( 350.20/sec), total:  3502
ldclt[14959]: Average rate: 171.60/thr ( 514.80/sec), total:  5148
ldclt[14959]: Average rate: 305.03/thr ( 915.10/sec), total:  9151
ldclt[14959]: Average rate: 391.33/thr (1174.00/sec), total: 11740
ldclt[14959]: Average rate: 227.07/thr ( 681.20/sec), total:  6812
ldclt[14959]: Average rate: 190.77/thr ( 572.30/sec), total:  5723
ldclt[14959]: Number of samples achieved. Bye-bye...
ldclt[14959]: All threads are dead - exit.
ldclt[14959]: Global average rate: 2605.37/thr (781.61/sec), total: 78161

With NS 0.2.0:
ldclt[15114]: Average rate: 720.77/thr (2162.30/sec), total: 21623
ldclt[15114]: Average rate: 309.23/thr ( 927.70/sec), total:  9277
ldclt[15114]: Average rate: 138.47/thr ( 415.40/sec), total:  4154
ldclt[15114]: Average rate: 110.50/thr ( 331.50/sec), total:  3315
ldclt[15114]: Average rate:  98.77/thr ( 296.30/sec), total:  2963
ldclt[15114]: Average rate: 103.03/thr ( 309.10/sec), total:  3091
ldclt[15114]: Average rate: 115.67/thr ( 347.00/sec), total:  3470
ldclt[15114]: Average rate: 282.23/thr ( 846.70/sec), total:  8467
ldclt[15114]: Average rate: 552.80/thr (1658.40/sec), total: 16584
ldclt[15114]: Average rate: 311.60/thr ( 934.80/sec), total:  9348
ldclt[15114]: Number of samples achieved. Bye-bye...
ldclt[15114]: All threads are dead - exit.
ldclt[15114]: Global average rate: 2743.07/thr (822.92/sec), total: 82292
}}}
A second patch will be added here too, for the creation of slapi_ch_memalign, to allocate aligned memory.
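For context, a minimal sketch of what such an aligned allocator could look like, assuming posix_memalign semantics; the actual slapi_ch_memalign signature and behaviour come from the patch, not from this sketch.

{{{
#include <stdlib.h>

/* Hypothetical sketch, not the patch's implementation: return `size` bytes
 * aligned to `alignment`, where alignment is a power of two and a multiple
 * of sizeof(void *), mirroring the posix_memalign contract. */
static char *
memalign_sketch(size_t size, size_t alignment)
{
    void *ptr = NULL;
    if (posix_memalign(&ptr, alignment, size) != 0) {
        return NULL; /* invalid alignment or out of memory */
    }
    return (char *)ptr;
}
}}}

In this sketch the caller would release the block with free(); the real function presumably pairs with the existing slapi_ch_ allocator family.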
Don't we keep the option to run without NS?
Also, if you discontinue an API, shouldn't you clean up all of the underlying implementation? For example, libglobs.c: config_get_maxdescriptors(void).
Update to add the memalign functions. 0001-Ticket-48996-update-DS-for-ns-0.2.0.2.patch
Second required patch that removes the ns_job_modify calls. 0001-Ticket-48996-update-DS-for-ns-0.2.0-remove-ns_job_mo.patch
{{{
ldclt[10895]: Average rate: 728.37/thr (2185.10/sec), total: 21851
ldclt[10895]: Average rate: 338.57/thr (1015.70/sec), total: 10157
ldclt[10895]: Average rate: 232.60/thr ( 697.80/sec), total:  6978
ldclt[10895]: Average rate: 171.37/thr ( 514.10/sec), total:  5141
ldclt[10895]: Average rate: 135.10/thr ( 405.30/sec), total:  4053
ldclt[10895]: Average rate: 151.73/thr ( 455.20/sec), total:  4552
ldclt[10895]: Average rate: 597.80/thr (1793.40/sec), total: 17934
ldclt[10895]: Average rate: 364.37/thr (1093.10/sec), total: 10931
ldclt[10895]: Average rate: 215.43/thr ( 646.30/sec), total:  6463
ldclt[10895]: Average rate: 207.90/thr ( 623.70/sec), total:  6237
ldclt[10895]: Number of samples achieved. Bye-bye...
ldclt[10895]: All threads are dead - exit.
ldclt[10895]: Global average rate: 3143.23/thr (942.97/sec), total: 94297
}}}
Those numbers look good! What exact ldclt command did you use? Number of threads?
In addition to Noriko's comments can you add a small comment to slapi_ch_memalign() stating what it does. We know what it does, but it might throw off others looking at the code.
With 10,000 objects in the DS.
What do you mean?
Ahhh, I didn't realise that this was the only place we used it. I'll have a look.
Okay, I will add the comment to the memalign. Thanks for the review.
Merge the two patches into one. Update based on comments. 0001-Ticket-48996-update-DS-for-ns-0.2.0.3.patch
Updated patch based on comments. Merge the two for easier review.
It looks like config_get_maxdescriptors is used in get_configured_connection_table_size, which is used to set the connection table size when nunc-stans is disabled. So I'm going to leave that alone.
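To illustrate the dependency being described, a simplified, hypothetical sketch; the function bodies here are invented, and only the call relationship is taken from the comment above.

{{{
/* Hypothetical simplification: the connection table size derives from the
 * configured descriptor limit, so the getter cannot be removed while the
 * non-nunc-stans path still exists. */
static int
config_get_maxdescriptors(void)
{
    return 1024; /* stand-in; the real value comes from cn=config */
}

static int
get_configured_connection_table_size(void)
{
    return config_get_maxdescriptors();
}
}}}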
I'm including everyone in the CC list. :) I've had a short conversation with Nathan regarding this topic -- how to switch from the current connection table to the Nunc-Stans implementation.
We used to have configuration switches to support two methods for some improvements, which turned out not to be so useful after all, considering the support burden and the code's readability. So we do not want to do that here. Instead, can we put the new DS code for NS 0.2.0 into one new file (let me call it new_connection.c temporarily) and switch between the old code (connection.c + conntable.c?) and new_connection.c with a configure option?
Until NS 0.2.0 & new_connection.c are fully stabilized, we should keep the current code available. Otherwise, when debugging some other place that is affected by the new code, we could easily get confused...
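To illustrate, a hedged sketch of how such a build-time switch could look at the source level; the function names here are invented, and ENABLE_NUNC_STANS is the define referenced later in this thread.

{{{
#include <stdio.h>

/* Hypothetical illustration of the proposed switch; these are not the
 * actual DS symbols. A configure option would define ENABLE_NUNC_STANS
 * and select which implementation file is compiled in. */
#ifdef ENABLE_NUNC_STANS
static void
handle_new_connection(void)
{
    puts("new_connection.c path (nunc-stans based)");
}
#else
static void
handle_new_connection(void)
{
    puts("connection.c + conntable.c path (current code)");
}
#endif

int
main(void)
{
    handle_new_connection();
    return 0;
}
}}}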
This is just a proposal. Your comments would be greatly appreciated.
I want to clear something up.
This ticket does not remove or alter the behaviour of the connection table. That behaviour stays. For discussion of changes to the connection table and its removal, there is bug #48977. I completely agree that when we are working on that ticket we should do as you suggest, and make it in a new .c file in parallel. I think that at the time it is complete, we should take away the nsslapd-enable-nunc-stans: on setting, because "today" we have a customer who relies on this setting and its effect on accept().
Now let's look at the current change.
Currently, nunc-stans 0.1.8 is broken IMO. It's not reliable and has a lot of stability flaws. The work to 0.2.0 has all been focused on correctness and stability. But that change has involved changing the API of nunc-stans. This is why this ticket is altering the current nunc-stans integration code to make it possible to use it as a stop gap until we implement #48977.
In summary:
#48996 (this ticket) - use 0.2.0 with the current nunc-stans accept() and connection table code; keep the status quo.
#48977 - use 0.2.0 as the thread workers, and remove the connection table. When this is complete, we would #ifdef out the current connection code with a compile flag.
Does that seem like a reasonable explanation and path forwards with this?
Ah, ok. I thought you were trying to get rid of, e.g., config_get_maxdescriptors, which is not needed for NS 0.2.0 any more. But sorry, you are right. You are only removing code inside #ifdef ENABLE_NUNC_STANS.
Since it is really hard to review with lots of indentation changes, I'm attaching a diff without them and reviewing based upon that file...
Made from 0001-Ticket-48996-update-DS-for-ns-0.2.0.3.patch ignoring space changes. diffs.txt
Please fix the indentation...
{{{
ldap/servers/slapd/backend_manager.c
15 + Slapi_PBlock pb;
}}}
I think you don't want to remove this memset.
{{{
ldap/servers/slapd/pblock.c
264 + /* memset( pb, '\0', sizeof(Slapi_PBlock) ); */
}}}
I think you'd better follow the doxygen format? (yes, I know the other slapi_ch_ functions even do not have comments... :)
{{{
ldap/servers/slapd/slapi-plugin.h
276 +/*
277 + * memalign returns an aligned block of memory as a multiple of alignment.
278 + * alignment must be a power of 2. This is not normally needed, but is required
279 + * for memory that works with certain cpu operations. It's basically malloc
280 + * with some extra guarantees.
281 + */
}}}
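For reference, a doxygen-style version of that comment might look like the following sketch; the parameter names are assumed here, not taken from the patch.

{{{
/**
 * Return a block of memory aligned as a multiple of alignment.
 *
 * This is not normally needed, but is required for memory that works
 * with certain cpu operations. It's basically malloc with some extra
 * guarantees.
 *
 * \param size Number of bytes to allocate.
 * \param alignment Required alignment; must be a power of 2.
 * \return Pointer to the aligned block, or NULL on failure.
 */
}}}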
Updates to patch based on Noriko's comments. 0001-Ticket-48996-update-DS-for-ns-0.2.0.4.patch
Just for the ticket, we do want to remove the memset. This is the only place we call pblock_init_common. The memset actually triggers a stack overflow when used. We shouldn't be using memset like this in 2016, as we can allocate the struct off the stack with each loop iteration as empty using:
{{{ Slapi_PBlock pb = {0}; }}}
Which is much cleaner, doesn't cause the overflow, and is a correct / modern way to do this.
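For context, a minimal sketch of the pattern under discussion, using a hypothetical stand-in struct; the real code lives in backend_manager.c and uses Slapi_PBlock.

{{{
/* Hypothetical stand-in for Slapi_PBlock; only the pattern matters here. */
typedef struct {
    int version;
    void *params[8];
} Example_PBlock;

int
main(void)
{
    for (int i = 0; i < 4; i++) {
        /* {0} explicitly zeroes the first member and implicitly zeroes the
         * rest, which both C89 and C99 guarantee. Unlike memset it does not
         * promise zeroed padding bytes, but no code should depend on those. */
        Example_PBlock pb = {0};
        (void)pb; /* ... use pb for this iteration ... */
    }
    return 0;
}
}}}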
Replying to [comment:12 firstyear]:
> Just for the ticket, we do want to remove the memset. This is the only place we call pblock_init_common. The memset actually triggers a stack overflow when used.
Then we should increase the stack size.
> We shouldn't be using memset like this in 2016, as we can allocate the struct off the stack with each loop iteration as empty using: {{{ Slapi_PBlock pb = {0}; }}} Which is much cleaner, doesn't cause the overflow, and is a correct / modern way to do this.
In the past, this wouldn't work. Did C99 change automatic stack variable structure initialization to be able to initialize the entire structure to zeros using a simple {0}, and not {0, 0, ...} for each structure member?
If in both cases we are initializing a pblock structure that is too large on the stack, why does memset fail but variable initialization succeed? Aren't they both accessing past the end of a struct that is too large on the stack?
Replying to [comment:13 rmeggins]:
> Replying to [comment:12 firstyear]:
> > Just for the ticket, we do want to remove the memset. This is the only place we call pblock_init_common. The memset actually triggers a stack overflow when used.
> Then we should increase the stack size.
Stack size doesn't affect it. Through all my investigation, I could never work out what was really going wrong here, and this was the only reliable fix.
> > We shouldn't be using memset like this in 2016, as we can allocate the struct off the stack with each loop iteration as empty using: {{{ Slapi_PBlock pb = {0}; }}} Which is much cleaner, doesn't cause the overflow, and is a correct / modern way to do this.
> In the past, this wouldn't work. Did C99 change automatic stack variable structure initialization to be able to initialize the entire structure to zeros using a simple {0}, and not {0, 0, ...} for each structure member?
I believe so. I'm not seeing any compiler warnings for backend_manager.c.
I tried for ages to solve this. I was disassembling the libraries, running under gdb, everything. It didn't make any sense whatsoever. But asan would detect that we were over-running the struct, causing the overflow, when we called this memset. Curiously, this happens only while nunc-stans is enabled (which led to another slew of NS fixes, but still no dice). What makes it weirder is that the threads aren't even part of the nunc-stans worker pools, so I'm just as lost now as to why it's happening.
So I could spend the next week working it out, or I could just supply a fix that is cleaner, more modern and removes a memset.
Replying to [comment:14 firstyear]:
> Replying to [comment:13 rmeggins]:
> > Replying to [comment:12 firstyear]:
> > > Just for the ticket, we do want to remove the memset. This is the only place we call pblock_init_common. The memset actually triggers a stack overflow when used.
> > Then we should increase the stack size.
> Stack size doesn't affect it. Through all my investigation, I could never work out what was really going wrong here, and this was the only reliable fix.
> > > We shouldn't be using memset like this in 2016, as we can allocate the struct off the stack with each loop iteration as empty using: {{{ Slapi_PBlock pb = {0}; }}} Which is much cleaner, doesn't cause the overflow, and is a correct / modern way to do this.
> > In the past, this wouldn't work. Did C99 change automatic stack variable structure initialization to be able to initialize the entire structure to zeros using a simple {0}, and not {0, 0, ...} for each structure member?
> I believe so. I'm not seeing any compiler warnings for backend_manager.c.
Would you see compiler warnings if all of the pblock members were not initialized to 0? At any rate, on Fedora 24, a simple test shows that all structure members are initialized to 0, whether {{{ = {0} }}} or memset is used. It is a trivial example, and perhaps the behavior would change in a large program, but I don't know why that would be the case.
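A minimal version of such a test might look like this; the struct is a hypothetical stand-in, not the real Slapi_PBlock.

{{{
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in struct; only the initialization behaviour matters. */
struct big {
    int a;
    double d;
    void *p;
    char buf[64];
};

int
main(void)
{
    struct big x = {0};       /* brace zero-initialization */
    struct big y;
    memset(&y, 0, sizeof(y)); /* explicit memset */

    /* Every member of both objects reads as zero. */
    printf("x: %d %f %p %d\n", x.a, x.d, (void *)x.p, x.buf[0]);
    printf("y: %d %f %p %d\n", y.a, y.d, (void *)y.p, y.buf[0]);
    return 0;
}
}}}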
> > If in both cases we are initializing a pblock structure that is too large on the stack, why does memset fail but variable initialization succeed? Aren't they both accessing past the end of a struct that is too large on the stack?
> I tried for ages to solve this. I was disassembling the libraries, running under gdb, everything. It didn't make any sense whatsoever. But asan would detect that we were over-running the struct, causing the overflow, when we called this memset.
Then it would seem to have nothing to do with the stack size. It doesn't make sense unless the size of the pblock structure is somehow different between the two contexts.
> Curiously, this happens only while nunc-stans is enabled (which led to another slew of NS fixes, but still no dice). What makes it weirder is that the threads aren't even part of the nunc-stans worker pools, so I'm just as lost now as to why it's happening. So I could spend the next week working it out, or I could just supply a fix that is cleaner, more modern and removes a memset.
Yes, but I am concerned that we are just masking a real bug that will come back to haunt us later.
Replying to [comment:15 rmeggins]:
> > > In the past, this wouldn't work. Did C99 change automatic stack variable structure initialization to be able to initialize the entire structure to zeros using a simple {0}, and not {0, 0, ...} for each structure member?
> > I believe so. I'm not seeing any compiler warnings for backend_manager.c.
> Would you see compiler warnings if all of the pblock members were not initialized to 0? At any rate, on Fedora 24, a simple test shows that all structure members are initialized to 0, whether {{{ = {0} }}} or memset is used. It is a trivial example, and perhaps the behavior would change in a large program, but I don't know why that would be the case.
Glad that you tested it and got the same result as I did. When I see things like this, I'll probably start replacing them over time. If I had more time, I'd probably do an en-masse cleanup of whitespace and things like this, TBH.
> > > If in both cases we are initializing a pblock structure that is too large on the stack, why does memset fail but variable initialization succeed? Aren't they both accessing past the end of a struct that is too large on the stack?
> > I tried for ages to solve this. I was disassembling the libraries, running under gdb, everything. It didn't make any sense whatsoever. But asan would detect that we were over-running the struct, causing the overflow, when we called this memset.
> Then it would seem to have nothing to do with the stack size. It doesn't make sense unless the size of the pblock structure is somehow different between the two contexts.
Which I was testing for, but couldn't work out. It didn't make any sense what was occurring, because in gdb, reading the last field would show an address at 704 bytes with a len of 8 bytes (to make 712), and you could read/write that just fine. It was only the application of the memset that had the issue.
> > Curiously, this happens only while nunc-stans is enabled (which led to another slew of NS fixes, but still no dice). What makes it weirder is that the threads aren't even part of the nunc-stans worker pools, so I'm just as lost now as to why it's happening. So I could spend the next week working it out, or I could just supply a fix that is cleaner, more modern and removes a memset.
> Yes, but I am concerned that we are just masking a real bug that will come back to haunt us later.
It could just as easily be an issue with asan as it could be a real bug. Again, we've satisfied the tools and the test cases, so I think the question here is whether it's worth me investigating. I'm still continually running dynamic analysis tools, so I'll pick up issues pretty fast if they do crop up. I'm trying to be proactive rather than reactive here. I'm not adding "more" issues here, and this change allows us to go from a crash to passing a large load test. If it was likely to cause us "another issue", surely we would see this issue turn into another overflow later, but that's not what I saw.
Okay, undoing the memset change, but the patch stays as is otherwise.
attachment 0001-Ticket-48996-update-DS-for-ns-0.2.0.5.patch
commit ff9dafa
{{{
Compressing objects: 100% (14/14), done.
Writing objects: 100% (14/14), 3.44 KiB | 0 bytes/s, done.
Total 14 (delta 12), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   617b64f..ff9dafa  master -> master
}}}
attachment 0001-Ticket-48996-Fix-rpm-to-work-with-ns-0.2.0.patch
Once your patch is applied, the build started working again. Thanks.
commit e23f2af
{{{
Compressing objects: 100% (23/23), done.
Writing objects: 100% (23/23), 9.31 KiB | 0 bytes/s, done.
Total 23 (delta 20), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   b96806c..5cc301f  master -> master
}}}
Metadata Update from @firstyear:
- Issue assigned to firstyear
- Issue set to the milestone: 1.3.6 backlog
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/2055
If you want to receive further updates on the issue, please navigate to the github issue and click on the subscribe button.
Thank you for understanding. We apologize for any inconvenience.
Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)