Some internal operations may create pblock without setting the pb_conn. https://pagure.io/389-ds-base/issue/49529 returns a failure when pb_conn or pb_op is not set. (see related BZ)
master and RHEL 7.5 389-ds-base-1.3.7.5-12.el7
TBD
Metadata Update from @tbordaz: - Custom field component adjusted to None - Custom field origin adjusted to None - Custom field reviewstatus adjusted to None - Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1534462 - Custom field type adjusted to None - Custom field version adjusted to None
Metadata Update from @tbordaz: - Issue set to the milestone: 1.3.7 backlog
Preliminary patch ..
<img alt="0001-Ticket-49534-Some-coverity-fix-may-prevent-internal-.patch" src="/389-ds-base/issue/raw/files/4279c26c531b2bf8a5898b16d0c38b77c186e54ad7e6a4e454ccbe6508d0d103-0001-Ticket-49534-Some-coverity-fix-may-prevent-internal-.patch" />
Patch looks good, but...
I have been fixing other coverity issues (patches in the works), and we dereference pb_conn in many other places. This is definitely a can of worms, and I can not believe IPA has not hit these dereferences/crashes before.
Can other operation types (add/mods/delete,etc) issued by IPA have a NULL connection? What about "slapi operations", can they be NULL too?
I think there were no crashes or illegal dereferences. There was either
if (pb_conn && !pb_conn->c_needpw &&
or the derference for logging was in !internal_op
I think there were no crashes or illegal dereferences. There was either if (pb_conn && !pb_conn->c_needpw && or the derference for logging was in !internal_op
I think there were no crashes or illegal dereferences. There was either if (pb_conn && !pb_conn->c_needpw &&
It was in the logging, but there are other places in the code where we also dereference pb_conn. So I need to rework those changes and I want to understand how/where pb_conn is correctly NULL. When it is an error and when is it okay? It sounds like it always acceptable for it to be NULL, but what about Slapi Operations, can they also be legitimately NULL?
Actually there are places where pb_conn can not be NULL. Like in a worker thread it definitely gets dereferenced in several places. Like I said I'd love to see the stack trace that leads to the IPA issue so I can see how/where the NULL connection is coming from (as it's not directly coming from a worker thread).
I just want to fix it right, and not play wack-a-mole.
I think it is an internal operation - and there pb_conn is not set
@mreynolds, I came thru the all changes in coverity patch (https://pagure.io/389-ds-base/issue/49529 ) and except op_shared_allow_pw_change change that returns if there is no connection, others changes are good to me
Yeah I just found it in modify_internal_pb()
Great!
I do have more Coventry fixes on the way so I will double check my work for itnernal ops. If you don't mind, I want to hijack your changes and include them in my new coverity patch.
I think the one in connection_threadmain() could alos be too strong
we do slapi_pblock_get (... pb_conn), and check it. but we then go to a switch, where only in one case we need the pb_conn. I think the pblock_get and check should be don in the "case WORK_TO_DO"
@mreynolds, I came thru the all changes in coverity patch (https://pagure.io/389-ds-base/issue/49529 ) and except op_shared_allow_pw_change change that returns if there is no connection, others changes are good to me I think the one in connection_threadmain() could alos be too strong we do slapi_pblock_get (... pb_conn), and check it. but we then go to a switch, where only in one case we need the pb_conn. I think the pblock_get and check should be don in the "case WORK_TO_DO"
I think the one in connection_threadmain() could alos be too strong we do slapi_pblock_get (... pb_conn), and check it. but we then go to a switch, where only in one case we need the pb_conn. I think the pblock_get and check should be don in the "case WORK_TO_DO"
Agreed, I moved it to that switch case.
@lkrispen I agreed that it can be only tested in the appropriate branch of the switch. However considering we are in connection_threadmain shouldn't we systematically have a connection ?
@mreynolds, please feel free to merge the patch in an other ticket. I am trying to create a testcase to reproduce the failure.
But in the reported bug we have [15/Jan/2018:04:18:56.693046487 -0500] - ERR - connection_threadmain - pb_conn is NULL
which is the new check.
At least at shutdown connection_wait_for_new work returns and no conn is set
Yes you are right. Thanks
@mreynolds Please don't use bare int here, use one of the _t types.
Thanks :)
@mreynolds Please don't use bare int here, use one of the _t types. Thanks :)
Where?
I'm about to submit a very large coverity patch so I want to get this in.
Do you mean for needpw? Sure I change it. It's just a flag, we use int for this purpose all over the code base. Are you just trying to "eventually" convert everything over to _t types?
Absolutely I am :)
Works for me ;-)
Ran CI test without any regressions with this patch
<img alt="0001-Ticket-49534-Fix-coverity-issues-and-regression.patch" src="/389-ds-base/issue/raw/files/a650607e3b01a75bb35f39306f09da552be686c155192d08b5dfce645703762d-0001-Ticket-49534-Fix-coverity-issues-and-regression.patch" />
Metadata Update from @mreynolds: - Custom field reviewstatus adjusted to review (was: None)
I see a "long" and an "int" in this patch again ;)
For the most part happy with this I think, but it's the kind of change that CI will help us find more issues :)
I see a "long" and an "int" in this patch again ;) For the most part happy with this I think, but it's the kind of change that CI will help us find more issues :)
Sure I'll change those - sorry i didn't realize I added the other ints/longs. Ack, if I change them?
Yes, ack from me if you do that :)
Thank you so much!
if we want to use correct types, needpw is neither int nor size_t, it is boolean
PRBool is probably a type alias underneath, but if it's PRBool, we should stick to than too.
Just for recording, I failed to create a testcase independent of ipa-pwd-extop. Indeed password extop (1.3.6.1.4.1.4203.1.11.1) implemented in 389-ds does initialize pb_conn when running the internal modification (passwd_apply_mods), so it does not hit this issue.
05907ae..7658232 master -> master
05d5c52..a4ca3df 389-ds-base-1.3.7 -> 389-ds-base-1.3.7
Metadata Update from @mreynolds: - Issue close_status updated to: fixed - Issue status updated to: Closed (was: Open)
This is also introduced a regression in automember plugin. A free() was in the wrong spot. One line fix....
<img alt="0001-Ticket-49534-Fix-coverity-regression.patch" src="/389-ds-base/issue/raw/files/c3bdce765422ffc08741dfd3268fdec6156722a247709120723b2f58ffea9b6a-0001-Ticket-49534-Fix-coverity-regression.patch" />
f8cda081..b3768e6 master -> master
2cabb08..b4fbcf0 389-ds-base-1.3.7 -> 389-ds-base-1.3.7
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/2593
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
subscribe
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: fixed)
Login to comment on this ticket.