#48338 SimplePagedResults -- abandon could happen between the abandon check and sending results
Closed: Fixed None Opened 4 years ago by nhosoi.

Bug description:
Problematic code in op_shared_search (opshared.c):
703 pr_search_result = pagedresults_get_search_result(pb->pb_conn, operation, pr_idx);
704 if (pr_search_result) {
705 if (pagedresults_is_abandoned_or_notavailable(pb->pb_conn, pr_idx)) {
706 pagedresults_unlock(pb->pb_conn, pr_idx);
707 / Previous operation was abandoned and the simplepaged object is not in use. /
708 send_ldap_result(pb, 0, NULL, "Simple Paged Results Search abandoned", 0, NULL);
709 rc = LDAP_SUCCESS;
710 goto free_and_return;
711 } else {
712 slapi_pblock_set( pb, SLAPI_SEARCH_RESULT_SET, pr_search_result );
713 rc = send_results_ext (pb, 1, &pnentries, pagesize, &pr_stat);
714
715 / search result could be reset in the backend/dse /
716 slapi_pblock_get(pb, SLAPI_SEARCH_RESULT_SET, &sr);
717 pagedresults_set_search_result(pb->pb, operation, sr, 0, pr_idx);
718 }

The crash occurs in send_results_ext at line 713.
#5 0x00007f5e7711b6a7 in send_results_ext (pb=0x1402660, nentries=0x7f5e4bdef384, pagesize=1000,
pr_stat=0x7f5e4bdef378, send_result=1) at ldap/servers/slapd/opshared.c:1706
#6 0x00007f5e7711c3c0 in op_shared_search (pb=0x1402660, send_result=1) at ldap/servers/slapd/opshared.c:713

where the pr_search_result_set in the simple paged results handle in the connection is already freed (pr_serearch_result_set == 0x0) as well as pr_flags has CONN_FLAG_PAGEDRESULTS_ABANDONE (== 512) bit as follows:
(gdb) p *pb->pb_conn->c_pagedresults->prl_list
$8 = {pr_current_be = 0xe8b460, pr_search_result_set = 0x0, pr_search_result_count = 1000,
pr_search_result_set_size_estimate = 0, pr_sort_result_code = 0, pr_timelimit = 0, pr_flags = 640,
pr_msgid = 5, pr_mutex = 0x7f5db4b12aa0}

but pr_search_result retrieved at line 703 has non-NULL value:
(gdb) p pr_search_result
$9 = (void *) 0x7f5db5e222a0

Also, pagedresults_is_abandoned_or_notavailable does not return CONN_FLAG_PAGEDRESULTS_ABANDONE was set.

That being said, the abandon happened between the line 705 and 713.

Fix description:
The code (from line703 through 717) has to be protected.


Reviewed by Rich (Thank you!!)

Pushed to master:
83c98d9..390b8bd master -> master
commit 390b8bd

Pushed to 389-ds-base-1.3.4:
55434d3..8f49d33 389-ds-base-1.3.4 -> 389-ds-base-1.3.4
commit 8f49d33

Pushed to 389-ds-base-1.3.3:
a4db5ec..a7a8b2d 389-ds-base-1.3.3 -> 389-ds-base-1.3.3
commit a7a8b2d

Pushed to 389-ds-base-1.2.11:
24a38f2..d5ccc06 389-ds-base-1.2.11 -> 389-ds-base-1.2.11
commit d5ccc06

git patch file (master) -- fixing a self deadlock introduced by the previous patch 0001-Ticket-48338-SimplePagedResults-abandon-could-happen.patch
0001-Ticket-48338-SimplePagedResults-abandon-could-happen.2.patch

git patch file (master) -- fixing a self deadlock introduced by the previous patch 0001-Ticket-48338-SimplePagedResults-abandon-could-happen.patch (take 2)
0001-Ticket-48338-SimplePagedResults-abandon-could-happen.3.patch

git patch file (master) -- fixing a self deadlock introduced by the previous patch 0001-Ticket-48338-SimplePagedResults?-abandon-could-happen.patch (first phase of take 2)
0001-Ticket-48338-SimplePagedResults-abandon-could-happen.4.patch

Looks good, one minor indentation issue:

ldap/servers/slapd/opshared.c

{{{
diff --git a/ldap/servers/slapd/opshared.c b/ldap/servers/slapd/opshared.c
index 8a14431..9f8773f 100644
680 slapi_pblock_set(pb, SLAPI_CONN_MUTEX_LOCKED, &locked);
681 locked = 0; / for setting unlocked /
}}}

Nathan and Mark, thanks for reviewing the patch [1].

I reworded the last part of the comment as follows based upon the Nathan's review.
{{{
Removing PR_Lock(c_mutex) from slapi_pblock_get(SLAPI_CONN_CLIENTNETADDR) since
acquiring the lock is not necessary for the atomic reads. This change solves
the self deadlock.
}}}
[1] 0001-Ticket-48338-SimplePagedResults-abandon-could-happen.4.patch​.

Reviewed by Nathan and Mark (Thank you!!)

Pushed to master:
341c3c1..79ca67d master -> master
commit 79ca67d

Pushed to 389-ds-base-1.3.4:
c0d35f7..36245ab 389-ds-base-1.3.4 -> 389-ds-base-1.3.4
commit 36245ab

Pushed to 389-ds-base-1.3.3:
7fca878..1d00345 389-ds-base-1.3.3 -> 389-ds-base-1.3.3
commit 1d00345

Pushed to 389-ds-base-1.2.11:
d5ccc06..0e2a753 389-ds-base-1.2.11 -> 389-ds-base-1.2.11
commit 0e2a753

Note - we are seeing a self deadlock on conn mutex for getting the CONN_DN. Attaching the stack trace showing the self deadlock

Could this change be safe enough to make?
{{{
diff --git a/ldap/servers/slapd/pblock.c b/ldap/servers/slapd/pblock.c
index f2017be..dcb4b31 100644
--- a/ldap/servers/slapd/pblock.c
+++ b/ldap/servers/slapd/pblock.c
@@ -164,6 +164,7 @@ slapi_pblock_get( Slapi_PBlock pblock, int arg, void value )
((PRUint64 )value) = pblock->pb_conn->c_connid;
break;
case SLAPI_CONN_DN:
+ char cdn;
/

* NOTE: we have to make a copy of this that the caller
* is responsible for freeing. otherwise, they would get
@@ -174,10 +175,9 @@ slapi_pblock_get( Slapi_PBlock pblock, int arg, void value )
"Connection is NULL and hence cannot access SLAPI_CONN_DN \n", 0, 0, 0 );
return (-1);
}
- PR_Lock( pblock->pb_conn->c_mutex );
- (*(char )value) = (NULL == pblock->pb_conn->c_dn ? NULL :
- slapi_ch_strdup( pblock->pb_conn->c_dn ));
- PR_Unlock( pblock->pb_conn->c_mutex );
+ / For fields with atomic access, remove the PR_Lock(c_mutex) /
+ cdn = pblock->pb_conn->c_dn;
+ (*(char
)value) = (NULL == cdn ? NULL : slapi_ch_strdup(cdn));
break;
}}}
I'm also worried about other fields in pb_conn such as c_authtype, cin_addr, cin_destaddr, etc...

git patch file (master) -- Revert 0001-Ticket-48338-SimplePagedResults-abandon-could-happen.4.patch
0001-Revert-Ticket-48338-SimplePagedResults-abandon-could.patch

Opened a new ticket for the self deadlock.
Ticket #48406 - Avoid self deadlock by PR_Lock(conn->c_mutex)

Reverted 0001-Ticket-48338-SimplePagedResults?-abandon-could-happen.4.patch

Pushed to master:
dd90d19..f25f804 master -> master
commit 1818478

Pushed to 389-ds-base-1.3.4:
b31b676..84da7d0 389-ds-base-1.3.4 -> 389-ds-base-1.3.4
commit c8b1817

Pushed to 389-ds-base-1.3.3:
e62ef7f..ad08d1f 389-ds-base-1.3.3 -> 389-ds-base-1.3.3
commit 61c3ff6

Pushed to 389-ds-base-1.2.11:
44413ce..f69d19a 389-ds-base-1.2.11 -> 389-ds-base-1.2.11
commit bab6ec6

Metadata Update from @nhosoi:
- Issue set to the milestone: 1.2.11.33

2 years ago

Login to comment on this ticket.

Metadata