#2761 AD provider offline when trusted domain not reachable
Closed: Duplicate. Opened 8 years ago by vokac.

Our AD has a trust with a second AD, but posixAccount objects exist only in our first AD. When we use the "ad" provider, all trusted subdomains are detected automatically even though only the first AD is declared in sssd.conf. Because we have no real accounts in the second AD, we provide only credentials (a keytab) that work against the first AD. Everything works until an LDAP query that the first AD cannot answer (e.g. a lookup of an unknown uidNumber): in that situation sssd tries to contact the second (trusted) AD and fails because it has no valid credentials for it. Unfortunately this has an ugly side effect: the whole "ad" provider is marked offline (for a minute) and no new queries are answered, even though the first AD is fine and could answer them.
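
To make the setup concrete, here is a minimal sssd.conf sketch of this kind of layout. The domain names, keytab path, and the commented-out mitigation options are illustrative assumptions, not taken from the reporter's files; which mitigations are available depends on the SSSD version (subdomains_provider = none disables subdomain discovery entirely, and newer releases also offer ad_enabled_domains).

# Illustrative sketch only -- names and paths are assumptions, not copied
# from the reporter's environment.
[sssd]
services = nss, pam
domains = example.com

[domain/example.com]
id_provider = ad
ad_domain = example.com
# keytab with credentials valid only for the first AD
krb5_keytab = /etc/krb5.keytab

# Possible mitigations (version dependent) if lookups in the trusted
# forest are not needed at all:
#subdomains_provider = none
#ad_enabled_domains = example.com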

(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [sasl_bind_send] (0x0100): Executing sasl bind mech: gssapi, user: host/client.example.com
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [sasl_bind_send] (0x0020): ldap_sasl_bind failed (-2)[Local error]
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [sasl_bind_send] (0x0080): Extended failure message: [SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server not found in Kerberos database)]
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [fo_set_port_status] (0x0100): Marking port 389 of server 'dc1.example.org' as 'not working'
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [ad_user_data_cmp] (0x1000): Comparing LDAP with LDAP
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [fo_set_port_status] (0x0400): Marking port 389 of duplicate server 'dc1.example.org' as 'not working'
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [ad_user_data_cmp] (0x1000): Comparing LDAP with LDAP
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [sdap_handle_release] (0x2000): Trace: sh[0xbab700], connected[1], ops[(nil)], ldap[0xb9c6e0], destructor_lock[0], release_memory[0]
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [remove_connection_callback] (0x4000): Successfully removed connection callback.
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [be_mark_offline] (0x2000): Going offline!
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [be_mark_offline] (0x2000): Initialize check_if_online_ptask.
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [be_ptask_create] (0x0400): Periodic task [Check if online (periodic)] was created
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [be_ptask_schedule] (0x0400): Task [Check if online (periodic)]: scheduling task 81 seconds from now [1439901283]
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [be_run_offline_cb] (0x0080): Going offline. Running callbacks.
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [sdap_id_op_connect_done] (0x4000): notify offline to op #1
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [acctinfo_callback] (0x0100): Request processed. Returned 1,11,Offline
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [sdap_id_release_conn_data] (0x4000): releasing unused connection
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [sbus_dispatch] (0x4000): dbus conn: 0xb99770
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [sbus_dispatch] (0x4000): Dispatching.
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [sbus_message_handler] (0x4000): Received SBUS method [getAccountInfo]
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [sbus_get_sender_id_send] (0x2000): Not a sysbus message, quit
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [sbus_handler_got_caller_id] (0x4000): Received SBUS method [getAccountInfo]
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [be_get_account_info] (0x0200): Got request for [0x1001][1][idnumber=123456]
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [be_get_account_info] (0x0100): Request processed. Returned 1,11,Fast reply - offline
(Tue Aug 18 14:33:22 2015) [sssd[be[example.com]]] [be_req_set_domain] (0x0400): Changing request domain from [example.com] to [example.com]



Breakpoint 1, be_mark_offline (ctx=0x24a0070) at src/providers/data_provider_be.c:482
482 {
(gdb) bt
#0  be_mark_offline (ctx=0x24a0070) at src/providers/data_provider_be.c:482
#1  0x00007fb92a8f1225 in sdap_id_op_connect_done (subreq=0x0) at src/providers/ldap/sdap_id_op.c:613
#2  0x0000003988004bde in tevent_req_finish (req=<value optimized out>, error=<value optimized out>, location=<value optimized out>) at ../tevent_req.c:110
#3  _tevent_req_error (req=<value optimized out>, error=<value optimized out>, location=<value optimized out>) at ../tevent_req.c:128
#4  0x0000003988004bde in tevent_req_finish (req=<value optimized out>, error=<value optimized out>, location=<value optimized out>) at ../tevent_req.c:110
#5  _tevent_req_error (req=<value optimized out>, error=<value optimized out>, location=<value optimized out>) at ../tevent_req.c:128
#6  0x00007fb92a8e75c4 in sdap_auth_done (subreq=0x4608e10) at src/providers/ldap/sdap_async_connection.c:1360
#7  0x00000039880040c8 in tevent_common_loop_immediate (ev=0x2499bb0) at ../tevent_immediate.c:135
#8  0x0000003988008caf in epoll_event_loop_once (ev=0x2499bb0, location=<value optimized out>) at ../tevent_epoll.c:911
#9  0x00000039880072e6 in std_event_loop_once (ev=0x2499bb0, location=0x3989453e20 "src/util/server.c:668") at ../tevent_standard.c:112
#10 0x000000398800349d in _tevent_loop_once (ev=0x2499bb0, location=0x3989453e20 "src/util/server.c:668") at ../tevent.c:530
#11 0x000000398800351b in tevent_common_loop_wait (ev=0x2499bb0, location=0x3989453e20 "src/util/server.c:668") at ../tevent.c:634
#12 0x0000003988007256 in std_event_loop_wait (ev=0x2499bb0, location=0x3989453e20 "src/util/server.c:668") at ../tevent_standard.c:138
#13 0x000000398943b8d3 in server_loop (main_ctx=0x249af20) at src/util/server.c:668
#14 0x000000000040b346 in main (argc=8, argv=<value optimized out>) at src/providers/data_provider_be.c:2916
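
The backtrace above shows the whole backend being taken offline from be_mark_offline() after a single failed GSSAPI bind against the trusted domain's DC. The following toy C program is only an illustration of the behavioural difference being asked about; none of the structures or functions below come from the SSSD source tree. It simply contrasts a single backend-wide offline flag with per-domain offline state, where a failed bind to the trusted domain would not block lookups that the primary domain could still answer.

/*
 * Toy model only -- hypothetical names, not SSSD code.
 */
#include <stdbool.h>
#include <stdio.h>

struct toy_domain {
    const char *name;
    bool offline;              /* per-domain state */
};

struct toy_backend {
    bool offline;              /* backend-wide state */
    struct toy_domain domains[2];
};

/* Backend-wide behaviour: one failed domain takes everything offline. */
static void mark_offline_backend_wide(struct toy_backend *be)
{
    be->offline = true;
}

/* Per-domain behaviour: only the failed domain is marked offline. */
static void mark_offline_per_domain(struct toy_domain *dom)
{
    dom->offline = true;
}

static bool can_answer(const struct toy_backend *be, int dom_idx)
{
    return !be->offline && !be->domains[dom_idx].offline;
}

int main(void)
{
    struct toy_backend be = {
        .offline = false,
        .domains = {
            { .name = "example.com", .offline = false },  /* first AD   */
            { .name = "example.org", .offline = false },  /* trusted AD */
        },
    };

    /* GSSAPI bind against the trusted domain's DC fails (no valid keytab). */
    mark_offline_per_domain(&be.domains[1]);
    printf("per-domain:   lookup in %s possible: %s\n",
           be.domains[0].name, can_answer(&be, 0) ? "yes" : "no");

    /* What the reporter observes instead: the whole backend goes offline. */
    mark_offline_backend_wide(&be);
    printf("backend-wide: lookup in %s possible: %s\n",
           be.domains[0].name, can_answer(&be, 0) ? "yes" : "no");

    return 0;
}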

Yes, it looks like the described design should fix the issues with our configuration.

OK, thanks for confirming; marking as a duplicate.

resolution: => duplicate
status: new => closed

Metadata Update from @vokac:
- Issue set to the milestone: NEEDS_TRIAGE

7 years ago

SSSD is moving from Pagure to GitHub. This means that new issues and pull requests
will be accepted only in SSSD's GitHub repository.

This issue has been cloned to GitHub and is available here:
- https://github.com/SSSD/sssd/issues/3802

If you want to receive further updates on the issue, please navigate to the GitHub
issue and click on the Subscribe button.

Thank you for understanding. We apologize for any inconvenience.
