#1699 ldap_chpass_uri failover fails on using same hostname
Closed: Fixed. Opened 6 years ago by jhrozek.

https://bugzilla.redhat.com/show_bug.cgi?id=884600 (Red Hat Enterprise Linux 6)

Description of problem:
ldap_chpass_uri failover fails on using same hostname

Version-Release number of selected component (if applicable):
sssd-1.9.2-30.el6

How reproducible:
Always

Steps to Reproduce:
1. sssd.conf domain section has:
ldap_uri =
ldap://ldapserver.example.com:12345,ldap://ldapserver.example.com:389
ldap_chpass_uri =
ldap://ldapserver.example.com:12345,ldap://ldapserver.example.com:389

2. Try to change the password of a user
# ssh -l puser1 localhost
puser1@localhost's password:
Last login: Thu Dec  6 16:11:03 2012 from localhost
-sh-4.1$ passwd
Changing password for user puser1.
Current Password:
passwd: Authentication token manipulation error
-sh-4.1$


Actual results:
Password change fails.
Log shows:
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sdap_pam_chpass_handler] (0x0040):
starting password change request for user [puser1].
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [fo_resolve_service_send] (0x0100):
Trying to resolve service 'LDAP_CHPASS'
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [get_server_status] (0x1000):
Status of server 'ldapserver.example.com' is 'working'
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [get_port_status] (0x1000): Port
status of port 12345 for server 'ldapserver.example.com' is 'neutral'
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]]
[fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 10
seconds
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [get_server_status] (0x1000):
Status of server 'ldapserver.example.com' is 'working'
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [be_resolve_server_process]
(0x1000): Saving the first resolved server
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [be_resolve_server_process]
(0x0200): Found address for server ldapserver.example.com: [192.168.122.13] TTL
604800
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sdap_uri_callback] (0x0400):
Constructed uri 'ldap://ldapserver.example.com:12345'
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sss_ldap_init_send] (0x4000):
Using file descriptor [22] for LDAP connection.
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sss_ldap_init_send] (0x0400):
Setting 6 seconds timeout for connecting
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sdap_async_sys_connect_done]
(0x0020): connect failed [111][Connection refused].
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sss_ldap_init_sys_connect_done]
(0x0020): sdap_async_sys_connect request failed.
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sdap_sys_connect_done] (0x0020):
sdap_async_connect_call request failed.
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sdap_handle_release] (0x2000):
Trace: sh[0x13f2a30], connected[0], ops[(nil)], ldap[(nil)],
destructor_lock[0], release_memory[0]
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [fo_set_port_status] (0x0100):
Marking port 12345 of server 'ldapserver.example.com' as 'not working'
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [be_pam_handler_callback] (0x0100):
Backend returned: (3, 4, <NULL>) [Internal Error (System error)]


Expected results:
Password change should work.

Additional info:
Works fine with different hostnames:
ldap_chpass_uri = ldap://invalidsrv.example.com,ldap://ldapserver.example.com

Might be the same as #1680.

blockedby: =>
blocking: =>
coverity: =>
design: =>
design_review: => 0
feature_milestone: =>
fedora_test_page: =>
milestone: NEEDS_TRIAGE => SSSD 1.9.4
testsupdated: => 0

Fields changed

owner: somebody => pbrezina
status: new => assigned

Please check if this has the same cause as #1680.

I'm going to say yes. I don't have a solution so far though.

OK. Found it.

The problem is here:

static void auth_connect_done(struct tevent_req *subreq)
{
    struct tevent_req *req = tevent_req_callback_data(subreq,
                                                      struct tevent_req);
    struct auth_state *state = tevent_req_data(req,
                                                    struct auth_state);
    int ret;

    ret = sdap_connect_recv(subreq, state, &state->sh);
    talloc_zfree(subreq);
    if (ret) {
        if (state->srv) {
            /* mark this server as bad if connection failed */
            be_fo_set_port_status(state->ctx->be,
                                  state->sdap_service->name,
                                  state->srv, PORT_NOT_WORKING);
        }
        if (ret == ETIMEDOUT) {
            if (auth_get_server(req) == NULL) {
                tevent_req_error(req, ENOMEM);
            }
            return;
        }

        tevent_req_error(req, ret);
        return;
    }
...

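auth_get_server() starts resolving the next server.
Does anyone know if there is a specific reason why we continue with the next server only when the error is ETIMEDOUT? In the log above the connect fails with [111][Connection refused], so the port is marked as 'not working' but the request ends with an error instead of trying the second URI.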
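
For illustration only, here is a minimal standalone sketch of the behaviour the report expects: any connection error, not just ETIMEDOUT, marks the current port as not working and moves on to the next URI. This is not the SSSD code or the actual patch. The names `struct server`, `connect_with_failover` and `fake_connect` are hypothetical, and the enum only loosely mirrors the port states seen in the log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the failover types; not the real SSSD API. */
enum port_status { PORT_NEUTRAL, PORT_WORKING, PORT_NOT_WORKING };

struct server {
    const char *host;
    int port;
    enum port_status status;
};

/* Caller-supplied connect function: returns 0 on success or an errno value. */
typedef int (*connect_fn)(const struct server *srv);

/*
 * Try each server entry in order. Any connection error (not only
 * ETIMEDOUT) marks that port as not working and continues with the
 * next entry. Returns 0 and sets *chosen on success, or the last
 * error seen if every entry fails.
 */
int connect_with_failover(struct server *servers, size_t n,
                          connect_fn try_connect, struct server **chosen)
{
    int last_err = ENOENT; /* returned when there is nothing to try */

    assert(servers != NULL && try_connect != NULL && chosen != NULL);

    *chosen = NULL;
    for (size_t i = 0; i < n; i++) {
        if (servers[i].status == PORT_NOT_WORKING) {
            continue; /* already known to be bad, skip it */
        }

        int ret = try_connect(&servers[i]);
        if (ret == 0) {
            servers[i].status = PORT_WORKING;
            *chosen = &servers[i];
            return 0;
        }

        /* Same hostname, different port: only this port is marked bad. */
        servers[i].status = PORT_NOT_WORKING;
        last_err = ret;
    }

    return last_err;
}

/* Simulated connect: only port 389 accepts connections, as in the report. */
int fake_connect(const struct server *srv)
{
    return srv->port == 389 ? 0 : ECONNREFUSED;
}
```

With two entries for ldapserver.example.com (ports 12345 and 389, both initially neutral), calling `connect_with_failover(servers, 2, fake_connect, &chosen)` in this sketch refuses the first port, marks it as not working, and then succeeds on port 389.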

Fields changed

patch: 0 => 1

It is not the same problem as #1680 after all.

Fields changed

resolution: => fixed
status: assigned => closed

Metadata Update from @jhrozek:
- Issue assigned to pbrezina
- Issue set to the milestone: SSSD 1.9.4

2 years ago
