#1699 ldap_chpass_uri failover fails on using same hostname
Closed: Fixed None Opened 7 years ago by jhrozek.

https://bugzilla.redhat.com/show_bug.cgi?id=884600 (Red Hat Enterprise Linux 6)

Description of problem:
ldap_chpass_uri failover fails on using same hostname

Version-Release number of selected component (if applicable):
sssd-1.9.2-30.el6

How reproducible:
Always

Steps to Reproduce:
1. sssd.conf domain section has:
ldap_uri = ldap://ldapserver.example.com:12345,ldap://ldapserver.example.com:389
ldap_chpass_uri = ldap://ldapserver.example.com:12345,ldap://ldapserver.example.com:389

2. Try to change the password of a user
# ssh -l puser1 localhost
puser1@localhost's password:
Last login: Thu Dec  6 16:11:03 2012 from localhost
-sh-4.1$ passwd
Changing password for user puser1.
Current Password:
passwd: Authentication token manipulation error
-sh-4.1$


Actual results:
Password change fails.
Log shows:
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sdap_pam_chpass_handler] (0x0040): starting password change request for user [puser1].
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'LDAP_CHPASS'
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [get_server_status] (0x1000): Status of server 'ldapserver.example.com' is 'working'
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [get_port_status] (0x1000): Port status of port 12345 for server 'ldapserver.example.com' is 'neutral'
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set to 10 seconds
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [get_server_status] (0x1000): Status of server 'ldapserver.example.com' is 'working'
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [be_resolve_server_process] (0x1000): Saving the first resolved server
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [be_resolve_server_process] (0x0200): Found address for server ldapserver.example.com: [192.168.122.13] TTL 604800
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sdap_uri_callback] (0x0400): Constructed uri 'ldap://ldapserver.example.com:12345'
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sss_ldap_init_send] (0x4000): Using file descriptor [22] for LDAP connection.
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sss_ldap_init_send] (0x0400): Setting 6 seconds timeout for connecting
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sdap_async_sys_connect_done] (0x0020): connect failed [111][Connection refused].
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sss_ldap_init_sys_connect_done] (0x0020): sdap_async_sys_connect request failed.
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sdap_sys_connect_done] (0x0020): sdap_async_connect_call request failed.
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [sdap_handle_release] (0x2000): Trace: sh[0x13f2a30], connected[0], ops[(nil)], ldap[(nil)], destructor_lock[0], release_memory[0]
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [fo_set_port_status] (0x0100): Marking port 12345 of server 'ldapserver.example.com' as 'not working'
(Thu Dec  6 16:13:46 2012) [sssd[be[LDAP]]] [be_pam_handler_callback] (0x0100): Backend returned: (3, 4, <NULL>) [Internal Error (System error)]


Expected results:
Password change should work.

Additional info:
Works fine with different hostnames:
ldap_chpass_uri = ldap://invalidsrv.example.com,ldap://ldapserver.example.com

Might be the same as #1680.

blockedby: =>
blocking: =>
coverity: =>
design: =>
design_review: => 0
feature_milestone: =>
fedora_test_page: =>
milestone: NEEDS_TRIAGE => SSSD 1.9.4
testsupdated: => 0

Fields changed

owner: somebody => pbrezina
status: new => assigned

Please check if this has the same cause as #1680.

I'm going to say yes. I don't have a solution so far though.

OK. Found it.

The problem is here:

static void auth_connect_done(struct tevent_req *subreq)
{
    struct tevent_req *req = tevent_req_callback_data(subreq,
                                                      struct tevent_req);
    struct auth_state *state = tevent_req_data(req,
                                                    struct auth_state);
    int ret;

    ret = sdap_connect_recv(subreq, state, &state->sh);
    talloc_zfree(subreq);
    if (ret) {
        if (state->srv) {
            /* mark this server as bad if connection failed */
            be_fo_set_port_status(state->ctx->be,
                                  state->sdap_service->name,
                                  state->srv, PORT_NOT_WORKING);
        }
        if (ret == ETIMEDOUT) {
            if (auth_get_server(req) == NULL) {
                tevent_req_error(req, ENOMEM);
            }
            return;
        }

        tevent_req_error(req, ret);
        return;
    }
...

auth_get_server() starts resolving the next server.
Does anyone know of a specific reason why we continue with the next server only on ETIMEDOUT?
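
To make the difference concrete, here is a minimal standalone sketch of the two policies. It is not SSSD code and not the patch attached to this ticket: struct fake_server, enum retry_policy and pick_server() are hypothetical names, and connect results are simulated with a stored errno value instead of a real connection. It assumes that "Connection refused" (111 in the log above) is ECONNREFUSED.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of ldap_chpass_uri. */
struct fake_server {
    const char *host;
    int port;
    int connect_errno;  /* 0 = connect succeeds, else the errno it fails with */
};

/* Which connect errors make us move on to the next server in the list. */
enum retry_policy {
    RETRY_ON_TIMEOUT_ONLY,  /* behaviour of the snippet quoted above */
    RETRY_ON_ANY_ERROR      /* assumed alternative, for comparison only */
};

/*
 * Walk the list in order.
 * Returns the index of the first server that connects, or -1 if we give up.
 * On failure, *last_err holds the errno of the last failed attempt;
 * on success it is reset to 0.
 */
int pick_server(const struct fake_server *servers, size_t count,
                enum retry_policy policy, int *last_err)
{
    assert(servers != NULL && last_err != NULL);

    *last_err = 0;
    for (size_t i = 0; i < count; i++) {
        int err = servers[i].connect_errno;

        if (err == 0) {
            *last_err = 0;
            return (int)i;
        }

        /* Real code would mark this port as 'not working' at this point. */
        *last_err = err;

        if (policy == RETRY_ON_TIMEOUT_ONLY && err != ETIMEDOUT) {
            /* Any error other than a timeout ends the request immediately. */
            return -1;
        }
    }
    return -1;
}
```

With the two URIs from the reproducer (port 12345 refusing connections, port 389 working), RETRY_ON_TIMEOUT_ONLY returns -1 with ECONNREFUSED, which matches the failed password change, while RETRY_ON_ANY_ERROR returns index 1.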

Fields changed

patch: 0 => 1

It is not the same problem as #1680 after all.

Fields changed

resolution: => fixed
status: assigned => closed

Metadata Update from @jhrozek:
- Issue assigned to pbrezina
- Issue set to the milestone: SSSD 1.9.4

3 years ago

SSSD is moving from Pagure to GitHub. This means that new issues and pull requests
will be accepted only in SSSD's GitHub repository.

This issue has been cloned to GitHub and is available here:
- https://github.com/SSSD/sssd/issues/2741

If you want to receive further updates on the issue, please navigate to the GitHub issue
and click the subscribe button.

Thank you for understanding. We apologize for any inconvenience.
