#2015 SSSD fails to connect with ipv4_first when on a machine with only IPv6 and server is dual-stack
Opened 5 years ago by stgraber. Modified 9 months ago

I recently started cutting IPv4 access to a bunch of VMs on my network as they no longer required it to function, resulting in those only having IPv6 connectivity.

I had previously confirmed that sssd was indeed properly connecting to my samba4 servers over IPv6 and so expected the switch to go seamlessly, but it didn't.

As far as I can tell, SSSD correctly finds the two samba4 servers from the SRV records and then queries the DNS server for A records, which sure enough returns both IPv4 addresses. It then attempts to contact those and fails (Network is unreachable). Instead of then doing an AAAA query and using the result, sssd appears to just give up.

I see two problems there:
- SSSD should have favored IPv6 to start with (to match the libc's behaviour)
- SSSD shouldn't fail completely when it gets "Network is unreachable"; instead it should move on to the next protocol and try that

In a perfect world, I'd have expected SSSD to first query for AAAA (or just use getaddrinfo?), try to connect to those addresses and, in case of failure, fall back to IPv4.
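As an illustration of the behaviour described above, here is a minimal Python sketch (not SSSD code; SSSD is written in C and uses c-ares). The helper name connect_any is hypothetical. It asks getaddrinfo for both address families and tries each returned address in order, so an unreachable family simply falls through to the next one.

```python
import socket


def connect_any(host, port, timeout=5.0):
    """Try every address getaddrinfo() returns, in order, until one connects.

    Hypothetical helper for illustration only.
    """
    last_error = None
    # AF_UNSPEC requests both IPv6 and IPv4 results; the system resolver
    # decides the order in which they are returned.
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock  # first address that accepts the connection wins
        except OSError as exc:
            # Covers "Network is unreachable", "Connection refused", timeouts:
            # remember the error and move on to the next address.
            last_error = exc
            if sock is not None:
                sock.close()
    raise last_error or OSError("no addresses found for %s" % host)
```

Python's standard socket.create_connection follows the same try-each-address pattern; the loop is spelled out here only to make the fallback step visible.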

I'm attaching a log file containing:
- log output of sssd (-d4 -i)
- proof that getent and netcat both perfectly manage to resolve the same host and connect to it
- tcpdump of the sssd run
- tcpdump of the getent call


Hi, thank you for the bug report.

Did you have a chance to take a look at the lookup_family_order parameter of sssd? The thing is that glibc's interfaces like getaddrinfo are blocking and SSSD is completely asynchronous, so we decided to use the c-ares library instead. The lookup_family_order parameter controls which address families SSSD should use, defaulting to ipv4_first.
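For reference, a minimal sssd.conf fragment that sets this option might look like the following. The domain name example.com is a placeholder; the option goes in the domain section, and the rest of the domain configuration is omitted here.

```ini
[domain/example.com]
# Prefer IPv6 addresses when resolving servers; the default is ipv4_first.
lookup_family_order = ipv6_first
```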

Hi,

I somehow missed lookup_family_order when looking for IPv6-related options yesterday. Setting lookup_family_order = ipv6_first indeed fixes the issue for me.

I still think there's a bug with ipv4_first not working at all on IPv6-only machines, but at least setting the option to ipv6_first gives a working workaround.

Thanks!

According to the logs you sent, you are right. I'll change the title of the ticket. Unfortunately, I don't have any IPv6-only machine to test on at the moment.

Thanks for testing the ipv6_first parameter.

Fields changed

summary: SSSD fails to connect when on a machine with only IPv6 and server is dual-stack => SSSD fails to connect with ipv4_first when on a machine with only IPv6 and server is dual-stack

Fields changed

milestone: NEEDS_TRIAGE => SSSD 1.13 beta
rhbz: => todo

I hit the same problem with the sssd-1.8 series. In our case we have a mix of IPv4/IPv6 and IPv6-only servers, so we cannot deploy the same config on all servers. Is there any quick fix for the broken ipv4_first? Thanks.

With a mixed network (IPv4 and IPv6), the option "lookup_family_order = ipv6_first" doesn't work either: sssd does not connect to the server given in "ldap_uri".
It looks like there is a bunch of complicated issues around lookup_family_order...


Replying to [comment:7 bzhmurov]:

With a mixed network (IPv4 and IPv6), the option "lookup_family_order = ipv6_first" doesn't work either: sssd does not connect to the server given in "ldap_uri".
It looks like there is a bunch of complicated issues around lookup_family_order...

Hi, can you attach (sanitized) logs? Simply put debug_level=9 into the domain section, restart sssd and then attach /var/log/sssd/sssd_$domain.log.

With the log we should at least see what kind of queries the SSSD did.

Here is log file.
http://kernelpanic.ru/sssd.txt

The case is:
1. we have ldap.domain.com.
2. it has IPv4 and IPv6 addresses (A and AAAA records for ldap.domain.com)
3. we have "lookup_family_order = ipv6_first" option enabled
4. on IPv4 + IPv6 clients everything is OK
5. on IPv4-only hosts sssd does not work, because sssd tries to resolve ldap.domain.com, finds
the AAAA (IPv6) record and ignores the A (IPv4) record. But clients that have only an IPv4
address cannot connect to the IPv6 (AAAA) address, so sssd gets:
"connect failed [101][Network is unreachable]"

Ah, thanks for the logs, I see what the problem is now. But frankly I'm not sure it's a bug. The lookup_family_order option only affects name resolution, so when the name has already been resolved successfully but the server is down, we don't retry the second address; the host is simply marked as being down. I think this use case would be better served by using "ipv4_first" or "ipv4_only" rather than by changing the failover mechanism.
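To make the distinction concrete, here is a small Python model of the behaviour described above next to the fallback the reporters are asking for. It is only a sketch based on this comment, not SSSD's implementation: the function names, the record dictionary and the "reachable" set are all hypothetical, and the addresses come from the documentation ranges.

```python
# Address-family preference for each lookup_family_order value.
ORDERS = {
    "ipv4_first": ["A", "AAAA"],
    "ipv4_only": ["A"],
    "ipv6_first": ["AAAA", "A"],
    "ipv6_only": ["AAAA"],
}


def resolve(records, order):
    """Return the addresses of the first family in the order that has records."""
    for rtype in ORDERS[order]:
        addrs = records.get(rtype, [])
        if addrs:
            return addrs
    return []


def connect_as_described(records, order, reachable):
    """Resolve once; if no resolved address connects, the host is marked down."""
    for addr in resolve(records, order):
        if addr in reachable:
            return addr
    return None  # host marked as down, other family never tried


def connect_with_fallback(records, order, reachable):
    """Requested behaviour: on connect failure, also try the other family."""
    for rtype in ORDERS[order]:
        for addr in records.get(rtype, []):
            if addr in reachable:
                return addr
    return None


records = {"A": ["192.0.2.10"], "AAAA": ["2001:db8::10"]}
ipv6_only_client = {"2001:db8::10"}  # addresses this client can reach

print(connect_as_described(records, "ipv4_first", ipv6_only_client))
print(connect_with_fallback(records, "ipv4_first", ipv6_only_client))
```

In this model the first call returns None for a dual-stack server seen from an IPv6-only client, while the second returns the IPv6 address. The same asymmetry appears with ipv6_first on an IPv4-only client.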

So we have to figure out which IP version a server has (IPv4 or IPv6) and change sssd.conf dynamically during installation? Isn't that ugly? And what if we have to migrate from IPv4 to IPv6? We would have to remember to change sssd.conf... In our case (thousands of servers) it kind of looks like hell :(

Fields changed

mark: => 0
owner: somebody => preichl
sensitive: => 0

This requires larger changes in the responder and fail over which are unfortunately out of scope for 1.14 at the moment.

milestone: SSSD 1.14 beta => SSSD 1.15 beta

Metadata Update from @stgraber:
- Issue assigned to preichl
- Issue set to the milestone: SSSD Future releases (no date set yet)

2 years ago

Metadata Update from @jhrozek:
- Custom field design_review reset (from 0)
- Custom field mark reset (from 0)
- Custom field patch reset (from 0)
- Custom field review reset (from 0)
- Custom field sensitive reset (from 0)
- Custom field testsupdated reset (from 0)
- Issue close_status updated to: None

2 years ago

I would say this is a particular case of https://pagure.io/SSSD/sssd/issue/552 but the workaround there makes no sense for this case.

Certainly I'd like to be able to DHCP a machine and not have to configure sssd.conf depending on what comes back.
