#3214 Connection errors during ipa-replica-manage re-initialize break subsequent re-initialize attempts
Closed: Fixed. Opened 11 years ago by jraquino.

A network failure during a re-initialization of a FreeIPA server results in a broken and empty consumer replica.

Subsequent attempts to re-initialize the consumer fail due to GSSAPI problems.

To work around the issue, it is necessary to change the Replica Agreement on the Supplier to use SIMPLE BIND and TLS, and to change the replica configuration on the Consumer to allow Directory Manager to perform the replication.

After that, you can perform a clean re-initialize. Once it succeeds, you have to revert the Supplier agreement from SIMPLE BIND and TLS to its original settings.

<Network Error during Re-init>
ipa-replica-manage re-initialize --from=ipa-supplier.example.com
ipa: INFO: Setting agreement cn=meToipa-supplier.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config schedule to 2358-2359 0 to force synch
ipa: INFO: Deleting schedule 2358-2359 0 from agreement cn=meToipa-supplier.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
Update in progress
Update in progress
Update in progress
Update in progress
Update in progress
Update in progress
Update in progress
[ipa-supplier.example.com] reports: Update failed! Status: [-5  - System error]

<Subsequent Re-Init Attempt>

ipa-replica-manage re-initialize --from=ipa-supplier.example.com
Directory Manager password:

ipa: INFO: Setting agreement cn=meToipa-consumer.example.com,cn=replica,cn=dc\3Dexpertcity\2Cdc\3Dcom,cn=mapping tree,cn=config schedule to 2358-2359 0 to force synch
ipa: INFO: Deleting schedule 2358-2359 0 from agreement cn=meToipa-consumer.example.com,cn=replica,cn=dc\3Dexpertcity\2Cdc\3Dcom,cn=mapping tree,cn=config
[ipa-supplier.example.com] reports: Update failed! Status: [49  - LDAP error: Invalid credentials]
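
The two failures above can be told apart by the status code in the last line of the output. A minimal Python sketch, assuming the failure line always has the shape shown in these two logs (the helper names `parse_failure` and `needs_bind_workaround` are hypothetical and not part of ipa-replica-manage):

```python
import re

# Matches lines such as:
#   [host] reports: Update failed! Status: [49  - LDAP error: Invalid credentials]
# The pattern is inferred from the two log excerpts in this ticket only.
STATUS_RE = re.compile(
    r"^\[(?P<host>[^\]]+)\] reports: Update failed! "
    r"Status: \[(?P<code>-?\d+)\s+-\s+(?P<message>[^\]]*)\]$"
)


def parse_failure(line):
    """Return (host, code, message) for a failure line, or None if no match."""
    match = STATUS_RE.match(line.strip())
    if match is None:
        return None
    return (
        match.group("host"),
        int(match.group("code")),
        match.group("message").strip(),
    )


def needs_bind_workaround(line):
    """Heuristic: LDAP result code 49 (invalidCredentials) on a retry
    suggests the consumer can no longer map the supplier's principal."""
    parsed = parse_failure(line)
    return parsed is not None and parsed[1] == 49
```

A code of -5 corresponds to the original network failure, while 49 is the symptom that the SIMPLE BIND workaround in this ticket addresses.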

<Supplier LDIF Adjustment>

dn: cn=meToipa-consumer.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
changetype: modify
replace: nsds5replicabinddn
nsds5replicabinddn: cn=directory manager
-
replace: nsds5replicacredentials
nsds5replicacredentials: <Clear Text Directory Manager Password>
-
replace: nsds5replicatransportinfo
nsds5replicatransportinfo: TLS
-
replace: nsds5replicabindmethod
nsds5replicabindmethod: SIMPLE

<Consumer LDIF Cleanup>
dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
changetype: modify
add: nsds5replicabinddn
nsds5replicabinddn: cn=directory manager

<Supplier LDIF Restore Original Settings>
dn: cn=meToipa-consumer.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
changetype: modify
delete: nsds5replicabinddn
-
delete: nsds5replicacredentials
-
replace: nsds5replicatransportinfo
nsds5replicatransportinfo: LDAP
-
replace: nsds5replicabindmethod
nsds5replicabindmethod: SASL/GSSAPI
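
The three LDIF fragments above differ only in host name and suffix, so they could be generated rather than hand-edited. A rough Python sketch, assuming the agreement is always named `cn=meTo<consumer host>` and the suffix is escaped as shown in the logs (all function names here are hypothetical; the attribute names and values are copied from the LDIF above):

```python
def escape_suffix(suffix):
    """Escape '=' and ',' as the mapping tree DNs in this ticket show them."""
    return suffix.replace("=", "\\3D").replace(",", "\\2C")


def agreement_dn(consumer_host, suffix):
    """DN of the supplier's agreement that points at the consumer."""
    return "cn=meTo{0},cn=replica,cn={1},cn=mapping tree,cn=config".format(
        consumer_host, escape_suffix(suffix))


def replica_dn(suffix):
    """DN of the replica entry on the consumer."""
    return "cn=replica,cn={0},cn=mapping tree,cn=config".format(
        escape_suffix(suffix))


def modify_ldif(dn, operations):
    """Render one 'changetype: modify' record.

    operations is a list of (op, attribute, value) tuples; value is None
    when the whole attribute is deleted.
    """
    lines = ["dn: " + dn, "changetype: modify"]
    for index, (op, attr, value) in enumerate(operations):
        if index:
            lines.append("-")  # separator between modifications
        lines.append("{0}: {1}".format(op, attr))
        if value is not None:
            lines.append("{0}: {1}".format(attr, value))
    return "\n".join(lines) + "\n"


def supplier_adjustment(consumer_host, suffix, password):
    """Switch the supplier agreement to SIMPLE bind over TLS."""
    return modify_ldif(agreement_dn(consumer_host, suffix), [
        ("replace", "nsds5replicabinddn", "cn=directory manager"),
        ("replace", "nsds5replicacredentials", password),
        ("replace", "nsds5replicatransportinfo", "TLS"),
        ("replace", "nsds5replicabindmethod", "SIMPLE"),
    ])


def consumer_adjustment(suffix):
    """Allow Directory Manager to act as a replication bind DN."""
    return modify_ldif(replica_dn(suffix), [
        ("add", "nsds5replicabinddn", "cn=directory manager"),
    ])


def supplier_restore(consumer_host, suffix):
    """Put the supplier agreement back on SASL/GSSAPI."""
    return modify_ldif(agreement_dn(consumer_host, suffix), [
        ("delete", "nsds5replicabinddn", None),
        ("delete", "nsds5replicacredentials", None),
        ("replace", "nsds5replicatransportinfo", "LDAP"),
        ("replace", "nsds5replicabindmethod", "SASL/GSSAPI"),
    ])
```

For example, `print(supplier_restore("ipa-consumer.example.com", "dc=example,dc=com"))` reproduces the restore record above; the output would still have to be applied with a tool such as ldapmodify, bound as Directory Manager.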

When the supplier re-initializes the consumer, the first thing it does is map the Kerberos principal name to the LDAP entry, which happens to reside in the replicated area. The next thing it does is wipe out the consumer database. If the re-init fails at that point, the consumer is left with no database, which means the Kerberos principal to entry mapping will fail on every later attempt.
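
The ordering described above is what makes the failure permanent. The following toy model in Python illustrates it; it is only a sketch of the sequence of steps, not how 389 Directory Server is implemented, and the class, function, and return strings are made up for illustration (the strings merely echo the statuses seen in the logs):

```python
class ToyConsumer:
    """Consumer whose principal-to-entry data lives in the replicated area."""

    def __init__(self, entries):
        self.entries = dict(entries)  # principal -> entry DN

    def map_principal(self, principal):
        return self.entries.get(principal)


def reinitialize(consumer, supplier_principal, supplier_entries, network_ok):
    # Step 1: the GSSAPI bind needs the principal to map to an entry,
    # and that entry is stored in the database about to be replaced.
    if consumer.map_principal(supplier_principal) is None:
        return "49 - LDAP error: Invalid credentials"
    # Step 2: the consumer database is wiped before any data arrives.
    consumer.entries.clear()
    # Step 3: the transfer itself; a network failure leaves the wipe in place.
    if not network_ok:
        return "-5 - System error"
    consumer.entries.update(supplier_entries)
    return "ok"
```

Calling `reinitialize` once with `network_ok=False` and then again with `network_ok=True` fails both times in this model: the second call never gets past step 1 because the mapping data was wiped by the first.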

Would it be easier to recreate a package and reinstall?
Should we have an option to clean the replication agreement with a replica if we are told to regenerate the package for an existing but failed replica?

Replying to [comment:2 dpal]:

> Would it be easier to recreate a package and reinstall?

That would work, sure, but it would take a lot longer.

> Should we have an option to clean the replication agreement with a replica if we are told to regenerate the package for an existing but failed replica?

Yes, but this procedure might be easier, and you might run into problems where the replica ID in the RUV is already out "in the wild" and you would have to CLEANALLRUV.

This seems like a good candidate for a troubleshooting guide. It is likely a fairly rare occurrence, but fixing it this way seems preferable to reinstalling the entire replica.

I previously had been doing full re-installs and RUV cleanups of the ghosts, until Rich gave me the better way documented above. It would be a little easier if the ipa-replica-manage tool supported flipping these options, but a troubleshooting guide alone would probably suffice.

It's a very scary condition to find yourself in, so I am glad we have at least captured the exact steps to get out of the hole.

Create a write-up and post it on the wiki or mailing list.

Looking into this issue, it seems a workaround to fix the situation could be to simply change the broken DS configuration so that the SASL mapping actually succeeds and replication can be resumed with a new re-init.

That could be accomplished with a fallback SASL mapping that maps specific principals to entries in cn=config.

We might even code this up at some point into the ipa-replica-manage tool.

Replying to [comment:10 simo]:

> Looking into this issue, it seems a workaround to fix the situation could be to simply change the broken DS configuration so that the SASL mapping actually succeeds and replication can be resumed with a new re-init.

> That could be accomplished with a fallback SASL mapping that maps specific principals to entries in cn=config.

Yes, that will work. As long as the principal maps to some entry.
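
For illustration, such a mapping entry might look like the output of the Python sketch below. This is an assumption-laden example, not a tested procedure: the mapping name, the principal regex, and the target DN are all hypothetical, the target entry would have to exist under cn=config, and how the server orders this mapping relative to the existing ones is not covered here. Only the `nsSaslMapping` object class and its three `nsSaslMap*` attributes are taken from the 389 Directory Server schema.

```python
def fallback_sasl_mapping_ldif(name, principal_regex, target_dn):
    """Render an LDIF 'add' record for a SASL mapping entry.

    name, principal_regex and target_dn are supplied by the caller;
    nothing here checks that target_dn exists or that the regex is valid
    for the regex dialect of the server version in use.
    """
    lines = [
        "dn: cn={0},cn=mapping,cn=sasl,cn=config".format(name),
        "changetype: add",
        "objectClass: top",
        "objectClass: nsSaslMapping",
        "cn: {0}".format(name),
        # Which SASL identities this mapping applies to.
        "nsSaslMapRegexString: {0}".format(principal_regex),
        # Where to search: here a single entry outside the replicated area.
        "nsSaslMapBaseDNTemplate: {0}".format(target_dn),
        # Match whatever entry sits at the base DN.
        "nsSaslMapFilterTemplate: (objectclass=*)",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(fallback_sasl_mapping_ldif(
        "replication-fallback",
        "ldap/ipa-supplier.example.com@EXAMPLE.COM",
        "cn=replication manager,cn=config",
    ))
```

Because the target entry lives in cn=config rather than in the replicated suffix, it would survive the database wipe that breaks the normal principal mapping.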

> We might even code this up at some point into the ipa-replica-manage tool.

Metadata Update from @jraquino:
- Issue assigned to simo
- Issue set to the milestone: FreeIPA 3.1 Stabilization

7 years ago
