Around 50% of attempts to set up a replica of a FreeIPA master with the topology plugin enabled (domain level set to 1.0) end with the following error message on standard output:
[error] RuntimeError: One of the ldap service principals is missing.
Replication agreement cannot be converted.
Replication error message: Unable to acquire replicaLDAP error: No such
(description taken from Oleg Fayans' mail)
A brief investigation suggests the following:
It seems that data from the master was replicated to the new replica, but the new replica's entries (host, services) were not replicated back to the master.
The installation then hangs while the replica checks whether its LDAP service principal is present on the master.
Given that this happens only with domain level 1 (which enables the topology plugin), the topology plugin is the probable culprit.
Just to confirm the scenario:
So far I haven't run into this error in my own tests. Could you provide the DS access and error logs for a good and a bad run?
I will try to reproduce it again, but maybe there is something visible in the logs.
Linked to Bugzilla bug: https://bugzilla.redhat.com/show_bug.cgi?id=1199516 (Red Hat Enterprise Linux 7)
Investigating the environment provided by PetrV, I found two issues which could be related:
1] slapd crashes at shutdown.
The reason is that the topology plugin is stopped before the replication plugin. The topo plugin frees its allocated data structures, but the repl plugin still writes changes to the agreements to dse.ldif; in the preop the topo plugin is called again and tries to access the (freed) structures.
This has to be prevented: the topo plugin has to be set to inactive in the close function.
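To illustrate the intended guard, here is a minimal sketch in Python rather than the plugin's real C code. The class, the `active` flag, the method names and the agreement name are all hypothetical; they only model the ordering problem described above, in which a callback still arrives after the plugin has released its data.

```python
class TopologyPlugin:
    """Toy model of a plugin with start/close hooks and a pre-operation callback.

    All names are hypothetical; this is not the 389 DS plugin API.
    """

    def __init__(self):
        self.active = False
        self.segments = None  # stands in for the plugin's allocated data

    def start(self):
        self.segments = {}
        self.active = True

    def close(self):
        # Mark the plugin inactive first, then release its data, so that
        # any later callback can tell it must not touch the data.
        self.active = False
        self.segments = None

    def pre_op(self, agreement_name):
        # Called whenever an agreement is about to be written.
        if not self.active:
            # Without this check the code below would use released data.
            return "skipped"
        self.segments[agreement_name] = self.segments.get(agreement_name, 0) + 1
        return "checked"


plugin = TopologyPlugin()
plugin.start()
first = plugin.pre_op("agreement-to-master")

# Shutdown: the topology plugin is closed while the replication plugin
# is still running and still writes its agreement.
plugin.close()
late = plugin.pre_op("agreement-to-master")
```

In this model the late callback returns early instead of touching the released data, which is the behaviour the close function is supposed to guarantee.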
2] Replication fails from the replica to the master.
The failure is that the agreement is set to use GSSAPI but the KDC is not running.
Installation fails when trying to set up the KDC: the LDAP principal is added to the database and should be replicated to the master, but since the KDC is not there yet, this fails.
And since the principal will not be found on the master, setup of the KDC also fails.
The question is: why is the agreement set to use GSSAPI before the KDC is set up?
It could be an effect of the crashes at shutdown, when at startup the topology plugin updates a possibly incomplete agreement.
Next step: fix the shutdown crash and repeat the test scenario. If the problem still exists, trace changes to the repl agreement to detect when and why GSSAPI is enabled.
The two issues are not related. The replication failure is caused by the topology plugin startup code.
It checks whether segments and agreements match and updates managed agreements from the segments (segments have priority). But for the bind method a default of GSSAPI was always set, so the agmt was changed to use GSSAPI before GSSAPI could work.
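The startup behaviour can be sketched as follows. This is a simplified Python model, not the plugin's C code and not the actual patch, which is not shown in this ticket. The dictionary keys, the function names and the host name are hypothetical. The second function shows one possible way to avoid the problem: only values the segment explicitly carries override the agreement.

```python
DEFAULT_BIND_METHOD = "GSSAPI"


def merge_always_default(agreement, segment):
    """Model of the problem: a segment without a bind method still
    forces the default onto the agreement."""
    merged = dict(agreement)
    merged.update(segment)  # segments have priority
    merged["bind_method"] = segment.get("bind_method", DEFAULT_BIND_METHOD)
    return merged


def merge_only_explicit(agreement, segment):
    """Possible alternative (an assumption): only attributes actually
    present in the segment override the agreement."""
    merged = dict(agreement)
    merged.update(segment)
    return merged


# Agreement created during replica installation, before the KDC is running.
agreement = {"host": "master.example.test", "bind_method": "SIMPLE"}
# Segment that says nothing about the bind method.
segment = {"host": "master.example.test"}

buggy = merge_always_default(agreement, segment)
fixed = merge_only_explicit(agreement, segment)
```

In this model `buggy["bind_method"]` becomes `GSSAPI` even though the segment never asked for it, while `fixed` keeps the bind method the agreement was created with.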
A patch for both issues is available.
How to verify?
There was no systematic reproducer, problems occurred during normal replica installation and shutdown of directory server.
So I think a test could only do a sanity check:
Metadata Update from @pvoborni:
- Issue assigned to lkrispen
- Issue set to the milestone: FreeIPA 4.2