#7642 Installation fails: Replica Busy
Closed: fixed 5 years ago Opened 5 years ago by cheimes.

On PR-CI, replication installation sometimes fails with this error message:

...
 [27/41]: ignore time skew for initial replication
 [28/41]: setting up initial replication
Starting replication, please wait until this has completed.
[ldap://master.ipa.test:389] reports: Replica Busy! Status: [Error (1) Replication error acquiring replica: replica busy]
 [error] RuntimeError: Failed to start replication

http://freeipa-org-pr-ci.s3-website.eu-central-1.amazonaws.com/jobs/14bf089e-86c9-11e8-9cea-fa163ecbac3f/

debug logs

2018-07-13T18:28:34Z DEBUG Restart of dirsrv@IPA-TEST.service complete
2018-07-13T18:28:34Z DEBUG Created connection context.ldap2_140070792037488
2018-07-13T18:28:34Z DEBUG Fetching nsDS5ReplicaId from master [attempt 1/5]
2018-07-13T18:28:34Z DEBUG retrieving schema for SchemaCache url=ldap://master.ipa.test:389 conn=<ldap.ldapobject.SimpleLDAPObject object at 0x7f64c333dfd0>
2018-07-13T18:28:34Z DEBUG Successfully updated nsDS5ReplicaId.
2018-07-13T18:28:34Z DEBUG Add or update replica config cn=replica,cn=dc\=ipa\,dc\=test,cn=mapping tree,cn=config
2018-07-13T18:28:34Z DEBUG Added replica config cn=replica,cn=dc\=ipa\,dc\=test,cn=mapping tree,cn=config
2018-07-13T18:28:34Z DEBUG Fetching nsDS5ReplicaId from master [attempt 1/5]
2018-07-13T18:28:34Z DEBUG Successfully updated nsDS5ReplicaId.
2018-07-13T18:28:34Z DEBUG Add or update replica config cn=replica,cn=dc\=ipa\,dc\=test,cn=mapping tree,cn=config
2018-07-13T18:28:34Z DEBUG Added replica config cn=replica,cn=dc\=ipa\,dc\=test,cn=mapping tree,cn=config
2018-07-13T18:28:34Z DEBUG Waiting for replication (ldap://master.ipa.test:389) cn=meToreplica0.ipa.test,cn=replica,cn=dc\=ipa\,dc\=test,cn=mapping tree,cn=config (objectclass=*)
2018-07-13T18:28:34Z DEBUG Entry found [LDAPEntry(ipapython.dn.DN('cn=meToreplica0.ipa.test,cn=replica,cn=dc\=ipa\,dc\=test,cn=mapping tree,cn=config'), {'objectClass': [b'nsds5replicationagreement', b'top'], 'cn': [b'meToreplica0.ipa.test'], 'nsDS5ReplicaHost': [b'replica0.ipa.test'], 'nsDS5ReplicaPort': [b'389'], 'nsds5replicaTimeout': [b'120'], 'nsDS5ReplicaRoot': [b'dc=ipa,dc=test'], 'description': [b'me to replica0.ipa.test'], 'nsDS5ReplicatedAttributeList': [b'(objectclass=*) $ EXCLUDE memberof idnssoaserial entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount'], 'nsDS5ReplicaTransportInfo': [b'LDAP'], 'nsDS5ReplicaBindMethod': [b'SASL/GSSAPI'], 'nsds5ReplicaStripAttrs': [b'modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp'], 'nsDS5ReplicatedAttributeListTotal': [b'(objectclass=*) $ EXCLUDE entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount'], 'nsds5replicareapactive': [b'0'], 'nsds5replicaLastUpdateStart': [b'19700101000000Z'], 'nsds5replicaLastUpdateEnd': [b'19700101000000Z'], 'nsds5replicaChangesSentSinceStartup': [b''], 'nsds5replicaLastUpdateStatus': [b'Error (0) No replication sessions started since server startup'], 'nsds5replicaUpdateInProgress': [b'FALSE'], 'nsds5replicaLastInitStart': [b'19700101000000Z'], 'nsds5replicaLastInitEnd': [b'19700101000000Z']})]
2018-07-13T18:28:34Z DEBUG Waiting for replication (ldapi://%2fvar%2frun%2fslapd-IPA-TEST.socket) cn=meTomaster.ipa.test,cn=replica,cn=dc\=ipa\,dc\=test,cn=mapping tree,cn=config (objectclass=*)
2018-07-13T18:28:34Z DEBUG Entry found [LDAPEntry(ipapython.dn.DN('cn=meTomaster.ipa.test,cn=replica,cn=dc\=ipa\,dc\=test,cn=mapping tree,cn=config'), {'objectClass': [b'nsds5replicationagreement', b'top'], 'cn': [b'meTomaster.ipa.test'], 'nsDS5ReplicaHost': [b'master.ipa.test'], 'nsDS5ReplicaPort': [b'389'], 'nsds5replicaTimeout': [b'120'], 'nsDS5ReplicaRoot': [b'dc=ipa,dc=test'], 'description': [b'me to master.ipa.test'], 'nsDS5ReplicatedAttributeList': [b'(objectclass=*) $ EXCLUDE memberof idnssoaserial entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount'], 'nsDS5ReplicaTransportInfo': [b'LDAP'], 'nsDS5ReplicaBindMethod': [b'SASL/GSSAPI'], 'nsds5ReplicaStripAttrs': [b'modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp'], 'nsDS5ReplicatedAttributeListTotal': [b'(objectclass=*) $ EXCLUDE entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount'], 'nsds5replicareapactive': [b'0'], 'nsds5replicaLastUpdateStart': [b'19700101000000Z'], 'nsds5replicaLastUpdateEnd': [b'19700101000000Z'], 'nsds5replicaChangesSentSinceStartup': [b''], 'nsds5replicaLastUpdateStatus': [b'Error (0) No replication sessions started since server startup'], 'nsds5replicaUpdateInProgress': [b'FALSE'], 'nsds5replicaLastInitStart': [b'19700101000000Z'], 'nsds5replicaLastInitEnd': [b'19700101000000Z']})]
2018-07-13T18:28:36Z DEBUG Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ipaserver/install/service.py", line 605, in start_creation
    run_step(full_msg, method)
  File "/usr/lib/python3.6/site-packages/ipaserver/install/service.py", line 591, in run_step
    method()
  File "/usr/lib/python3.6/site-packages/ipaserver/install/dsinstance.py", line 456, in __setup_replica
    cacert=self.ca_file
  File "/usr/lib/python3.6/site-packages/ipaserver/install/replication.py", line 1782, in setup_promote_replication
    raise RuntimeError("Failed to start replication")
RuntimeError: Failed to start replication

2018-07-13T18:28:36Z DEBUG   [error] RuntimeError: Failed to start replication

DS error log on replica

13/Jul/2018:18:28:34.691953812 +0000] - INFO - slapd_daemon - slapd started.  Listening on All Interfaces port 389 for LDAP requests
[13/Jul/2018:18:28:34.693728066 +0000] - INFO - slapd_daemon - Listening on /var/run/slapd-IPA-TEST.socket for LDAPI requests
[13/Jul/2018:18:28:35.038954814 +0000] - ERR - NSMMReplicationPlugin - acquire_replica - agmt="cn=meTomaster.ipa.test" (master:389): Unable to acquire replica: permission denied. The bind dn "" does not have permission to supply replication updates to the replica. Will retry later.
[13/Jul/2018:18:28:38.059021906 +0000] - WARN - NSMMReplicationPlugin - repl5_inc_run - agmt="cn=meTomaster.ipa.test" (master:389): The remote replica has a different database generation ID than the local database.  You may have to reinitialize the remote replica, or the local replica.

DS error log on master

[13/Jul/2018:18:28:35.238947347 +0000] - NOTICE - NSMMReplicationPlugin - changelog program - _cl5ConstructRUV - Rebuilding the replication changelog RUV, this may take several minutes...
[13/Jul/2018:18:28:35.239557039 +0000] - NOTICE - NSMMReplicationPlugin - changelog program - _cl5ConstructRUV - Rebuilding replication changelog RUV complete.  Result 0 (Success)
[13/Jul/2018:18:28:35.240613626 +0000] - NOTICE - NSMMReplicationPlugin - changelog program - _cl5ConstructRUV - Rebuilding the replication changelog RUV, this may take several minutes...
[13/Jul/2018:18:28:35.241164005 +0000] - NOTICE - NSMMReplicationPlugin - changelog program - _cl5ConstructRUV - Rebuilding replication changelog RUV complete.  Result 0 (Success)
[13/Jul/2018:18:28:35.306506766 +0000] - ERR - NSMMReplicationPlugin - release_replica - agmt="cn=meToreplica0.ipa.test" (replica0:389): Unable to parse the response  to the endReplication extended operation.
[13/Jul/2018:18:28:35.326941492 +0000] - ERR - NSMMReplicationPlugin - multimaster_extop_StartNSDS50ReplicationRequest - conn=72 op=5 replica="dc=ipa,dc=test": Unable to acquire replica: error: permission denied
[13/Jul/2018:18:28:35.405220318 +0000] - NOTICE - NSMMReplicationPlugin - acquire_replica - agmt="cn=meToreplica0.ipa.test" (replica0:389): Unable to acquire replica: the replica is currently being updatedby another supplier.

@tbordaz mentioned:

I think this error is a consequence https://pagure.io/freeipa/issue/7617 . Recent versions of 389-ds (>=1.4.0.12) will workaround this issue as they contain a fix (https://pagure.io/389-ds-base/issue/49818) that limit the impact of https://pagure.io/freeipa/issue/7617.

In conclusion, I think running PR-CI with last version of 389-ds (>= 1.4.0.12) will fix this problem.

As of today, latest package available in the repositories is 389-ds-base-1.4.0.11-2.fc28.x86_64.

It seems this will be fixed when the new package lands the repos.

Currently PR-CI is using 389-ds-base-1.4.0.16-1.fc28.x86_64.

Recent pull requests don't seem to be failing because of this anymore.

@abiagion , @cheimes, shouldn't we close this ticket since it is fixed with https://pagure.io/389-ds-base/issue/49818 that is now available.

freeipa.spec.in should be updated to require 1.4.0.16-1. It is currently set to 1.4.0.8-1.

Metadata Update from @frenaud:
- Issue assigned to frenaud

5 years ago

Metadata Update from @frenaud:
- Custom field on_review adjusted to https://github.com/freeipa/freeipa/pull/2433

5 years ago

Setting milestone to ipa 4.7.2 as the fix would apply to this version, too (not only master).

Metadata Update from @frenaud:
- Issue set to the milestone: FreeIPA 4.7.2

5 years ago

master:

  • 3eac88a Bump requires 389-ds-base

ipa-4-7:

  • d0978c8 Bump requires 389-ds-base

ipa-4-6:

  • 18fa0d7 Bump requires 389-ds-base
  • 2ba908d Adapt backport to ipa-4-6 branch

Metadata Update from @frenaud:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

5 years ago

Milestone adjusted to 4.6.5 as the fix was also backported to ipa-4-6 branch.

Metadata Update from @frenaud:
- Issue set to the milestone: FreeIPA 4.6.5 (was: FreeIPA 4.7.2)

5 years ago

Login to comment on this ticket.

Metadata