#48854 Running db2index with no options breaks replication
Closed: wontfix None Opened 7 years ago by nhosoi.

Description of problem:

It was found that running db2index with no command line arguments results in
breaking replication by changing the slave's RUV tombstone.

Version-Release number of selected component (if applicable):
389-ds-base-1.2.11.15-32.el6_5.x86_64 and
389-ds-base-1.2.11.15-32.1.el6_5.bug1009122.x86_64

How reproducible:

This happened on 10+ slaves in our production environment; however, it did not
occur in pre-prod environments.  In production, I noticed some DB errors that
may be related:

[08/Oct/2014:14:47:55 -0400] upgrade DB - userRoot: Start upgradedb.
[08/Oct/2014:14:47:55 -0400] - WARNING: Import is running with
nsslapd-db-private-import-mem on; No other process is allowed to access the
database
[08/Oct/2014:14:47:55 -0400] - reindex userRoot: Index buffering enabled with
bucket size 100
[08/Oct/2014:14:47:56 -0400] entryrdn-index - entryrdn_lookup_dn: Failed to
position cursor at the key: P2992: DB_PAGE_NOTFOUND: Requested page not
found(-30986)
[08/Oct/2014:14:48:12 -0400] - reindex userRoot: WARNING: Skipping entry
"nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff" which has no parent, ending at
line 119705 of file "id2entry.db4"
[08/Oct/2014:14:48:12 -0400] - reindex userRoot: WARNING: bad entry: ID 119705
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Workers finished; cleaning
up...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Workers cleaned up.
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Cleaning up producer thread...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Indexing complete.
Post-processing...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Generating numSubordinates
complete.
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Flushing caches...
[08/Oct/2014:14:48:14 -0400] - reindex userRoot: Closing files...
[08/Oct/2014:14:48:16 -0400] - All database threads now stopped
[08/Oct/2014:14:48:16 -0400] - reindex userRoot: Reindexing complete.
Processed 126506 entries (1 were skipped) in 21 seconds. (6024.10 entries/sec)
[08/Oct/2014:14:48:16 -0400] - All database threads now stopped

I also noticed the tool logging 'upgrade DB - userRoot: Start upgradedb.'; that
text should probably be changed to 'Start indexing'.
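
The skipped-entry warning in the log above appears to be the visible symptom of the problem: the entry being skipped carries the nsuniqueid of the RUV tombstone. As a minimal sketch (not part of 389-ds-base), the following Python helper checks a captured error log for that warning. The function name `ruv_tombstone_skipped` and the sample text are illustrative assumptions; only the message wording is taken from the log quoted in this ticket.

```python
import re

# nsuniqueid of the RUV tombstone entry, as shown in this ticket.
RUV_UNIQUEID = "ffffffff-ffffffff-ffffffff-ffffffff"


def ruv_tombstone_skipped(log_text):
    """Return True if the log reports that reindex skipped the RUV tombstone.

    The log quoted in the ticket is hard-wrapped, so all whitespace is
    collapsed to single spaces before matching.
    """
    flat = " ".join(log_text.split())
    pattern = r'Skipping entry\s+"nsuniqueid=' + re.escape(RUV_UNIQUEID) + r'"'
    return re.search(pattern, flat, re.IGNORECASE) is not None


# Excerpt copied from the log in this ticket, including its line wrapping.
SAMPLE_LOG = """\
[08/Oct/2014:14:48:12 -0400] - reindex userRoot: WARNING: Skipping entry
"nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff" which has no parent, ending at
line 119705 of file "id2entry.db4"
[08/Oct/2014:14:48:12 -0400] - reindex userRoot: WARNING: bad entry: ID 119705
"""

if __name__ == "__main__":
    print(ruv_tombstone_skipped(SAMPLE_LOG))
```

A check like this could be run against the errors log after a reindex and before restarting the server, to spot affected replicas early.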


Steps to Reproduce:
1. service dirsrv stop
2. /usr/lib64/dirsrv/slapd-*/db2index
3. service dirsrv start

Actual results:

Replication fails with the following error on the master:

[08/Oct/2014:16:09:09 -0400] NSMMReplicationPlugin -
agmt="cn=FQDM" (ldap02:636): Replica has a different
generation ID than the local data.


Expected results:

Replication to resume once dirsrv is started.


Additional info:
Master:
$ ldapsearch -xLLL -h FQDN -D "uid=manager" -W -b dc=test,dc=com
'(&(objectclass=nstombstone)(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff))'
Enter LDAP Password:
dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=test,dc=com
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 51154ed3003d00010000
nsds50ruv: {replica 1 ldap://FQDN:389} 51154ed3003e00010000 56edd86c000000010000
nsds50ruv: {replica 2 ldap://FQDN:389} 51159de3000000020000 56ed65d1000200020000
dc: test
nsruvReplicaLastModified: {replica 1 ldap://FQDN:389} 5435b09b
nsruvReplicaLastModified: {replica 2 ldap://FQDN:389} 00000000

Slave:
$ ldapsearch -xLLL -h FQDN -D "uid=manager" -W -b dc=test,dc=com
'(&(objectclass=nstombstone)(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff))'
Enter LDAP Password:
dn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff,dc=test,dc=com
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 56ed21150001ffff0000
nsds50ruv: {replica 2 ldap://FQDN:389}
nsds50ruv: {replica 1 ldap://FQDN:389}
dc: test
nsruvReplicaLastModified: {replica 2 ldap://FQDN:389} 00000000
nsruvReplicaLastModified: {replica 1 ldap://FQDN:389} 00000000

Reinitializing the consumer restored replication.
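
The difference between the two RUV entries above can be checked mechanically. The following Python sketch parses ldapsearch LDIF output, extracts the `{replicageneration}` value, and lists replica elements that carry no CSNs. The function names and the embedded samples are illustrative assumptions; the attribute values are copied from the output in this ticket.

```python
import re

# Generation line of an RUV tombstone, e.g.
# "nsds50ruv: {replicageneration} 51154ed3003d00010000"
GENERATION_RE = re.compile(
    r"^nsds50ruv:[ \t]*\{replicageneration\}[ \t]*(\S+)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

# Per-replica line; group 1 is the replica ID, group 2 is whatever follows
# the closing brace (the CSNs, if any).
REPLICA_RE = re.compile(
    r"^nsds50ruv:[ \t]*\{replica[ \t]+(\d+)[ \t]+[^}\n]*\}(.*)$",
    re.IGNORECASE | re.MULTILINE,
)


def replica_generation(ldif_text):
    """Return the replica generation ID found in the LDIF text, or None."""
    match = GENERATION_RE.search(ldif_text)
    return match.group(1) if match else None


def replicas_without_csns(ldif_text):
    """Return the replica IDs whose RUV element has no CSNs after the brace."""
    return [
        int(rid)
        for rid, rest in REPLICA_RE.findall(ldif_text)
        if not rest.strip()
    ]


def generations_match(master_ldif, slave_ldif):
    """Return True only if both sides report the same generation ID."""
    master_gen = replica_generation(master_ldif)
    return master_gen is not None and master_gen == replica_generation(slave_ldif)


# Samples copied from the ldapsearch output in this ticket.
MASTER_LDIF = """\
nsds50ruv: {replicageneration} 51154ed3003d00010000
nsds50ruv: {replica 1 ldap://FQDN:389} 51154ed3003e00010000 56edd86c000000010000
nsds50ruv: {replica 2 ldap://FQDN:389} 51159de3000000020000 56ed65d1000200020000
"""

SLAVE_LDIF = """\
nsds50ruv: {replicageneration} 56ed21150001ffff0000
nsds50ruv: {replica 2 ldap://FQDN:389}
nsds50ruv: {replica 1 ldap://FQDN:389}
"""

if __name__ == "__main__":
    print(generations_match(MASTER_LDIF, SLAVE_LDIF))
    print(replicas_without_csns(SLAVE_LDIF))
```

On the samples above, the generation IDs differ and both replica elements on the slave are empty, which is consistent with the "different generation ID" error reported on the master.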

Actually, it is a bug in import_foreman which was not adjusted when the backend RUV entry was redesigned.

Provided there are no other IDs that match ffffff...., this is okay. Ack from me.

Reviewed by William (Thank you!!)

Pushed to master:
2e5f0ff..ba3b844 master -> master
commit ba3b844

Pushed to 389-ds-base-1.3.4:
b69db2a..14bc6ce 389-ds-base-1.3.4 -> 389-ds-base-1.3.4
commit 14bc6ce

Pushed to 389-ds-base-1.2.11:
6111400..d45d040 389-ds-base-1.2.11 -> 389-ds-base-1.2.11
commit d45d040

Metadata Update from @nhosoi:
- Issue assigned to nhosoi
- Issue set to the milestone: 1.2.11.33

7 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/1914

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

3 years ago
