In a replication environment, if the changelog db file name contains the extension string more than once, the changelog file gets recreated when we run db2ldif and ldif2db on the master/hub instance.
Ex: d736e482-198111e1-8d7bedb4-8c53b85f_502ce263000000020000.db4 In this file name "db4" appears twice: once as the extension and once inside the replica name string ("8d7bedb4").
There is a logic problem in the function below, which checks whether the file name ends with the extension. It calls strstr() to search for "ext", which returns the first occurrence of "ext" in the file name. If "ext" occurs more than once in the file name, the function always returns false, which results in multiple changelog db files being created.
==== filename: cl5_api.c
/* - return 1: true (the "filename" ends with "ext")
   - return 0: false */
static int
_cl5FileEndsWith(const char *filename, const char *ext)
{
    char *p = NULL;
    int flen = strlen(filename);
    int elen = strlen(ext);
    if (0 == flen || 0 == elen) {
        return 0;
    }
    p = strstr(filename, ext);
    if (NULL == p) {
        return 0;
    }
    if (p - filename + elen == flen) {
        return 1;
    }
    return 0;
}
=====
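For illustration, the arithmetic can be checked with a small standalone sketch. `first_match_offset` is a hypothetical helper, not part of cl5_api.c, and "db4" is assumed to be the value passed as "ext". For the example file name, the first match is at offset 23, so p - filename + elen is 26 while flen is 60, and the check fails even though the name does end with "db4" (at offset 57).

```c
#include <string.h>

/* Hypothetical helper (not in cl5_api.c): offset of the FIRST occurrence
 * of ext inside filename, or -1 when ext does not occur at all.
 * This is the position that the strstr() call in the check works from. */
static long
first_match_offset(const char *filename, const char *ext)
{
    const char *p = strstr(filename, ext);
    return (p == NULL) ? -1L : (long)(p - filename);
}
```

Calling first_match_offset() with the example file name and "db4" gives the offset of the match inside "8d7bedb4", not the offset of the real extension.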
I have modified this function to fix this issue. Could you please verify the same and include the fix in the master branch?
/* - return 1: true (the "filename" ends with "ext")
   - return 0: false */
static int
_cl5FileEndsWith(const char *filename, const char *ext)
{
    char *p = NULL;
    int flen = strlen(filename);
    int elen = strlen(ext);
    if (0 == flen || 0 == elen) {
        return 0;
    }
    p = strstr(filename, ext);
    if (NULL == p) {
        return 0;
    }

    /* try every occurrence of "ext", not only the first one */
    do {
        if (p - filename + elen == flen) {
            return 1;
        }
        p = strstr(p+elen, ext);
    } while ( p != NULL );
    return 0;
}
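As a possible alternative, offered only as a sketch and not as the fix that was applied, the suffix can be compared directly against the tail of the file name without searching at all. `ends_with` is a hypothetical name. Comparing the tail also handles an extension that can overlap with itself (for example "aa" in "aaa"), which a loop that restarts the search at p+elen would miss; that cannot happen with "db4".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alternative (not the code in cl5_api.c):
 * - return 1: true (the "filename" ends with "ext")
 * - return 0: false
 * Only the last strlen(ext) bytes of filename are examined, so earlier
 * occurrences of ext in the name cannot affect the result. */
static int
ends_with(const char *filename, const char *ext)
{
    size_t flen = strlen(filename);
    size_t elen = strlen(ext);

    if (flen == 0 || elen == 0 || elen > flen) {
        return 0;
    }
    return memcmp(filename + (flen - elen), ext, elen) == 0;
}
```

With the example file name and "db4" this returns 1, and it returns 0 for a name such as "abc.db4.bak".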
Thanks and Regards, Jyoti
Hi,
Can anyone please verify this fix?
Thanks in advance.
Regards, Jyoti
Here is the current status
{{{
from: c7c6377c-196e11e3-831c8895-1f2ce016
to:   c7c6377c-196e11e3-831c88db-1f2ce016
(change '95' -> 'db' in the 3rd component)
}}}
Then I started DS and can see these logs:
{{{
[09/Sep/2013:16:37:34 +0200] NSMMReplicationPlugin - changelog program - _cl5AppInit: fetched backend dbEnv (1efff10)
[09/Sep/2013:16:37:34 +0200] NSMMReplicationPlugin - changelog program - _cl5DBOpen: opened 0 existing databases in /var/lib/dirsrv/slapd-master/changelogdb
[09/Sep/2013:16:37:51 +0200] NSMMReplicationPlugin - replica_add_by_dn: added dn (dc=com)
[09/Sep/2013:16:37:51 +0200] NSMMReplicationPlugin - _replica_configure_ruv: No ruv tombstone found for replica dc=com. Created a new one
[09/Sep/2013:16:37:51 +0200] NSMMReplicationPlugin - replica_delete_by_dn: removed dn (dc=com)
[09/Sep/2013:16:37:51 +0200] NSMMReplicationPlugin - changelog program - _cl5GetDBFile: no DB object found for database /var/lib/dirsrv/slapd-master/changelogdb/4ade9183-195d11e3-831cdb94-1f2ce016_522ddd3f000000010000.db
[09/Sep/2013:16:37:51 +0200] NSMMReplicationPlugin - changelog program - cl5GetOperationCount: could not get DB object for replica
[09/Sep/2013:16:37:51 +0200] NSMMReplicationPlugin - changelog program - _cl5GetDBFile: no DB object found for database /var/lib/dirsrv/slapd-master/changelogdb/4ade9183-195d11e3-831cdb94-1f2ce016_522ddd3f000000010000.db
[09/Sep/2013:16:37:51 +0200] NSMMReplicationPlugin - changelog program - cl5GetOperationCount: could not get DB object for replica
[09/Sep/2013:16:37:51 +0200] NSMMReplicationPlugin - changelog program - _cl5GetDBFile: no DB object found for database /var/lib/dirsrv/slapd-master/changelogdb/4ade9183-195d11e3-831cdb94-1f2ce016_522ddd3f000000010000.db
}}}
I was unsure of the reported test case. In fact, apart from those errors, db2ldif (master) followed by ldif2db (hub) worked, and after a restart replication was also running well.
I created a test case where replication skips updates. I do not know if it is the reported issue, but it is the one I will use as a test case.
{{{
Create Master, C1, C2
Update nsDS5ReplicaName on Master, so that it contains 'db' (my database suffix. It can be db3 or db4).
Create user t1
Create user t2
<check replication is working>
Stop C2
Create user t3
<check t3 is replicated on C1>
Stop Master, C1
export Master (-r)
import C1 (this step can likely be skipped)
Start Master, C1, C2
Create user t4

-> On Master: t1, t2, t3, t4
-> On Cons.1: t1, t2, t3, t4
-> On Cons.2: t1, t2, t4
}}}
{{{
dbid: 0000006f000000000000
    entry count: 7

dbid: 000000de000000000000
    purge ruv:
        {replicageneration} 522dfa79000000010000
        {replica 1 ldap://pctbordaz.redhat.com:47489}
dbid: 0000014d000000000000
    max ruv:
        {replicageneration} 522dfa79000000010000
        {replica 1} 522dfb19000000010000 522dfd1b000000010000
dbid: 522dfb19000000010000
    uniqueid: 31464581-196f11e3-831cdb94-1f2ce016
    dn: uid=t1,dc=com
    operation: add
dbid: 522dfb38000000010000
    uniqueid: 31464582-196f11e3-831cdb94-1f2ce016
    dn: uid=t2,dc=com
    operation: add
dbid: 522dfb6c000000010000    <<<<<< broken entry
    uniqueid: 00000000-00000000-00000000-00000000
    dn: cn=start iteration
    operation: delete
dbid: 522dfcc6000000010000
    uniqueid: 2809a881-197011e3-831cdb94-1f2ce016
    dn: uid=t4,dc=com
    operation: add
}}}
Here are the next steps
- I will verify the fix
attachment 0001-Ticket-47489-Under-specific-values-of-nsDS5ReplicaNa.patch
Can we get this fix into RHEL 6.5? Does this affect 389-ds-base-1.2.11?
Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1007452
Thanks Rich for the review.
At the source level, it applies on 1.2.11. I will test and confirm if I can reproduce on 1.2.11
I confirm the same bug applies to 389-ds-base-1.2.11. I can reproduce the skipped updates with the same test case. The only difference is that in 1.2.11 the database suffix is 'db4', so 'nsDS5ReplicaName' must contain 'db4' to reproduce the issue.
Push to master:
git merge ticket47489
Updating b73f1e8..7a7609d
Fast-forward
 ldap/servers/plugins/replication/cl5_api.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
git push origin master
Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 1.05 KiB, done.
Total 7 (delta 5), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   b73f1e8..7a7609d  master -> master
commit 7a7609d
Author: Thierry bordaz (tbordaz) tbordaz@redhat.com
Date: Wed Sep 11 11:08:58 2013 +0200

389-ds-base-1.3.1 branch: commit ac8aad8
389-ds-base-1.2.11 branch: commit f944cd0
Metadata Update from @tbordaz: - Issue assigned to tbordaz - Issue set to the milestone: 1.3.2 - 09/13 (September)
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/826
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: Fixed)