#47362 ipa upgrade selinuxusermap data not replicating
Closed: wontfix. Opened 10 years ago by nkinder.

When I upgrade an IPA replica before the master, selinuxusermap entries added
on the master are not replicated to the replica.

Version-Release number of selected component (if applicable):
389-ds-base-1.2.11.15-11.el6.x86_64
ipa-server-3.0.0-26.el6_4.2.x86_64

Steps to Reproduce:

  1. On RHEL6.2, install IPA server, then replica, then client
  2. add repo config pointing to a rhel6.4 repo on all servers
  3. Upgrade IPA client with yum update ipa-server
  4. Upgrade IPA replica with yum update ipa-server
  5. Upgrade IPA master with yum update ipa-server
    On MASTER:
  6. ipa user-add seluser1 --first=sel --last=user1
  7. ipa selinuxusermap-add --hostcat=all --selinuxuser=staff_u:s0-s0:c0.c1023
    serule1
  8. ipa selinuxusermap-add-user --users=seluser1 serule1
    On REPLICA:
  9. ipa selinuxusermap-show serule1

Actual results:
Not able to see the selinuxusermap entry on the replica that was created on the master.

[root@ipaqavmc slapd-TESTRELM-COM]# ipa selinuxusermap-show serule1
ipa: ERROR: serule1: SELinux User Map rule not found

Expected results:
Should see it on replica, same as master:

[root@ipaqavmb slapd-TESTRELM-COM]# ipa selinuxusermap-show serule1
Rule name: serule1
SELinux User: staff_u:s0-s0:c0.c1023
Host category: all
Enabled: TRUE
Users: jordan

The directory server supplier is sending the same changes over twice. The reason is that the RUV returned from the consumer (the "Consumer RUV" in the supplier error log) is bogus: the RUV element for the supplier (rid 4) is empty and even has the wrong port number in the purl. The RUV element for the supplier should contain the max CSN of the most recent changes sent over.
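
For context, each RUV element pairs a replica ID and its purl with the range of CSNs seen from that replica. A rough breakdown of the format that appears in the logs below, using the consumer's rid 3 element as the example (annotation mine, field names approximate):
{{{
{replica <rid> ldap://<host>:<port>}          <min CSN>            <max CSN>            <last modified>
{replica 3 ldap://ipaqavma.testrelm.com:389}  51926887000800030000 51926cc4000000030000 00000000
}}}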

This is a case of a duplicate ADD - the entries were added directly to the replica earlier:

[14/May/2013:12:57:00 -0400] conn=7 op=25 ADD dn="cn=selinux,dc=testrelm,dc=com"
[14/May/2013:12:57:00 -0400] conn=7 op=25 RESULT err=0 tag=105 nentries=0 etime=0 csn=51926cdd000000030000

The replica was unable to send this change to the master because there was a problem with replication:

[14/May/2013:12:57:00 -0400] slapi_ldap_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: error -2 (Local error)
[14/May/2013:12:57:00 -0400] NSMMReplicationPlugin - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): Replication bind with GSSAPI auth failed: LDAP error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Cannot determine realm for numeric host address))

Replication resumes:
[14/May/2013:12:59:12 -0400] NSMMReplicationPlugin - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): Replication bind with GSSAPI auth resumed

Schema replication issue:
[14/May/2013:12:59:12 -0400] NSMMReplicationPlugin - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): Warning: unable to replicate schema: rc=1

The above add, and several other changes, appear to be missing from the supplier RUV:

[14/May/2013:12:59:12 -0400] - _cl5PositionCursorForReplay (agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389)): Supplier RUV:
[14/May/2013:12:59:12 -0400] NSMMReplicationPlugin - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): {replicageneration} 51926881000000040000
[14/May/2013:12:59:12 -0400] NSMMReplicationPlugin - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): {replica 3 ldap://ipaqavma.testrelm.com:389} 51926d51000000030000 51926d61000000030000 51926d60

Note that the min csn in the RUV element for the replica (rid 3) is 51926d51000000030000, which is greater than 51926cdd000000030000. But the consumer has this:

[14/May/2013:12:59:12 -0400] NSMMReplicationPlugin - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): {replica 3 ldap://ipaqavma.testrelm.com:389} 51926887000800030000 51926cc4000000030000 00000000

51926cc4000000030000 is less than 51926cdd000000030000, so the master has not seen that change yet.
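
As a side note on how that ordering works: a CSN string packs a timestamp, sequence number, replica ID, and sub-sequence number into 20 hex digits, and comparison is essentially field-by-field in that order. A minimal standalone sketch, assuming the standard 8/4/4/4 CSN string layout (hypothetical helpers, not the server's csn API):
{{{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* A CSN string is 20 hex digits: 8 timestamp, 4 seqnum, 4 replica id, 4 subseqnum. */
typedef struct {
    unsigned long ts;
    unsigned long seq, rid, subseq;
} csn_parts;

static int parse_csn(const char *s, csn_parts *c)
{
    char buf[9];
    if (strlen(s) != 20) {
        return -1;
    }
    memcpy(buf, s, 8);      buf[8] = '\0'; c->ts     = strtoul(buf, NULL, 16);
    memcpy(buf, s + 8, 4);  buf[4] = '\0'; c->seq    = strtoul(buf, NULL, 16);
    memcpy(buf, s + 12, 4); buf[4] = '\0'; c->rid    = strtoul(buf, NULL, 16);
    memcpy(buf, s + 16, 4); buf[4] = '\0'; c->subseq = strtoul(buf, NULL, 16);
    return 0;
}

/* Order by timestamp, then seqnum, then rid, then subseqnum. */
static int csn_cmp(const csn_parts *a, const csn_parts *b)
{
    if (a->ts != b->ts)         return a->ts < b->ts ? -1 : 1;
    if (a->seq != b->seq)       return a->seq < b->seq ? -1 : 1;
    if (a->rid != b->rid)       return a->rid < b->rid ? -1 : 1;
    if (a->subseq != b->subseq) return a->subseq < b->subseq ? -1 : 1;
    return 0;
}

int main(void)
{
    csn_parts master_max, replica_add;
    parse_csn("51926cc4000000030000", &master_max);  /* consumer's max CSN for rid 3 */
    parse_csn("51926cdd000000030000", &replica_add); /* the ADD made directly on the replica */
    /* prints rid=3 cmp=-1: the master is still behind the ADD */
    printf("rid=%lu cmp=%d\n", master_max.rid, csn_cmp(&master_max, &replica_add));
    return 0;
}
}}}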

The replica attempts to replay these changes to the master:
[14/May/2013:12:59:12 -0400] agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389) - session start: anchorcsn=51926cc4000000030000
[14/May/2013:12:59:12 -0400] agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389) - clcache_load_buffer: rc=-30988
[14/May/2013:12:59:12 -0400] NSMMReplicationPlugin - changelog program - agmt="cn=meToqe-blade-09.testrelm.com" (qe-blade-09:389): CSN 51926aa6000000040000 not found and no purging, probably a reinit

So, for some reason, the replica does not have 51926cc4000000030000 in its changelog. This is a change that originated on the replica itself (rid 3); it is not clear why it cannot be found. Since it cannot be found, the replay session falls back to the min CSN from the supplier RUV, which is 51926d51000000030000, and that skips the changes made earlier.
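
To make the skip concrete, here is a hypothetical paraphrase of that anchor-selection behavior (the names are made up; this is not the real changelog cache code):
{{{
#include <stdio.h>

/* Hypothetical paraphrase of the behaviour described above, not the real
 * clcache code: if the newest CSN the consumer has seen cannot be located in
 * the changelog, replay falls back to the supplier RUV's min CSN, and every
 * change between the two is silently skipped. */
static const char *
pick_anchor_csn(const char *consumer_max_csn,  /* e.g. 51926cc4000000030000 */
                const char *supplier_min_csn,  /* e.g. 51926d51000000030000 */
                int (*changelog_has_csn)(const char *csn))
{
    if (changelog_has_csn(consumer_max_csn)) {
        /* normal case: resume right after the newest change the consumer has */
        return consumer_max_csn;
    }
    /* fallback: start at the supplier RUV's min CSN; anything between the
     * consumer's max CSN and this point is never replayed */
    return supplier_min_csn;
}

/* stub: pretend the changelog no longer contains the consumer's max CSN */
static int missing_from_changelog(const char *csn) { (void)csn; return 0; }

int main(void)
{
    printf("anchor=%s\n",
           pick_anchor_csn("51926cc4000000030000",
                           "51926d51000000030000",
                           missing_from_changelog));
    return 0;
}
}}}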


This causes the changelog to be wiped out:
{{{
[14/May/2013:12:56:50 -0400] NSMMReplicationPlugin - ruv_compare_ruv: the max CSN [51926cc4000000030000] from RUV [changelog max RUV] is larger than the max CSN [] from RUV [database RUV] for element [{replica 3} 51926887000800030000 51926cc4000000030000]
[14/May/2013:12:56:50 -0400] NSMMReplicationPlugin - replica_check_for_data_reload: Warning: data for replica dc=testrelm,dc=com does not match the data in the changelog. Recreating the changelog file. This could affect replication with replica's consumers in which case the consumers should be reinitialized.
}}}
I have no idea why the database RUV is empty. The RUV element in this case is from the changelog RUV.
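
Based only on the log message above, the check appears to behave roughly like the following sketch (hypothetical helper, not the real ruv_compare_ruv / replica_check_for_data_reload code):
{{{
#include <stdio.h>
#include <string.h>

/* Rough sketch of the consistency check implied by the log above: if the
 * changelog's max CSN for a replica is newer than what the database RUV
 * records for that replica, the changelog is considered out of sync and is
 * recreated. An empty database RUV element makes any changelog max CSN look
 * "too new", which is what appears to trigger the wipe here. */
static int
changelog_ahead_of_db(const char *cl_max_csn, const char *db_max_csn)
{
    if (db_max_csn == NULL || *db_max_csn == '\0') {
        return 1;  /* empty database element: the changelog always looks ahead */
    }
    /* CSN strings are fixed-width lowercase hex, so strcmp gives time order */
    return strcmp(cl_max_csn, db_max_csn) > 0;
}

int main(void)
{
    /* values from the log: the changelog has a max CSN, the database element is empty */
    printf("recreate changelog? %d\n",
           changelog_ahead_of_db("51926cc4000000030000", ""));
    return 0;
}
}}}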

The RUV element for replica 4 seems to be changing in two steps:

1] Changing the port to 0:
[14/May/2013:13:10:07 -0400] NSMMReplicationPlugin - agmt="cn=meToipaqavma.testrelm.com" (ipaqavma:389): {replica 4 ldap://qe-blade-09.testrelm.com:0} 51926fef000000040000 51926ff0000400040000 51926fef

2] Removing the CSNs:
[14/May/2013:13:10:13 -0400] NSMMReplicationPlugin - agmt="cn=meToipaqavma.testrelm.com" (ipaqavma:389): {replica 4 ldap://qe-blade-09.testrelm.com:0}

ruv_compare_ruv seems to assume that the RUV elements are in the same order; maybe it compares element 3 against the empty element 4.
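
If that guess is right, the failure mode would look something like this toy comparison (purely illustrative, not the actual ruv_compare_ruv logic):
{{{
#include <stdio.h>

/* Purely illustrative: pairing RUV elements by position rather than looking
 * them up by replica id. With an emptied element in the list (like the rid 4
 * element above), positional pairing ends up comparing CSNs that belong to
 * different replicas. */
typedef struct {
    int rid;
    const char *max_csn;  /* "" for an emptied element */
} ruv_elem;

int main(void)
{
    ruv_elem changelog_ruv[] = {{3, "51926cc4000000030000"}, {4, "51926ff0000400040000"}};
    ruv_elem database_ruv[]  = {{4, ""}, {3, "51926cc4000000030000"}};

    for (int i = 0; i < 2; i++) {
        /* position i on one side is not necessarily the same rid on the other */
        printf("pos %d: changelog rid %d max '%s'  vs  database rid %d max '%s'\n",
               i, changelog_ruv[i].rid, changelog_ruv[i].max_csn,
               database_ruv[i].rid, database_ruv[i].max_csn);
    }
    return 0;
}
}}}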

Why does the port get changed to 0? I remember this could happen when a replica is demoted from master to hub/consumer, but I couldn't find that behavior in RHDS.

I can reproduce the behavior, and it does seem to be related to the fact that the nsslapd-port is changed to 0. MMR constructs a local purl (Partial URL) like this:
{{{
/* in multimaster_set_local_purl(): */
local_purl = slapi_ch_smprintf("ldap://%s:%s", config_get_localhost(), config_get_port());
}}}
Since the port is 0, the purl looks like "ldap://hostname:0".
The code in ruv_init_from_slapi_attr_and_check_purl() tries to make sure the purl in the RUV element matches the server's local purl. In this case it no longer matches, since the port number has changed, so the RUV element is reset and the min and max CSNs are wiped out.

I think in this case, we should check only the hostname, not the port number.
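
A minimal sketch of that host-only comparison (a hypothetical helper, not the committed patch):
{{{
#include <stdio.h>
#include <string.h>
#include <strings.h>

/* Treat two purls as matching when the hostname part agrees, ignoring the
 * port, so "ldap://host:389" still matches "ldap://host:0" after
 * nsslapd-port has been set to 0. */
static int
purl_hosts_match(const char *purl_a, const char *purl_b)
{
    const char *host_a = strstr(purl_a, "://");
    const char *host_b = strstr(purl_b, "://");
    if (host_a == NULL || host_b == NULL) {
        return 0;
    }
    host_a += 3;
    host_b += 3;
    /* the hostname ends at the ':' that introduces the port (or at a '/') */
    size_t len_a = strcspn(host_a, ":/");
    size_t len_b = strcspn(host_b, ":/");
    return len_a == len_b && strncasecmp(host_a, host_b, len_a) == 0;
}

int main(void)
{
    printf("%d\n", purl_hosts_match("ldap://qe-blade-09.testrelm.com:389",
                                    "ldap://qe-blade-09.testrelm.com:0"));  /* prints 1 */
    return 0;
}
}}}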

76c87bd..0c194eb 389-ds-base-1.2.11 -> 389-ds-base-1.2.11
commit 0c194eb
Author: Rich Megginson rmeggins@redhat.com
Date: Wed May 15 19:39:24 2013 -0600
ce102a9..2777aef 389-ds-base-1.3.0 -> 389-ds-base-1.3.0
commit 2777aef
Author: Rich Megginson rmeggins@redhat.com
Date: Wed May 15 19:39:24 2013 -0600
3da40b4..2909b17 389-ds-base-1.3.1 -> 389-ds-base-1.3.1
commit 2909b17
Author: Rich Megginson rmeggins@redhat.com
Date: Wed May 15 19:39:24 2013 -0600
f2b5a97..6236d7a master -> master
commit 6236d7a
Author: Rich Megginson rmeggins@redhat.com
Date: Wed May 15 19:39:24 2013 -0600

Metadata Update from @rmeggins:
- Issue assigned to rmeggins
- Issue set to the milestone: 1.2.11.22

7 years ago

389-ds-base is moving from Pagure to GitHub. This means that new issues and pull requests
will be accepted only in 389-ds-base's GitHub repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/699

If you want to receive further updates on the issue, please navigate to the GitHub issue
and click on the subscribe button.

Thank you for understanding. We apologize for any inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

3 years ago
