https://bugzilla.redhat.com/show_bug.cgi?id=787743 (Red Hat Directory Server)
Description of problem:
When upgrading from:
(including all appropriate dependencies)
In an MMR (multi-master replication) environment, we have seen 4 cases of existing password
policies no longer being "sticky". After the upgrade, the password policies
seem to "disappear" from the MMR environment. They can be re-added, either by
CLI or console, but they then "disappear" again. Standard logging shows
nothing. The one customer that did enable debug level logging found that the
issue "went away": the password policy stayed in the MMR environment as long
as debug levels were enabled.
3 of the customers who have reported this basically "started over" with their
environment before any concrete diagnosis was made.
Note: this is not bz732153; that has been verified.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
This has never been reproduced by FL, TAM, or SEG in many attempts.
Based on the most recent log analysis, what may be happening is this:
All of these steps use the console:
1) admin creates a per-user password policy - verify that an external ldapsearch on the user entry returns the pwdpolicysubentry attribute
2) admin creates a subtree password policy at the level of the subtree containing the user entry - verify that an external ldapsearch on the user entry does not return pwdpolicysubentry, or returns the one from the subtree policy instead of the per-user policy entry
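The "external ldapsearch" check in the steps above could look roughly like the following. This is only a sketch: the user DN, bind DN, and server URI are hypothetical placeholders, and it assumes the OpenLDAP `ldapsearch` client. Since pwdpolicysubentry is an operational attribute, it is requested by name.

```python
import shlex


def build_policy_check(user_dn, bind_dn, uri="ldap://localhost:389"):
    """Return an ldapsearch argv that reads pwdpolicysubentry from one entry.

    All argument values are placeholders supplied by the caller; nothing
    here is taken from the customer's environment.
    """
    return [
        "ldapsearch", "-x", "-LLL",
        "-H", uri,
        "-D", bind_dn, "-W",          # simple bind, prompt for the password
        "-b", user_dn, "-s", "base",  # read only the user entry itself
        "(objectClass=*)",
        "pwdpolicysubentry",          # operational attribute, must be named
    ]


# Hypothetical user and bind DN, for illustration only.
cmd = build_policy_check(
    "uid=jdoe,ou=People,dc=example,dc=com",
    "cn=Directory Manager",
)
print(shlex.join(cmd))
```

Run the resulting command after step 1 and again after step 2, and compare the pwdpolicysubentry value (or its absence) between the two runs.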
I cannot reproduce the issue with the latest console/server...
I found the code that is doing the deletes:
// Remove the common spec entry
ldc.delete( _policyspecdn );
// Remove the subtree CoS entries
ldc.delete( _costemplatedn );
ldc.delete( _cosspecdn );
// Delete container entry
ldc.delete( _containerdn );
[30/Sep/2011:09:07:41 +0200] conn=525097 op=49 DEL dn="cn=cn\5c3DnsPwPolicyEntry\5c2Cou\5c3DPeople\5c2Co\5c3Drabobank\5c2Cc\5c3Dnl,cn=nsPwPolicyContainer,ou=People,o=rabobank,c=nl"
[30/Sep/2011:09:07:41 +0200] conn=525097 op=50 DEL dn="cn=cn\5c3DnsPwTemplateEntry\5c2Cou\5c3DPeople\5c2Co\5c3Drabobank\5c2Cc\5c3Dnl,cn=nsPwPolicyContainer,ou=People,o=rabobank,c=nl"
[30/Sep/2011:09:07:41 +0200] conn=525097 op=51 DEL dn="cn=nsPwPolicy_CoS,ou=People,o=rabobank,c=nl"
[30/Sep/2011:09:07:41 +0200] conn=525097 op=52 DEL dn="cn=nsPwPolicyContainer,ou=People,o=rabobank,c=nl"
On the latest console this only happens in one place in the code, and it can only be called when you turn password policy on or off. I haven't been able to trigger this delete operation, though. I'm going to see if the code was different in version 8.2. Maybe we are doing an additional delete somewhere else where we shouldn't be.
I reproduced the problem, but it's not a bug (yet).
Go to "manage password policy" on a subtree, uncheck the "Fine-Grained Subtree Policy enabled" checkbox, and then save. This deletes the CoS and password policy entries. If you "enable" it again, it adds the policies right back.
My access log shows the exact same deletes as the customer's, and when I re-enabled it, it added the entries/policy back:
[06/Mar/2012:15:30:02 -0500] conn=1 op=212 DEL dn="cn=cn\3DnsPwPolicyEntry\2Cou\3Dtest2\2Cou\3DPeople\2Cdc\3Dexample\2Cdc\3Dcom,cn=nsPwPolicyContainer,ou=test2,ou=People,dc=example,dc=com"
[06/Mar/2012:15:30:02 -0500] conn=1 op=213 DEL dn="cn=cn\3DnsPwTemplateEntry\2Cou\3Dtest2\2Cou\3DPeople\2Cdc\3Dexample\2Cdc\3Dcom,cn=nsPwPolicyContainer,ou=test2,ou=People,dc=example,dc=com"
[06/Mar/2012:15:30:02 -0500] conn=1 op=214 DEL dn="cn=nsPwPolicy_CoS,ou=test2,ou=People,dc=example,dc=com"
[06/Mar/2012:15:30:02 -0500] conn=1 op=215 DEL dn="cn=nsPwPolicyContainer,ou=test2,ou=People,dc=example,dc=com"
[06/Mar/2012:15:30:47 -0500] conn=1 op=220 ADD dn="cn=nsPwPolicyContainer,ou=test2,ou=People,dc=example,dc=com"
[06/Mar/2012:15:30:47 -0500] conn=1 op=221 ADD dn="cn=cn\3DnsPwPolicyEntry\2Cou\3Dtest2\2Cou\3DPeople\2Cdc\3Dexample\2Cdc\3Dcom,cn=nsPwPolicyContainer,ou=test2,ou=People,dc=example,dc=com"
[06/Mar/2012:15:30:47 -0500] conn=1 op=222 ADD dn="cn=cn\3DnsPwTemplateEntry\2Cou\3Dtest2\2Cou\3DPeople\2Cdc\3Dexample\2Cdc\3Dcom,cn=nsPwPolicyContainer,ou=test2,ou=People,dc=example,dc=com"
[06/Mar/2012:15:30:47 -0500] conn=1 op=223 ADD dn="cn=nsPwPolicy_CoS,ou=test2,ou=People,dc=example,dc=com"
The only way to trigger this is from the "subtree" password policy panel, by unchecking the "Fine-Grained Subtree Policy enabled" checkbox. The customer must be doing this (accidentally?).
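To check a customer's access log for the same four-delete pattern, something like the following sketch may help. It assumes the access-log line format shown in the excerpts above; the function names and the sample list are made up for illustration.

```python
import re
from collections import defaultdict

# Matches access-log lines of the form quoted in this ticket:
# [timestamp] conn=N op=N DEL|ADD dn="..."
LINE_RE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] conn=(?P<conn>\d+) op=(?P<op>\d+) '
    r'(?P<verb>DEL|ADD) dn="(?P<dn>[^"]*)"'
)


def policy_deletes(lines):
    """Group DEL operations on password policy entries by connection."""
    hits = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group("verb") != "DEL":
            continue
        dn = m.group("dn")
        if "nsPwPolicyContainer" in dn or "nsPwPolicy_CoS" in dn:
            hits[int(m.group("conn"))].append(
                (m.group("ts"), int(m.group("op")), dn)
            )
    return dict(hits)


def looks_like_console_disable(dns):
    """True if policy, template, CoS and container entries were all deleted."""
    kinds = set()
    for dn in dns:
        if dn.startswith("cn=nsPwPolicyContainer,"):
            kinds.add("container")
        elif dn.startswith("cn=nsPwPolicy_CoS,"):
            kinds.add("cos")
        elif "nsPwTemplateEntry" in dn:
            kinds.add("template")
        elif "nsPwPolicyEntry" in dn:
            kinds.add("policy")
    return kinds == {"container", "cos", "template", "policy"}


# Lines copied from the test-system log above (one ADD included as noise).
SAMPLE = [
    r'[06/Mar/2012:15:30:02 -0500] conn=1 op=212 DEL dn="cn=cn\3DnsPwPolicyEntry\2Cou\3Dtest2\2Cou\3DPeople\2Cdc\3Dexample\2Cdc\3Dcom,cn=nsPwPolicyContainer,ou=test2,ou=People,dc=example,dc=com"',
    r'[06/Mar/2012:15:30:02 -0500] conn=1 op=213 DEL dn="cn=cn\3DnsPwTemplateEntry\2Cou\3Dtest2\2Cou\3DPeople\2Cdc\3Dexample\2Cdc\3Dcom,cn=nsPwPolicyContainer,ou=test2,ou=People,dc=example,dc=com"',
    r'[06/Mar/2012:15:30:02 -0500] conn=1 op=214 DEL dn="cn=nsPwPolicy_CoS,ou=test2,ou=People,dc=example,dc=com"',
    r'[06/Mar/2012:15:30:02 -0500] conn=1 op=215 DEL dn="cn=nsPwPolicyContainer,ou=test2,ou=People,dc=example,dc=com"',
    r'[06/Mar/2012:15:30:47 -0500] conn=1 op=220 ADD dn="cn=nsPwPolicyContainer,ou=test2,ou=People,dc=example,dc=com"',
]

for conn, ops in policy_deletes(SAMPLE).items():
    dns = [dn for _, _, dn in ops]
    print(conn, len(ops), looks_like_console_disable(dns))
```

A connection that matches all four entry types is only a hint that the console checkbox was used. It does not prove which client issued the deletes.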
Any news on this? Can it be closed?
Customer is saying they aren't using the console, but all the logs I've seen show the console doing the deletes.
Asked for a fresh set of access logs, still waiting...
I'm working on setting up a system to work on this as well. While looking through the source code I found where we look for the pwdpolicysubentry entry. If you turn on trace function calls logging, we should get some very valuable information if the customer reproduces the issue.
If the customer can do this, please pass the error log on to me.
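For reference, a sketch of how trace function call logging might be turned on. It assumes the documented 389 DS error log levels (1 for trace function calls, 16384 for the default) and the nsslapd-errorlog-level attribute on cn=config; please verify both against the documentation for the installed version. The helper name is made up.

```python
# Assumed values from the 389 DS error-log-level table; verify before use.
TRACE_FUNCTION_CALLS = 1
DEFAULT_LEVEL = 16384


def errorlog_level_ldif(level):
    """Render an LDIF modify that sets nsslapd-errorlog-level on cn=config."""
    return "\n".join([
        "dn: cn=config",
        "changetype: modify",
        "replace: nsslapd-errorlog-level",
        f"nsslapd-errorlog-level: {level}",
        "",
    ])


# Feed this to ldapmodify to enable tracing ...
print(errorlog_level_ldif(TRACE_FUNCTION_CALLS))
# ... and this one to go back to the default afterwards.
print(errorlog_level_ldif(DEFAULT_LEVEL))
```

Trace logging is very verbose, so it should be switched back to the default level once the error log has been captured.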
I was finally able to reproduce the problem; it took 30 minutes running 4 "clients".
Once I was able to reproduce the errors, I stopped all the clients and then ran a single add, and it still failed. So it takes load to trigger the issue, but once it happens, the DS is hosed until a restart.
The cos cache is probably the culprit, but I need to reproduce this with a debug build to find out exactly what is triggering the problem. Could be several things, need to investigate further.
reproducible testcase - Note: run bulkregister_clear instead of bulkregister
I wrote a test fix for 8.2, which is actually the cos plugin from the master branch backported to 8.2. So far the customer, support, and I have not been able to reproduce the problem with the new plugin.
So this issue appears to be fixed in the latest version of DS.
I have been looking at the bugs that have been fixed since 8.2 (using git log), but I cannot pinpoint a single fix that addresses this. There are many Coverity fixes. This means that the official customer fix would require backporting the entire cos plugin from master. It would be nice to just backport one change/diff, but I'm not sure it's worth all the work of testing each fix to find out which one(s) solved the issue. I will discuss this further with the team.
For now we know we have a fix, and we are moving forward.
Can we close this ticket since the issue doesn't affect 1.2.11?
originally targeted for 1.2.11.rc1, but actually in the 1.2.11.a1 release
I have run the acceptance tests and valgrind against 8.2 with the hotfix. Everything has passed. Hotfix can be used until the next errata can be installed.
Closing ticket as "worksforme" since this has been fixed in the latest version.
Added initial screened field value.
Metadata Update from @mreynolds:
- Issue assigned to mreynolds
- Issue set to the milestone: 1.2.11.a1
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here:
If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Invalid)