#47988 Schema learning mechanism, in replication, unable to extend an existing definition
Closed: wontfix None Opened 6 years ago by tbordaz.

This bug was discovered in IPA upgrade 3.3 -> 4.1 (https://bugzilla.redhat.com/show_bug.cgi?id=1176995)

The problem is that when a supplier or consumer learns a superset definition it uses a MOD-ADD (see modify_schema_prepare_mods). But if the definition already existed, the internal modify will fail with LDAP_TYPE_OR_VALUE_EXISTS.

[16/Jan/2015:05:35:08 -0500] conn=Internal op=-1 MOD dn="cn=schema"
[16/Jan/2015:05:35:08 -0500] conn=Internal op=-1 RESULT err=20 tag=48 nentries=0 etime=0

In fact it should user MOD-DEL (old definition) + MOD-ADD (new definition).

A condition for this bug to create a replication issue is that the schema that is the superset of the other must have a 'nsschemaCSN' that is lower than the other schema

To reproduce, Create a two suppliers S1 and S2. Stop S2 and update S1 schema with:

( 2.16.840.1.113730.3.8xxxx NAME 'oc1' AUXILIARY MUST cn MAY ( description $ roomnumber ) X-ORIGIN ( 'Test' 'user defined' ) )

Stop S1 and update S2 schema with

( 2.16.840.1.113730.3.8xxxx NAME 'oc1' AUXILIARY MUST cn MAY ( description ) X-ORIGIN ( 'Test' 'user defined' ) )

Start S1 and S2. Do an update on any entry on S2.

So S1 schema > S2 schema, but the nsschemaCSN on S2 > S1.

So S2 will retrieve S1 schema, will detect that 'oc1' needs to be learned and will try to update its schema with this definition


The patch looks good to me. But I have 2 questions. :)
1) Did you run valgrind with your test? I'm especially curious about this +128 bytes, which ensures it's always enough...
6341 schema_oc_to_string(struct objclass oc)
6367 size += 128; /
for all keywords: NAME, DESC, SUP... */
Plus, in modify_schema_prepare_mods, you pass in object->old_value and object->new_value to berval and free bvps_del and bvps_add arrays. That looks correct. I just want to have the confirmation the function does not miss anything. :)

2) You switch the behaviour whether the op is internal or not. The "internal" is meant to be a "replicated" operation? I wonder if it could be an OP_FLAG_REPLICATED flag? (Or it behaves totally different?) My concern is the notion of "internal" sounds broader than what we are trying to solve. In other words, if we could say there is no "internal" schema delete operation other than this schema replication, I have no objection.

Replying to [comment:4 nhosoi]:

The patch looks good to me. But I have 2 questions. :)
1) Did you run valgrind with your test? I'm especially curious about this +128 bytes, which ensures it's always enough...
6341 schema_oc_to_string(struct objclass oc)
6367 size += 128; /
for all keywords: NAME, DESC, SUP... */
Plus, in modify_schema_prepare_mods, you pass in object->old_value and object->new_value to berval and free bvps_del and bvps_add arrays. That looks correct. I just want to have the confirmation the function does not miss anything. :)

Noriko, thank you so much for having look at that fix. You are right, I remember wanting to improve this but finally did not make it.

I did not run with valgrind and I will do it tomorrow. Now it depends how complex is the definition. This size can be computed with the sum of all the possible keywords supported by the function, but extra room could prevent a heap corruption.

2) You switch the behaviour whether the op is internal or not. The "internal" is meant to be a "replicated" operation? I wonder if it could be an OP_FLAG_REPLICATED flag? (Or it behaves totally different?) My concern is the notion of "internal" sounds broader than what we are trying to solve. In other words, if we could say there is no "internal" schema delete operation other than this schema replication, I have no objection.

This does not happen during a replicated operation. The replica agreements reads the remote schema, compares it with the local schema and if some definitions need to be updated (learn) it does an internal operation.
I may create a new flag if internal operation is not restrictive enough.

Regaring the internal flag, if it is exactly what you are looking for, no need to create a new flag.

Thank you for answering my questions, Thierry.

Running the test case with valgrind does not raise any heap corruption. I had to adapt the replication timeout because running with valgrind slow down significantly the servers.
I will do some slight changes around the string size (compute the size of the fixed sized keywords) and commit the change.

Thanks you again Noriko for this review :-)

Thank you for running more tests, Thierry! I'm so relieved to hear valgrind did not report any memory issues.

Please go ahead and push the patch to master and 1.3.3!

Pushed to master on behalf of Thierry.
d1f75af..51e05df master -> master
commit 51e05df

Pushed to 389-ds-base-1.3.3:
6431142..1073168 389-ds-base-1.3.3 -> 389-ds-base-1.3.3
commit 1073168

Unfortunately, in schema_delete_objectclasses and schema_delete_attributes, ((mod->mod_bvalues == NULL) && is_internal_operation) case does not do anything but return SUCCESS (instead of UNWILLING_TO_PERFORM)? Is it expected?

Hello Noriko,

Sorry for this final rush, I was hoping to clear/push on monday but I understand it was not possible.
You are right this part of the fix is problematic. Removing all values attributeypes/objectclasses should return an UNWILLING TO PERFORM even in case of internal op.

Hopefully this is something that 'learning schema mechanism' is not doing, it just add and possibly delete specific values.

At this time I do not know how to handle this. Should I clear this part of the fix and push again or is it too late ?

Push of the test case

Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (9/9), done.
Writing objects: 100% (9/9), 186.08 KiB, done.
Total 9 (delta 2), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
cfa8e4d..868ac8b master -> master

Push testcase in 1.3.3

git push origin 389-ds-base-1.3.3
Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (9/9), done.
Writing objects: 100% (9/9), 186.87 KiB, done.
Total 9 (delta 3), reused 1 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
1bf51f8..50f3f5a 389-ds-base-1.3.3 -> 389-ds-base-1.3.3

Metadata Update from @tbordaz:
- Issue assigned to tbordaz
- Issue set to the milestone: 1.3.3 backlog

3 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/1319

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

4 months ago

Login to comment on this ticket.

Metadata