#48208 cleanallruv should completely clean changelog
Closed: wontfix None Opened 8 years ago by mreynolds.

Currently cleanallruv triggers change log trimming, but this potentially does not remove all the changelog entries that came from the invalid rid(or the cleaned rid). There are certain conditions, like the server crashing and the changelog being rescanned, that can cause the db RUV to get polluted with the invalid rid again. The RUV element for the invalid rid is usually missing the URL as well, which causes issues for FreeIPA tools.


Found an issue if we run out of locks, need to redo patch even though I did not run out of locks with this patch.

A very minor thing, but an error message in _cl5PurgeGetFirstEntry prints a different function name.
{{{
3631 slapi_log_error(SLAPI_LOG_FATAL, repl_plugin_name_cl,
3632 "_cl5GetFirstEntry: failed to get entry; db error - %d %s\n",
3633 rc, db_strerror(rc));
}}}

Out of my curiousity... If cl5PurgeGetFirstEntry returns an error and retries, first_pass is supposed to be 0 or 1? In other words, at the first time cursor needs to be set at the first position (DB_NEXT == DB_FIRST) by the line 3591 while ((rc = cursor->c_get(cursor, key, &data, rid?DB_SET:DB_NEXT)) == 0) in _cl5PurgeGetFirstEntry? If it returns an error, the cursor is not set at the first position yet? Do we want to repeat it with first_pass == 1 or we should not?
{{{
3763 /
3764 * Check every changelog entry for the cleaned rid
3765
/
3766 rc = _cl5PurgeGetFirstEntry(obj, &entry, &iterator, txnid, first_pass?0:cleaned_rid, &key);
3767 first_pass = 0;
3768 while (rc == CL5_SUCCESS && !slapi_is_shutting_down()) {
}}}

Otherwise, you have my ack.

two comments:

does this improvement apply only to cleanallruv ? we have seen cases where customer were only successful in issuing "parallel" the simple cleanruv task. I think the cleaning functionality should be the same in both, only the task propagation and management should be in cleanallruv

is this iteration code a duplicate of the changelog iteration for replay, could this be unified ?

Replying to [comment:7 lkrispen]:

two comments:

does this improvement apply only to cleanallruv ? we have seen cases where customer were only successful in issuing "parallel" the simple cleanruv task. I think the cleaning functionality should be the same in both, only the task propagation and management should be in cleanallruv

Good point, this can easily be added to the original CLEANRUV task

is this iteration code a duplicate of the changelog iteration for replay, could this be unified ?

It is not duplicate that I'm aware of. However, I have not looked into this from a replication replay perspective. It looks like the changelog cache(clcache_load_buffer) does use DB_SET to set a starting point, but this needs more investigation and perhaps a new ticket.

Replying to [comment:6 nhosoi]:

A very minor thing, but an error message in _cl5PurgeGetFirstEntry prints a different function name.

Thanks for pointing that out!

{{{
3631 slapi_log_error(SLAPI_LOG_FATAL, repl_plugin_name_cl,
3632 "_cl5GetFirstEntry: failed to get entry; db error - %d %s\n",
3633 rc, db_strerror(rc));
}}}

Out of my curiousity... If cl5PurgeGetFirstEntry returns an error and retries, first_pass is supposed to be 0 or 1? In other words, at the first time cursor needs to be set at the first position (DB_NEXT == DB_FIRST) by the line 3591 while ((rc = cursor->c_get(cursor, key, &data, rid?DB_SET:DB_NEXT)) == 0) in _cl5PurgeGetFirstEntry? If it returns an error, the cursor is not set at the first position yet? Do we want to repeat it with first_pass == 1 or we should not?

If it fails to get the first entry the entire process exits. I'm not sure there is a reason to retry getting the first entry.

{{{
3763 /
3764 * Check every changelog entry for the cleaned rid
3765
/
3766 rc = _cl5PurgeGetFirstEntry(obj, &entry, &iterator, txnid, first_pass?0:cleaned_rid, &key);
3767 first_pass = 0;
3768 while (rc == CL5_SUCCESS && !slapi_is_shutting_down()) {
}}}

Otherwise, you have my ack.

revision #2 - add changelog purging to both cleanruv and cleanallruv
0001-Ticket-48208-CleanAllRUV-should-completely-purge-cha.patch

bca0908..ff1c345 master -> master
commit ff1c345
Author: Mark Reynolds mreynolds@redhat.com
Date: Wed Jul 8 11:48:27 2015 -0400

a741911..9e4cf12 389-ds-base-1.3.4 -> 389-ds-base-1.3.4
commit 9e4cf12

Hi Mark,
Could you please backport the patch to 1.2.11, as well?
Thanks!

4a4a7ed..264f672 389-ds-base-1.3.3 -> 389-ds-base-1.3.3
commit 264f672
Author: Mark Reynolds mreynolds@redhat.com
Date: Wed Jul 8 11:48:27 2015 -0400

8449e3b..98def84 389-ds-base-1.3.2 -> 389-ds-base-1.3.2
commit 98def84

4eb33f3..e8803f5 389-ds-base-1.3.1 -> 389-ds-base-1.3.1
commit e8803f5ad77ec742c57c0121dfc83822633ab602

1e7e99d..a1dc207 389-ds-base-1.2.11 -> 389-ds-base-1.2.11
commit a1dc207

Replying to [comment:13 nhosoi]:

Hi Mark,
Could you please backport the patch to 1.2.11, as well?
Thanks!

Done!

Metadata Update from @mreynolds:
- Issue assigned to mreynolds
- Issue set to the milestone: 1.3.4.2

7 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/1539

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

3 years ago

Login to comment on this ticket.

Metadata