#6847 ipa_repl_version hard coded data version and "Incompatible IPA versions, pausing replication. "
Opened 6 years ago by msauton. Modified 6 years ago

We have seen these errors several times on RHEL-7.3 in the LDAP server errors log, with no other related event, while verbose replication logging is disabled:

[28/Mar/2017:22:42:30.764332032 -0700] repl_version_plugin_recv_acquire_cb - [file ipa_repl_version.c, line 119]: Incompatible IPA versions, pausing replication. This server: "20100614120000" remote server: "(null)".

per
./daemons/ipa-slapi-plugins/ipa-version/ipa_repl_version.c
it seems that the master goes into incremental backoff mode after the replica aborts the replication session, so this would not be a fatal replication session error.

    /*
     * This is called on a replica when it receives a start replication
     * extended operation from a master.
     *
     * The data sent by the master (version) is compared with our own
     * hardcoded version to determine if replication can proceed or not.
     *
     * The replication plug-in will take care of freeing data_guid and data.
     *
     * Returning non-0 will abort the replication session.  This
     * results in the master going into incremental backoff mode.
     */
    static int
    repl_version_plugin_recv_acquire_cb(const char *repl_subtree, int is_total,
                                        const char *data_guid, const struct berval *data)
    {

The root cause seems to be the hardcoded data version value in the plug-in:
./daemons/ipa-version.h

    #define DATA_VERSION 20100614120000
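
The log prints the local version as the string "20100614120000", while the header defines it as a number. Below is a minimal sketch of one way such a conversion could be done. format_data_version_sketch is a hypothetical helper, and this is not necessarily how the plug-in builds its string. The value does not fit in 32 bits, hence the cast to unsigned long long.

```c
#include <assert.h>
#include <stdio.h>

/* Same value as in ipa-version.h, repeated here so the sketch stands alone. */
#define DATA_VERSION 20100614120000

/*
 * Hypothetical helper: writes DATA_VERSION into buf as a decimal string.
 * Returns the number of characters written (excluding the NUL), or -1 if
 * the buffer is too small or formatting fails.
 */
static int
format_data_version_sketch(char *buf, size_t buflen)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, buflen, "%llu", (unsigned long long) DATA_VERSION);
    if (n < 0 || (size_t) n >= buflen) {
        return -1;
    }
    return n;
}
```

If the string were built lazily like this, any code path that runs before the conversion would see an unset version, which is one way the race mentioned later in this ticket could arise.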

Although this does not seem to be fatal for the replication sessions themselves, it is logged as a fatal error (LOG_FATAL) by the IPA replication plug-in:
./daemons/ipa-slapi-plugins/ipa-version/ipa_repl_version.c

    if (!(strcmp(data_version, data->bv_val) == 0)) {
        LOG_FATAL("Incompatible IPA versions, pausing replication. "
                  "This server: \"%s\" remote server: \"%s\".\n",
                  data_version, data->bv_val);
        return 1;
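
The remote value of "(null)" in the log suggests that data->bv_val was NULL when it reached strcmp, which is undefined behaviour in C. A minimal sketch of a NULL-guarded comparison follows, as an illustration only. struct berval_sketch, DATA_VERSION_STR and versions_compatible_sketch are hypothetical names, not part of the plug-in, and the sketch assumes bv_val is NUL-terminated, as the excerpt above does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the LDAP struct berval (length + value). */
struct berval_sketch {
    size_t bv_len;
    char *bv_val;
};

/* Same digits as DATA_VERSION in this ticket, written as a string here. */
#define DATA_VERSION_STR "20100614120000"

/*
 * Returns 0 when the versions match and 1 otherwise, mirroring the
 * "non-0 aborts the session" convention of the callback.
 * A missing remote value is rejected before strcmp is ever called.
 */
static int
versions_compatible_sketch(const char *local, const struct berval_sketch *remote)
{
    assert(local != NULL);

    if (remote == NULL || remote->bv_val == NULL) {
        /* No version was received: treat as incompatible, but do not
         * pass NULL to strcmp. */
        return 1;
    }
    return strcmp(local, remote->bv_val) == 0 ? 0 : 1;
}
```

With a guard like this, a missing remote version could be logged separately from a real mismatch. Whether a missing version should abort the session at all is a design decision this sketch does not settle.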

Although this may be ignored, I am not sure what the effect is of never having a full init and only incremental updates; we need at least one full init the first time.
We should probably fix this so that the log does not contain suspicious-looking LDAP replication errors showing a years-old timestamp as the version and a null value for the remote server.
LDAP replication is critical, and this would help eliminate false positives when troubleshooting replication issues.


I opened a Red Hat Bugzilla to track this for RHEL-7.x, but I could not edit the rhbz field in this ticket for customer reference.
bz 1439340 - ipa_repl_version hard coded data version and "Incompatible IPA versions, pausing replication. "

Metadata Update from @pvoborni:
- Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1439340

6 years ago

@tbordaz AFAIK the ipa_repl_version plugin doesn't do anything useful, right? It was built a long time ago to ensure that only compatible IPA versions replicate with each other, but the version was never bumped.

I looked into it and did not find why the versions would not match, since the value is hardcoded on both sides. It could be a race between the version initialization and the first agreement opening a replication session.

But since the error is not fatal and replication resumes, and since we want to remove the plugin, I did no further investigation.

Metadata Update from @pvoborni:
- Issue set to the milestone: Future Releases

6 years ago
