#1259 Replicated CRL generation
Closed: migrated 2 years ago by dmoluguw. Opened 7 years ago by edewata.

Currently the CRL generation is only done by the master and there can be only a single master in the whole system. If the master becomes unavailable, another replica will have to be promoted to become a new master.

This mechanism has some issues:

  • The load is not evenly distributed.
  • There is a single point of failure in the system.
  • The failure may not be detected immediately.
  • The replica promotion is not automatic.
  • The CRL may not get updated for some time.

Ideally the master and replicas should be identical so there is no single point of failure. In that case the CRL generation should be done on all replicas. However, there are some concerns about that:

  • performance requirement
  • CRL inconsistency

About the performance requirement, assuming the replicas are using comparable machines there should not be an issue providing the same level of performance. If CRL generation is considered burdensome, it is an issue that has to be addressed separately regardless of this ticket.

About the CRL inconsistency, it's true that each replica may generate slightly different CRLs depending on the data it has at the time. However, CRLs are updated periodically, so assuming the database replication works fine, all certificate revocations will eventually be propagated to all replicas and will be included in the next CRL update.

There may be some other concerns which require further discussions.

References:

Proposed milestone: 10.3


Regarding inclusion of an indicator of which replica generated a particular
CRL, we can use the Issuer Alternative Name extension for that:
http://tools.ietf.org/html/rfc5280#section-5.2.2

This is the long-term approach for the FreeIPA seamless CRL master handling. It can be done in a later release than #1262.

To make sure the CRL number is monotonically increasing as required by the spec, the CRL number should be generated based on the timestamp of the scheduled CRL generation time. That way a client can retrieve the CRL from any replica and should not see conflicting CRLs with the same CRL number but different contents.
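
A minimal sketch of how such a number could be derived, not the Dogtag implementation. The function name `crl_number_for`, the use of Unix epoch seconds, and the rounding down to the start of the update interval are all assumptions made for illustration; the ticket only says the number should be based on the scheduled generation time.

```python
from datetime import datetime, timezone


def crl_number_for(scheduled: datetime, interval_seconds: int) -> int:
    """Derive a CRL number from the scheduled generation time (hypothetical helper).

    The number is the Unix timestamp of the schedule slot, so every replica
    that runs the same schedule computes the same number for the same slot,
    and later slots always get larger numbers.
    """
    if scheduled.tzinfo is None:
        raise ValueError("scheduled time must be timezone-aware")
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    epoch_seconds = int(scheduled.timestamp())
    # Assumption: round down to the slot boundary so that replicas whose
    # timers fire a few seconds apart still agree on the number.
    return epoch_seconds - (epoch_seconds % interval_seconds)


# Two replicas firing 7 seconds apart within the same 4-hour slot.
slot = datetime(2015, 2, 18, 12, 0, 0, tzinfo=timezone.utc)
late = datetime(2015, 2, 18, 12, 0, 7, tzinfo=timezone.utc)
same_number = crl_number_for(slot, 4 * 3600) == crl_number_for(late, 4 * 3600)
```

This only makes the numbers agree across replicas. It does not by itself guarantee that two CRLs carrying the same number have the same contents, since that depends on replication having caught up.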

Per CS/DS meeting of 02/16/2015: 10.3

Related FreeIPA ticket: https://fedorahosted.org/freeipa/ticket/4911

Comments from latest discussion:

>> On Wed, 2015-02-18 at 13:30 +0100, Petr Spacek wrote:
...
>> I don't think we should include replica ID in the CRL number because the CRL
>> number should be a sequence number. If it includes replica ID, an application
>> switching between replicas may see the CRL number going up and down, it's no
>> longer monotonically increasing as required by the spec. If we need the
>> replica ID for debugging, we probably can put it in another field, but not in
>> CRL number.
>
> IMHO it is not a problem because from the client's point of view these are simply
> different CRL versions with different content.
>
> A properly implemented client should notice 'I have seen this' and try again
> later. Please keep in mind that this 'collision' can happen only if both CRLs
> were generated in the same second!
>
> I believe that this is a better option than having multiple CRLs with the same
> CRL number but different content, which is inevitable without replica-ID in
> the CRL number (unless we change the replication protocol).

Yes and no Petr.

The problem here is that replica #2 at second N+1 may generate a CRL
that has *fewer* revocations than replica #1 did at second N.

The client would take replica #2 CRL (higher timestamp) and still lack
some of the revoked certs.
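
The scenario can be illustrated with a small sketch. The `Crl` type, the `pick_latest` helper, the CRL number, and the serial numbers are all made up for illustration; the point is only that "highest CRL number wins" can select the CRL with fewer revocations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crl:
    number: int          # timestamp-based CRL number
    revoked: frozenset   # serial numbers of revoked certificates


def pick_latest(crls):
    """Model a client that keeps whichever CRL has the highest number."""
    return max(crls, key=lambda crl: crl.number)


n = 1424262600  # arbitrary example value for "second N"

# Replica #1 already knows about the revocation of serial 0x12.
replica1 = Crl(number=n, revoked=frozenset({0x10, 0x11, 0x12}))
# Replica #2 generates one second later, before 0x12 has been replicated to it.
replica2 = Crl(number=n + 1, revoked=frozenset({0x10, 0x11}))

chosen = pick_latest([replica1, replica2])
# Revocations the client loses by preferring the higher-numbered CRL.
missing = replica1.revoked - chosen.revoked
```

Here `chosen` is the CRL from replica #2, and `missing` contains serial 0x12, which the client would continue to treat as valid until a later CRL includes it.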

For CRL consistency, we either need to generate them only on one
server, or, as you say, we need to augment the replication protocol such
that a server needs to get confirmation that all the revoked certs have been
replicated before generating a CRL, or wait. The latter would be highly
problematic should one of the CRL-generating replicas fail to respond
(because it is down or unreachable).

I think for now we should just stick with generating CRLs on one server,
and maybe add diagnostics on other IPA servers to check and warn if the
CRL generator is unreachable.
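
A rough sketch of what such a diagnostic might look like, assuming a plain TCP connection attempt is an acceptable reachability test. The function names, the host name, and the port in the usage note are hypothetical and are not part of FreeIPA or Dogtag.

```python
import socket


def crl_generator_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to the CRL generator can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def reachability_warning(host, port, timeout=3.0):
    """Return a warning string if the generator is unreachable, else None."""
    if crl_generator_reachable(host, port, timeout):
        return None
    return f"WARNING: CRL generator {host}:{port} is unreachable"
```

A periodic job on each non-generating server could call, for example, `reachability_warning("ca-master.example.test", 8443)` and log the result when it is not `None`. A successful TCP connection only shows the host is up; it does not prove that CRLs are actually being generated on schedule.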

Aside:
Perhaps we could make it so the other IPA servers can read the CRL and
serve it back on their own? That would help with load balancing for
very many clients.

Simo.

Metadata Update from @edewata:
- Issue set to the milestone: UNTRIAGED

5 years ago

Dogtag PKI is moving from Pagure issues to GitHub issues. This means that existing or new
issues will be reported and tracked through Dogtag PKI's GitHub Issue tracker.

This issue has been cloned to GitHub and is available here:
https://github.com/dogtagpki/pki/issues/1821

If you want to receive further updates on the issue, please navigate to the
GitHub issue and click the Subscribe button.

Thank you for understanding, and we apologize for any inconvenience.

Metadata Update from @dmoluguw:
- Issue close_status updated to: migrated
- Issue status updated to: Closed (was: Open)

2 years ago
