#8210 Nightly test failure in test_integration/test_advise.py::TestAdvice::test_advice_config_server_for_smart_card_auth : timeout waiting for KDC cert
Closed: duplicate 2 months ago by frenaud. Opened 2 months ago by frenaud.

The nightly test fedora-30/test_advise failed in test_integration/test_advise.py::TestAdvice::test_advice_config_server_for_smart_card_auth in PR #4289 with the following logs.

The test had 2 failures:
- during the ipa-server-install phase, the KDC cert could not be issued:

2020-02-29T15:48:57Z WARNING PKINIT certificate request failed: Certificate issuance failed (TIMEOUT: None)
2020-02-29T15:48:57Z WARNING Failed to configure PKINIT
2020-02-29T15:48:57Z DEBUG Full PKINIT configuration did not succeed
2020-02-29T15:48:57Z DEBUG The setup will only install bits essential to the server functionality
2020-02-29T15:48:57Z DEBUG You can enable PKINIT after the setup completed using 'ipa-pkinit-manage'
  • Then during the test, ipa-advise config_server_for_smart_card_auth failed in the Pkinit enable phase:
[ipatests.pytest_ipa.integration.host.Host.master.cmd86] Configuring Kerberos KDC (krb5kdc)
[ipatests.pytest_ipa.integration.host.Host.master.cmd86]   [1/1]: installing X509 Certificate for PKINIT
[ipatests.pytest_ipa.integration.host.Host.master.cmd86] PKINIT certificate request failed: Certificate issuance failed (TIMEOUT: None)
[ipatests.pytest_ipa.integration.host.Host.master.cmd86] Failed to configure PKINIT
[ipatests.pytest_ipa.integration.host.Host.master.cmd86] Full PKINIT configuration did not succeed
[ipatests.pytest_ipa.integration.host.Host.master.cmd86] The setup will only install bits essential to the server functionality
[ipatests.pytest_ipa.integration.host.Host.master.cmd86] You can enable PKINIT after the setup completed using 'ipa-pkinit-manage'
[ipatests.pytest_ipa.integration.host.Host.master.cmd86]   [error] RuntimeError: Certificate issuance failed (TIMEOUT: None)
[ipatests.pytest_ipa.integration.host.Host.master.cmd86] Certificate issuance failed (TIMEOUT: None)
[ipatests.pytest_ipa.integration.host.Host.master.cmd86] The ipa-pkinit-manage command failed.
[ipatests.pytest_ipa.integration.host.Host.master.cmd86] Failed to issue PKINIT certificates to local KDC

A quick look at the journal shows that the cert was issued but stuck in post_save. There is no log showing that the krb5kdc service was restarted, either certmonger did not call the post_save command or it got stuck waiting for the renewal lock.
In the 2nd try, a different error is seen:

Feb 29 15:52:23 master.ipa.test certmonger[19477]: 2020-02-29 15:52:23 [23620] Setting "CERTMONGER_CA_PROFILE" to "KDCs_PKINIT_Certs" for child.
Feb 29 15:52:23 master.ipa.test certmonger[19477]: 2020-02-29 15:52:23 [23620] Redirecting stdin and stderr to /dev/null, leaving stdout open for child "/usr/libexec/certmonger/ipa-server-guard".
Feb 29 15:57:33 master.ipa.test certmonger[19477]: 2020-02-29 15:57:33 [23640] error:0D06407A:asn1 encoding routines:a2d_ASN1_OBJECT:first num too large

I think it really did just timeout for some reason, I assume lock contention on the renewal lock.

The ipa-server-install step timed out at 15:48:57

2020-02-29T15:48:57Z WARNING PKINIT certificate request failed: Certificate issuance failed (TIMEOUT: None)

But the KDC was restarted at 15:49:03

Feb 29 15:49:03 master.ipa.test audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=krb5kdc comm="systemd" exe="/usr/lib/systemd/sys....

I can't be absolutely sure that it was the certmonger post-command script that did the restart but it sure looks like it.

For the second failure I don't know why the asn1 parsing failed. I copied the logged CSR and it parses just fine using openssl req.

As for the "first num too large" error, I'm not yet sure why it happens, but here is a clue:

The "first num too large" error is not from ASN.1 parsing - it arises from a2d_ASN1_OBJECT which converts string representation to internal OBJECT IDENTIFIER representation, when the first number in the OID is not 0, 1 or 2.

a2d_ASN1_OBJECT is called from OBJ_txt2obj, which is used by Certmonger in ldap_dn_to_X509_NAME(char *s) (src/csrgen-o.c:83), which is called by cm_csrgen_o_main:

283  subject = ldap_dn_to_X509_NAME(entry->cm_template_subject);                   

So, maybe there is something not quite right in the tracking request configuration...

Looking at the journal I see:

Feb 29 15:52:23 master.ipa.test certmonger[19477]: 2020-02-29 15:52:23 [23620] Setting "CERTMONGER_REQ_SUBJECT" to "O=IPA.TEST,cn=master.ipa.test" for child.

I have a hunch the lower-case cn= is the cause of the problem.

How to find out exactly which certmonger package is installed on the CI machine?

I actually see this error on my logs on my machine (f31). It doesn't seem to be ldap_dn_to_X509_NAME. Some other chain of calls must be invoking a2d_ASN1_OBJECT.

@frenaud @rcritten after further investigation I believe the error:

error:0D06407A:asn1 encoding routines:a2d_ASN1_OBJECT:first num too large

is a false alarm. The failure of a2d_ASN1_OBJECT is expected when the attribute shortname (e.g. cn) is not upper case. Certmonger then converts the string to upper case and tries again.
Unfortunately, a2d_ASN1_OBJECT logs the error and it cannot easily be suppressed.

So I would suggest ignoring the above error for now.

@ftweedal

How to find out exactly which certmonger package is installed on the CI machine?

In the logs directory there is a installed_packages dir that contains the list of all pkgs installed on controller/master/replica/client.
In this specific test certmonger-0.79.8-2.fc30.x86_64 was used.

Same issue happened in PR 4396, see logs

Also reported as #8252, hence closing as duplicate

Metadata Update from @frenaud:
- Issue close_status updated to: duplicate
- Issue status updated to: Closed (was: Open)

2 months ago

Login to comment on this ticket.

Metadata