#816 pki-tomcat cannot be started after installation of ipa replica with ca
Closed: Fixed None Opened 5 years ago by lslebodn.

I tried to install ipa replica (from fedora 19 -> fedora 20)
with command
ipa-replica-install --setup-ca --setup-dns --no-forwarders /var/lib/ipa/replica-info-vm-171.idm.lab.example.com.gpg
but installation of replica failed, because certificate server instance could not be started

Configuring certificate server (pki-tomcatd): Estimated time 3 minutes 30 seconds
  [1/19]: creating certificate server user
  [2/19]: configuring certificate server instance
  [3/19]: stopping certificate server instance to update CS.cfg
  [4/19]: disabling nonces
  [5/19]: set up CRL publishing
  [6/19]: starting certificate server instance

Your system may be partly configured.
Run /usr/sbin/ipa-server-install --uninstall to clean up.

CA did not start in 240s

I think that the most important part from log files is:

bash# cat pki/pki-tomcat/ca/system
0.localhost-startStop-1 - [11/Dec/2013:17:42:55 CET] [3] [3] Cannot build CA chain. Error java.security.cert.CertificateException: Certificate is not a PKCS #11 certificate
0.localhost-startStop-1 - [11/Dec/2013:17:42:56 CET] [13] [3] authz instance DirAclAuthz initialization failed and skipped, error=Property internaldb.ldapconn.port missing value

Version of installed software:

bash# rpm -q freeipa-server pki-ca
freeipa-server-3.3.3-2.fc20.x86_64
pki-ca-10.1.0-1.fc20.noarch

Additional informations:
I was able to install ipa replica with ca (from fedora 19 -> fedora 19)

bash# rpm -q freeipa-server pki-ca
freeipa-server-3.3.3-2.fc19.x86_64
pki-ca-10.0.6-1.fc19.noarch

I will attach all log files and I can provide any other information to fix this bug.


All log files from directory /var/log/pki
pki_log.tar.gz

Do you need any other information?

I was unable to install replica between two VM with fully updated fedora 20

Just to confirm. You started with a brand new f20 freeipa instance, and then tried to clone a new freeipa instance (and it failed to restart). Is this something that is consistently repeatable?

According to the logs, it appears that the cloning operation is successful and that a new clone instance is created, but it then fails to restart because one of the self tests for the system certs fails.

Once the installation fails, you should still have a dogtag instance up on the clone. In this case, try to restart it manually.

systemctl restart ​pki-tomcatd@pki-tomcat.service

Also, what is the output of :

certutil -L -d /var/lib/pki/pki-tomcat/alias

and please attach /etc/pki/pki-tomcat/ca/CS.cfg

Thanks, Ade

Whats also useful to see is the catalina.out output for the server, which is written to the system journal.
To get this: journalctl -u pki-tomcatd@pki-tomcat.service

Also, once the install fails - and if you are unable to start the service -
then it may make sense to see if the server cert really is the problem.

You can do this by removing the "sslserver" bit from ca.cert.list in /etc/pki/pki-tomcat/ca/CS.cfg
So, replace :
ca.cert.list=signing,ocsp_signing,sslserver,subsystem,audit_signing
with:
ca.cert.list=signing,ocsp_signing,subsystem,audit_signing

This will disable the cert verification self test on startup. Then try to start manually.

You can confirm if the CA is up by doing a :
pki cert-find

Replying to [comment:4 vakwetu]:

Just to confirm. You started with a brand new f20 freeipa instance, and then tried to clone a new freeipa instance (and it failed to restart). Is this something that is consistently repeatable?

No, I want to create replica of IPA, because VM was created from old template and we want to remove get rid of old VMs.

I was not able to create replica from fedora19->fedora20. So, I decided to upgrade IPA master to fedora 20 and create replica from fedora20->fedora20

Yes, it is consistently repeatable.

According to the logs, it appears that the cloning operation is successful and that a new clone instance is created, but it then fails to restart because one of the self tests for the system certs fails.

Once the installation fails, you should still have a dogtag instance up on the clone. In this case, try to restart it manually.

systemctl restart ​pki-tomcatd@pki-tomcat.service

Yes, this works, but process of creating repoica is not finished and I don't want to do others steps myself.

Configuring certificate server (pki-tomcatd): Estimated time 3 minutes 30 seconds
  [1/19]: creating certificate server user
  [2/19]: configuring certificate server instance
  [3/19]: stopping certificate server instance to update CS.cfg
  [4/19]: disabling nonces
  [5/19]: set up CRL publishing
  [6/19]: starting certificate server instance

Your system may be partly configured.
Run /usr/sbin/ipa-server-install --uninstall to clean up.

CA did not start in 300s

Also, what is the output of :

certutil -L -d /var/lib/pki/pki-tomcat/alias

sh-4.2# certutil -L -d /var/lib/pki/pki-tomcat/alias

Certificate Nickname                                         Trust Attributes
                                                             SSL,S/MIME,JAR/XPI

caSigningCert cert-pki-ca                                    CTu,u,u
auditSigningCert cert-pki-ca                                 u,u,Pu
Server-Cert cert-pki-ca                                      u,u,u
ocspSigningCert cert-pki-ca                                  u,u,u
subsystemCert cert-pki-ca                                    u,u,u

and please attach /etc/pki/pki-tomcat/ca/CS.cfg

Thanks, Ade

journalctl -u pki-tomcatd@pki-tomcat.service > "pki-tomcatd@pki-tomcat.service.log"
pki-tomcatd@pki-tomcat.service.log

I changed line in CA.cfg
ca.cert.list=signing,ocsp_signing,sslserver,subsystem,audit_signing with: ca.cert.list=signing,ocsp_signing,subsystem,audit_signing

I restarted service and "pki-tomcatd@pki-tomcat.service" is active and running.

But If I try too see status of service via web
http://vm-140.idm.lab:8080/ca/admin/ca/getStatus
I can see

HTTP Status 500 - CS server is not ready to serve.

type Exception report

message CS server is not ready to serve.

description The server encountered an internal error that prevented it from fulfilling this request.

exception

java.io.IOException: CS server is not ready to serve.
    com.netscape.cms.servlet.base.CMSServlet.service(CMSServlet.java:443)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    java.lang.reflect.Method.invoke(Method.java:606)
    org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:277)
    org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:274)
    java.security.AccessController.doPrivileged(Native Method)
    javax.security.auth.Subject.doAsPrivileged(Subject.java:536)
    org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:309)
    org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:169)

note The full stack trace of the root cause is available in the Apache Tomcat/7.0.47 logs.

Ok - thanks for the info.

So, the correct way for us to reproduce this :
1) install freeipa on f19
2) update to f20
3) create a freeipa replica

We'll try to reproduce this - and hopefully will run into your issue. We'll update the ticket as soon as we have results.

Thanks for the patience.

Just a note, the problem can be reproduced by installing an F19 master and F20 replica. F19 replica works fine.

Endi, please provide the SSL server cert. Base64 encoding will do. thanks.

Just a preliminary report of my finding so far.

The Dogtag clone on F20 has a bad SSL server cert, just as the selftest reported.
The reason why it is bad is the following: The SSL server cert's issuer DN is encoded differently from the subject DN encoding of the CA signing cert.
Here is how the CA's subject DN is encoded (note the printable String):
C-Sequence (69)
C-Set (35)
C-Sequence (33)
Object Identifier (3)
2 5 4 10 (X520 Organization Name)
Printable String (26)
"IDM.LAB.ENG.BRQ.REDHAT.COM"
C-Set (30)
C-Sequence (28)
Object Identifier (3)
2 5 4 3 (X520 Common Name)
Printable String (21)
"Certificate Authority"
Here is how the clone's SSL server cert's issuer DN is encoded (Note the UTF8String)
C-Sequence (69)
C-Set (35)
C-Sequence (33)
Object Identifier (3)
2 5 4 10 (X520 Organization Name)
UTF8 String (26)
49 44 4d 2e 4c 41 42 2e 45 4e 47 2e 42 52 51 2e 52 45
44 48 41 54 2e 43 4f 4d
C-Set (30)
C-Sequence (28)
Object Identifier (3)
2 5 4 3 (X520 Common Name)
UTF8 String (21)
43 65 72 74 69 66 69 63 61 74 65 20 41 75 74 68 6f 72
69 74 79
They will appear to look the same in the certs, but see how the encoding are different. When that happens, certificate verification (specifically when building the trust chain) in NSS will fail.
That's why your clone server won't start.

As an experiment, I asked Endi to issue a new SSL server cert from the master ca (instead of the clone), and then import into the clone ca, then the clone ca started.
At one point, I suspected that even if such clone ca (with a new SSL cert issued from the master) might not issue certs correctly (given the wrong initial SSL server cert issuance), but another experiment proved that it actually did. Once the clone was up, I was able to verify successfully a new ssl server cert issued by this clone.

Currently I am asking Endi to try out a possible workaround in the CS.cfg. He will let us know if it works.

As to why it happened, I suspect it has to do with https://fedorahosted.org/pki/ticket/448, which would cause the default encoding to change from PrintableString to UTF8String. However, regardless of the change, the CA should never change its issuer DN encoding, so I am sure there is a bug in the CA.

Just to confirm, I added the following line to /usr/share/pki/ca/conf/CS.cfg before running ipa-replica-install on F20 per Christina's instruction, and the replica seems to be able to start and work properly.

X500Name.directoryStringEncodingOrder=PrintableString,UTF8String,T61String,BMPString,UniversalString

The replica's SSL server cert's issuer DN is encoded with PrintableString.

just for the record. I filed a ticket against Dogtag:
https://fedorahosted.org/pki/ticket/855 imported CA signing cert could cause violation of subject DN encoding rule during system cert generation at installation

I did some additional tests: http://pki.fedoraproject.org/wiki/Cloning_Tests.
It looks like the problem only affects the initial installation (before applying the workaround above).

I have shared the following two possible solutions (and my preference) with the CS team:

  1. (not permanent) automate the workaround (X500Name.directoryStringEncodingOrder param in CS.cfg) into (IPA/Dogtag) installation script(s). Please note that this is not a permanent solution as it makes assumption about what the CA signing cert's encoding was.

  2. (permanent and preferred) fix in Dogtag so that in the cloning procedure, submit the clone's SSL server cert CSR to the master CA instead. This will guarantee the correct encodings, and totally avoid asking the clone ca to issue certs in the time period when the clone is still unstable.

Choice could be based on how much time each solution will take to complete and how urgent IPA needs the fix.

I would prefer to have some fix in Fedora 20 soon as currently users on Fedora 20 simply cannot install clones... Like in this thread on freeipa-users: http://www.redhat.com/archives/freeipa-users/2014-February/msg00084.html

Cloned pki-tomcat cannot be started after installation of ipa replica with ca (GUI) ticket from this ticket to separate out the GUI installation issues.

Also, as a part of fixing this ticket, we will change the default value of 'pki_clone_replicate_schema' from 'True' to 'False' in order to avoid schema replication issues which occur on the clone-side (although this will not affect IPA since they override setting this variable to 'False'). This issue will be re-addressed in the cloned ticket.

DOGTAG_10_1_BRANCH:

  • e45bde1109190743a40516af88b1262f532d1bdd

master:

  • 81af8ae1230e8447e7cc0d24ea9a9b8dadf6c08d

Metadata Update from @lslebodn:
- Issue assigned to mharmsen
- Issue set to the milestone: 10.1.1

2 years ago

Login to comment on this ticket.

Metadata