#2878 Missing failure resumption detection and audit event logging at startup
Closed: fixed 7 years ago Opened 7 years ago by cfu.

When the system resumes after a failure, it should log an audit event that records the failure type.
I have tentatively identified the following shutdown conditions:

  • shutdown by the administrator (graceful shutdown)
  • shutdown due to power failure of the system that Dogtag runs on
  • shutdown due to inaccessibility to the HSM (possibly power failure of the HSM)
    A "crumb" mechanism currently detects signing failures and triggers a shutdown.
  • self tests failed (during system startup)
  • writing of an audit record fails (graceful shutdown)

We could possibly expand on the "crumb" mechanism to record detectable failures.
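
The idea of expanding the "crumb" to record detectable failures could look roughly like the sketch below. It is only an illustration and not Dogtag's actual crumb code: the class name CrumbSketch, the file location, and the reason strings are all hypothetical. The marker is written at startup, overwritten with a reason when a failure is detected, and deleted on graceful shutdown, so a marker that is still present at the next startup indicates an unclean stop.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical sketch of a failure "crumb"; not Dogtag's implementation. */
final class CrumbSketch {
    private final Path crumb;

    CrumbSketch(Path crumb) {
        this.crumb = crumb;
    }

    /**
     * Called at startup. Returns the reason left behind by the previous run,
     * or empty if the previous run shut down gracefully (no crumb on disk).
     */
    Optional<String> startup() throws IOException {
        Optional<String> previous = Optional.empty();
        if (Files.exists(crumb)) {
            byte[] bytes = Files.readAllBytes(crumb);
            previous = Optional.of(new String(bytes, StandardCharsets.UTF_8).trim());
        }
        // "UNKNOWN" stays on disk if the process dies without recording a
        // reason, which is what a power failure would look like.
        Files.write(crumb, "UNKNOWN".getBytes(StandardCharsets.UTF_8));
        return previous;
    }

    /** Called when a detectable failure (signing, audit log, self test) occurs. */
    void recordFailure(String reason) throws IOException {
        Files.write(crumb, reason.getBytes(StandardCharsets.UTF_8));
    }

    /** Called on administrator-initiated shutdown: no crumb means a clean stop. */
    void gracefulShutdown() throws IOException {
        Files.deleteIfExists(crumb);
    }
}
```

With something like this, the startup code could turn a non-empty result of startup() into the requested audit event, while a result of "UNKNOWN" would cover the power-failure case, where nothing could be recorded before the stop.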


Metadata Update from @cfu:
- Custom field component adjusted to None
- Custom field feature adjusted to None
- Custom field origin adjusted to None
- Custom field proposedmilestone adjusted to None
- Custom field proposedpriority adjusted to None
- Custom field reviewer adjusted to None
- Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1522938
- Custom field type adjusted to None
- Custom field version adjusted to None

7 years ago

Metadata Update from @mharmsen:
- Issue assigned to jmagne

7 years ago

checkins:

commit 268cc70782b517c17439a17a5036f9f51182b650 (HEAD -> master, origin/master, origin/HEAD)
Author: Jack Magne jmagne@redhat.com
Date: Thu Feb 1 14:58:30 2018 -0800

Fix Bug 1522938 - CC: Missing failure resumption detection and audit event logging at startup

This patch addressed two cases listed in the bug:

1. Signing Failure due to bad HSM connection.
2. Audit log failure of some kind.

I felt the best and safest way to handle these conditions was to simply write to the
error console, which results in a simple System.err.println being sent to the former
catalina.out file now covered with the journalctl command.

I considered using some other Dogtag log file, but in an emergency or
resource-constrained situation it is best to write the log as simply as possible.

Quick testing instructions:

1. To see signing failure put this in the CS.cfg for ONLY testing purposes.

ca.signing.testSignatureFailure=true   , This will force an error when trying to sign and log it.

 Approve a certificate request, which will trigger a signing operation.
2. Check the journalctl for a log message.

3. Remove the config value to resume normal operation.

4. To see an audit log failure do the following:

[root@localhost signedAudit]# ps -fe | grep pki
pkiuser   8456     1  2 14:39 ?        00:00:32 /usr/lib/jvm/jre-1.8.0-openjdk/bin/java

lsof /var/lib/pki/pki-tomcat/ca/logs/signedAudit/ca_audit
java    9905 pkiuser  124u   REG  253,0    17298 3016784 /var/log/pki/pki-tomcat/ca/signedAudit/ca_audit

gdb /usr/lib/jvm/jre-1.8.0-openjdk/bin/java 8456   , Use the pid from above

Inside gdb do this:

call close(124)

This will close the file descriptor for the running server.

5. Now try to do anything with the CS UI and observe errors written to the journalctl log
about not being able to write to the ca_audit file. If signed audit logging is configured,
many of these conditions will result in the shutdown of the server.

Change-Id: I21c62a5ad6bedfe8678144a764bff2e2a4716dce
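
The commit message above says the failure is reported with a plain System.err.println so that it reaches the journal even when other log files are unusable. A minimal sketch of that idea follows. The class name, the method names, and the "CRITICAL:" message format are assumptions made for illustration and are not taken from the patch itself.

```java
import java.io.PrintStream;

/** Hypothetical sketch of last-resort failure reporting; not the code from the patch. */
final class EmergencyLogSketch {

    /** Builds a single-line message so it stays readable in journalctl output. */
    static String format(String failureType, String detail) {
        return "CRITICAL: " + failureType + ": " + detail;
    }

    /** Writes the message to the given stream and flushes it immediately. */
    static void report(PrintStream out, String failureType, String detail) {
        out.println(format(failureType, detail));
        out.flush();
    }

    public static void main(String[] args) {
        // Standard error of the Tomcat process is what ends up in the journal.
        report(System.err, "SIGNING_FAILURE", "test signature failed; HSM may be unreachable");
    }
}
```

Taking the PrintStream as a parameter is only a convenience for testing; it avoids any dependency on files, buffers, or log rotation, which is the point made in the commit message about emergency situations.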

and:

commit cdfe6f3e5a29fa061a0e6b6fb599dcddc19984c3 (HEAD -> DOGTAG_10_5_BRANCH, origin/DOGTAG_10_5_BRANCH)
Author: Jack Magne jmagne@redhat.com
Date: Thu Feb 1 14:58:30 2018 -0800

Fix Bug 1522938 - CC: Missing failure resumption detection and audit event logging at startup

This patch addressed two cases listed in the bug:

1. Signing Failure due to bad HSM connection.
2. Audit log failure of some kind.

I felt the best and safest way to handle these conditions was to simply write to the
error console, which results in a simple System.err.println being sent to the former
catalina.out file now covered with the journalctl command.

I considered using some other Dogtag log file, but in an emergency or
resource-constrained situation it is best to write the log as simply as possible.

Quick testing instructions:

1. To see signing failure put this in the CS.cfg for ONLY testing purposes.

ca.signing.testSignatureFailure=true   , This will force an error when trying to sign and log it.

 Approve a certificate request, which will trigger a signing operation.
2. Check the journalctl for a log message.

3. Remove the config value to resume normal operation.

4. To see an audit log failure do the following:

[root@localhost signedAudit]# ps -fe | grep pki
pkiuser   8456     1  2 14:39 ?        00:00:32 /usr/lib/jvm/jre-1.8.0-openjdk/bin/java

lsof /var/lib/pki/pki-tomcat/ca/logs/signedAudit/ca_audit
java    9905 pkiuser  124u   REG  253,0    17298 3016784 /var/log/pki/pki-tomcat/ca/signedAudit/ca_audit

gdb /usr/lib/jvm/jre-1.8.0-openjdk/bin/java 8456   , Use the pid from above

Inside gdb do this:

call close(124)

This will close the file descriptor for the running server.

5. Now try to do anything with the CS UI and observe errors written to the journalctl log
about not being able to write to the ca_audit file. If signed audit logging is configured,
many of these conditions will result in the shutdown of the server.

Change-Id: I21c62a5ad6bedfe8678144a764bff2e2a4716dce
(cherry picked from commit 268cc70782b517c17439a17a5036f9f51182b650)

Metadata Update from @jmagne:
- Issue close_status updated to: fixed
- Issue set to the milestone: 10.5.5 (was: 10.5)
- Issue status updated to: Closed (was: Open)

7 years ago

Checkins:

commit 268cc70782b517c17439a17a5036f9f51182b650 (HEAD -> master, origin/master, origin/HEAD)
Author: Jack Magne jmagne@redhat.com
Date: Thu Feb 1 14:58:30 2018 -0800

Fix Bug 1522938 - CC: Missing failure resumption detection and audit event logging at startup

This patch addressed two cases listed in the bug:

1. Signing Failure due to bad HSM connection.
2. Audit log failure of some kind.

I felt the best and safest way to handle these conditions was to simply write to the
error console, which results in a simple System.err.println being sent to the former
catalina.out file now covered with the journalctl command.

I considered using some other Dogtag log file, but in an emergency or
resource-constrained situation it is best to write the log as simply as possible.

and:

commit cdfe6f3e5a29fa061a0e6b6fb599dcddc19984c3 (HEAD -> DOGTAG_10_5_BRANCH, origin/DOGTAG_10_5_BRANCH)
Author: Jack Magne jmagne@redhat.com
Date: Thu Feb 1 14:58:30 2018 -0800

Fix Bug 1522938 - CC: Missing failure resumption detection and audit event logging at startup

This patch addressed two cases listed in the bug:

1. Signing Failure due to bad HSM connection.
2. Audit log failure of some kind.

I felt the best and safest way to handle these conditions was to simply write to the
error console, which results in a simple System.err.println being sent to the former
catalina.out file now covered with the journalctl command.

I considered using some other Dogtag log file, but in an emergency or
resource-constrained situation it is best to write the log as simply as possible.

Quick testing instructions:

1. To see signing failure put this in the CS.cfg for ONLY testing purposes.

ca.signing.testSignatureFailure=true   , This will force an error when trying to sign and log it.

 Approve a certificate request, which will trigger a signing operation.
2. Check the journalctl for a log message.

3. Remove the config value to resume normal operation.

4. To see an audit log failure do the following:

[root@localhost signedAudit]# ps -fe | grep pki
pkiuser   8456     1  2 14:39 ?        00:00:32 /usr/lib/jvm/jre-1.8.0-openjdk/bin/java

lsof /var/lib/pki/pki-tomcat/ca/logs/signedAudit/ca_audit
java    9905 pkiuser  124u   REG  253,0    17298 3016784 /var/log/pki/pki-tomcat/ca/signedAudit/ca_audit

gdb /usr/lib/jvm/jre-1.8.0-openjdk/bin/java 8456   , Use the pid from above

Inside gdb do this:

call close(124)

This will close the file descriptor for the running server.

5. Now try to do anything with the CS UI and observe errors written to the journalctl log
about not being able to write to the ca_audit file. If signed audit logging is configured,
many of these conditions will result in the shutdown of the server.

Change-Id: I21c62a5ad6bedfe8678144a764bff2e2a4716dce
(cherry picked from commit 268cc70782b517c17439a17a5036f9f51182b650)

Status: ASSIGNED → POST

Metadata Update from @mharmsen:
- Custom field fixedinversion adjusted to pki-core-10.5.5-1.fc27

7 years ago

Dogtag PKI is moving from Pagure issues to GitHub issues. This means that existing or new
issues will be reported and tracked through Dogtag PKI's GitHub Issue tracker.

This issue has been cloned to GitHub and is available here:
https://github.com/dogtagpki/pki/issues/2997

If you want to receive further updates on the issue, please navigate to the
GitHub issue and click the Subscribe button.

Thank you for understanding, and we apologize for any inconvenience.
