#5334 DNSSEC key purging is not handled properly
Closed: Fixed None Opened 8 years ago by pdf.

This is not the first time I've had this occur, so I'm reporting it now. One of my zones has stopped signing, and so for resolvers that validate DNSSEC (eg Google's public resolvers), the zone is effectively gone from the Internet. Last time it happened I just resigned the zone and waited for the propagation delays, but given the time periods involved, and that it has happened again, I'd like to try and get it fixed - having zones missing for days on a semi-regular basis is not a tenable situation.

Below is the stack trace I get when ipa-dnskeysyncd tries to synchronize the zone:

Sep 30 20:04:19 dnsmaster.example.com ipa-dnskeysyncd[25797]: ipa.ipapython.dnssec.bindmgr.BINDMgr: INFO
Sep 30 20:04:19 dnsmaster.example.com ipa-dnskeysyncd[25797]: ipa.ipapython.dnssec.bindmgr.BINDMgr: INFO     Synchronizing zone example.com.
Sep 30 20:04:19 dnsmaster.example.com ipa-dnskeysyncd[25797]: ipa.ipapython.dnssec.bindmgr.BINDMgr: INFO     attrs: {'idnsseckeyref': ['pkcs11:object=hash1'], 'dn': 'cn=ZSK-20150531030418Z-hash1,cn=keys,idnsname=example.com.,cn=dns,dc=example,dc=com', 'cn': ['ZSK-20150531030418Z-hash1'], 'idnsseckeypublish': ['20150531030419Z'], 'objectclass': ['idnsSecKey'], 'idnssecalgorithm': ['RSASHA256'], 'idnsseckeyzone': ['TRUE'], 'idnsseckeycreated': ['20150531030418Z']}
Sep 30 20:04:19 dnsmaster.example.com python2[25797]: detected unhandled Python exception in '/usr/libexec/ipa/ipa-dnskeysyncd'
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: Traceback (most recent call last):
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: File "/usr/libexec/ipa/ipa-dnskeysyncd", line 112, in <module>
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: while ldap_connection.syncrepl_poll(all=1, msgid=ldap_search):
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: File "/usr/lib64/python2.7/site-packages/ldap/syncrepl.py", line 409, in syncrepl_poll
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: self.syncrepl_refreshdone()       
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: File "/usr/lib/python2.7/site-packages/ipapython/dnssec/keysyncer.py", line 115, in syncrepl_refreshdone
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: self.bindmgr.sync()               
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: File "/usr/lib/python2.7/site-packages/ipapython/dnssec/bindmgr.py", line 194, in sync
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: self.sync_zone(zone)              
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: File "/usr/lib/python2.7/site-packages/ipapython/dnssec/bindmgr.py", line 177, in sync_zone
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: self.install_key(zone, uuid, attrs, tempdir)
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: File "/usr/lib/python2.7/site-packages/ipapython/dnssec/bindmgr.py", line 113, in install_key
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: basename = ipautil.run(cmd)[0].strip()
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: File "/usr/lib/python2.7/site-packages/ipapython/ipautil.py", line 373, in run
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: raise CalledProcessError(p.returncode, arg_string, stdout)
Sep 30 20:04:20 dnsmaster.example.com ipa-dnskeysyncd[25797]: subprocess.CalledProcessError: Command ''/usr/sbin/dnssec-keyfromlabel-pkcs11' '-K' '/var/named/dyndb-ldap/ipa/master/example.com/tmpazgVEt' '-a' 'RSASHA256' '-l' 'pkcs11:object=hash1;pin-source=/var/lib/ipa/dnssec/softhsm_pin' '-P' '20150531030419' 'example.com.'' returned non-zero exit status 1
Sep 30 20:04:20 dnsmaster.example.com systemd[1]: ipa-dnskeysyncd.service: main process exited, code=exited, status=1/FAILURE
Sep 30 20:04:20 dnsmaster.example.com systemd[1]: Unit ipa-dnskeysyncd.service entered failed state.
Sep 30 20:04:20 dnsmaster.example.com systemd[1]: ipa-dnskeysyncd.service failed.

Please let me know what additional information I can provide. This is a lesser-used domain, so I have a little leeway, but I would certainly appreciate any help that might be obtained in a timely fashion so it's not gone for too long.


Hrm, this has resolved itself somehow this time, though it was out for nearly 12 hours. I'll mark this as deferred for now, until I'm in a position to trouble-shoot again (if this is not the correct course of action, please let me know or update as you see fit).

Hmm, interesting. First of all, please add line debug=True into file /etc/ipa/default.conf and restart IPA. It will give us more details when this happens next time.

When it happens, please run following commands:

sudo -u named -s /bin/bash
source /etc/sysconfig/named
export PKCS11_PROVIDER
export SOFTHSM2_CONF
python2 /usr/lib/python2.*/site-packages/ipapython/dnssec/localhsm.py
<execute the dnssec-keyfromlabel-pkcs11 command which is failing according to the log>

Thank you!

Most probably duplicate of #5348.

@pspacek this doesn't appear to be a duplicate of #5348 since mbasti notes that restarting named-pkcs11 fixes that, however doing so has no impact on this issue. It is occurring for me now, so I have a window to troubleshoot before it either resolves itself as last time, or I have to bite the bullet and replace the keys (which I had to do the time before, and is obviously very painful due to timeouts and propagation).

I can't execute the failing command usefully because it depends on the stemp dir being created by whatever other process precedes it during normal execution.

Does the localhsm.py output include sensitive information? If so, can I send it to you off-tracker?

Replying to [comment:5 pdf]:

I can't execute the failing command usefully because it depends on the stemp dir being created by whatever other process precedes it during normal execution.

Ah, I forgot about that. Just replace the path to the directory with something existing, /tmp should be okay. The generated files do not contain any sensitive information, just timestamps and key ID.

Does the localhsm.py output include sensitive information? If so, can I send it to you off-tracker?
No, there is nothing sensitive in there, just key ID and attributes. The private key will never leave 'HSM' in an unencrypted form.

Is the error popping out on DNSSEC key master? (Running opendnssec's ods-enforcered service?) Or a replica? (Without ods-enforcered service?) Or both?

Replying to [comment:6 pspacek]:

Ah, I forgot about that. Just replace the path to the directory with something existing, /tmp should be okay. The generated files do not contain any sensitive information, just timestamps and key ID.

Executing the command simply outputs:

dnssec-keyfromlabel: fatal: failed to get key failingdomain.com/RSASHA256: not found

No, there is nothing sensitive in there, just key ID and attributes. The private key will never leave 'HSM' in an unencrypted form.

Will attach momentarily.

Is the error popping out on DNSSEC key master? (Running opendnssec's ods-enforcered service?) Or a replica? (Without ods-enforcered service?) Or both?

This is occurring on the master only.

Interesting ... Could you please upload output from localhsm.py (using the same commands) also from one other replica where it works?

Thank you!

Also, please provide output generated by command ods-ksmutil key list --verbose running on DNSSEC key master as described on page
http://www.freeipa.org/page/Troubleshooting#DNS_keys_are_not_generated_by_OpenDNSSEC

Thank you.

ods-ksmutil key list --verbose
ods-ksmutil.txt

Attached, thanks for taking a look.

Thank you very much. This is really really weird because the key is present on replica but not on the master, hmm. I will try to reproduce it locally. In meantime you might try to delete LDAP object cn=ZSK-20150530170525Z-ce36572b364d1ca95962273e57c28240,cn=keys,idnsname=failingdomain.com.,cn=dns,dc=example,dc=com and restart the ipa-dnskeysyncd, it might be enough to workaround it.

The affected key should not be used for signing anyway (inactivity period started at 20150920123018Z, it is apparently the old, already rotated ZSK key) so you might be lucky enough if you do not have TTLs longer than 3 days or so.

Please let me know if it helped or not.

Looks like I had one more key in the same state, having deleted them both I believe I'm in good shape - ipa-dnskeysyncd is happy and keys appear to be tracing correctly on all domains. Thanks you greatly.

Is there anything else I can do to help track down the source of this in future?

Ticket has not been triaged.

I think that we have enough information now, thank you very much for cooperation! It seems the we do not properly handle key purging in OpenDNSSEC which happens 14 days after the key is not used anymore. Considering the fact that ZSK is rotatech each 3 months, it explains why it happens so rarely. We have to fix this.

master:

  • 9bcb988 DNSSEC: Improve error reporting from ipa-ods-exporter
  • 9ff1c0a DNSSEC: Make sure that current state in OpenDNSSEC matches key state in LDAP
  • 21e6cc6 DNSSEC: Make sure that current key state in LDAP matches key state in BIND
  • e9cdaa1 DNSSEC: remove obsolete TODO note
  • 3c9c37c DNSSEC: add debug mode to ldapkeydb.py
  • 6bdc18d DNSSEC: logging improvements in ipa-ods-exporter
  • ddf7397 DNSSEC: remove keys purged by OpenDNSSEC from master HSM from LDAP
  • 43acb99 DNSSEC: ipa-dnskeysyncd: Skip zones with old DNSSEC metadata in LDAP
  • 9fbbe3e DNSSEC: ipa-ods-exporter: add ldap-cleanup command
  • fe263f7 DNSSEC: ipa-dnskeysyncd: call ods-signer ldap-cleanup on zone removal
  • ae24627 DNSSEC: Log debug messages at log level DEBUG

ipa-4-3:

  • 86ee493 DNSSEC: Improve error reporting from ipa-ods-exporter
  • d26c9c8 DNSSEC: Make sure that current state in OpenDNSSEC matches key state in LDAP
  • 84b7092 DNSSEC: Make sure that current key state in LDAP matches key state in BIND
  • c6efac4 DNSSEC: remove obsolete TODO note
  • c527e8f DNSSEC: add debug mode to ldapkeydb.py
  • 9b4be48 DNSSEC: logging improvements in ipa-ods-exporter
  • 819e3d4 DNSSEC: remove keys purged by OpenDNSSEC from master HSM from LDAP
  • b52cb1b DNSSEC: ipa-dnskeysyncd: Skip zones with old DNSSEC metadata in LDAP
  • 5070fa1 DNSSEC: ipa-ods-exporter: add ldap-cleanup command
  • b21492f DNSSEC: ipa-dnskeysyncd: call ods-signer ldap-cleanup on zone removal
  • 2e85644 DNSSEC: Log debug messages at log level DEBUG

ipa-4-2 needs rebase - later

ipa-4-2:

  • d2022d0 DNSSEC: Improve error reporting from ipa-ods-exporter
  • cb8a958 DNSSEC: Make sure that current state in OpenDNSSEC matches key state in LDAP
  • 52369bb DNSSEC: Make sure that current key state in LDAP matches key state in BIND
  • 9beb33c DNSSEC: remove obsolete TODO note
  • aa76c60 DNSSEC: add debug mode to ldapkeydb.py
  • 3eaabd9 DNSSEC: logging improvements in ipa-ods-exporter
  • 6647349 DNSSEC: remove keys purged by OpenDNSSEC from master HSM from LDAP
  • 2e6c3b3 DNSSEC: ipa-dnskeysyncd: Skip zones with old DNSSEC metadata in LDAP
  • dfefa6d DNSSEC: ipa-ods-exporter: add ldap-cleanup command
  • ab0b5e9 DNSSEC: ipa-dnskeysyncd: call ods-signer ldap-cleanup on zone removal
  • 614d9af DNSSEC: Log debug messages at log level DEBUG

Steps to reproduce

  1. Sign a DNS zone with DNSSEC
  2. Look at list of keys associated with the DNS zone (on DNSSEC key master):

    $ ods-ksmutil key list --verbose

  3. Compare this list with all the keys in HSM on master and all replicas:

    $ python2 /usr/lib/python2.*/site-packages/ipapython/dnssec/localhsm.py

  4. Wait for key rotation to happen (see /etc/opendnssec/kasp.xml for config)

  5. Wait for key purging interval to pass (see /etc/opendnssec/kasp.xml for config)

At this point, some of previously-used keys should be purged from HSM and LDAP. Check it the deleted keys were removed from OpenDNSSEC & HSM on all the replicas:
6. Look at list of keys associated with the DNS zone (on DNSSEC key master):

$ ods-ksmutil key list --verbose
  1. Compare this list with all the keys in HSM on master and all replicas:

    $ python2 /usr/lib/python2.*/site-packages/ipapython/dnssec/localhsm.py

Metadata Update from @pdf:
- Issue assigned to pspacek
- Issue set to the milestone: FreeIPA 4.2.4

7 years ago

Login to comment on this ticket.

Metadata