#49041 SSL fails to start when nss new db format is default
Closed: wontfix 7 years ago. Opened 7 years ago by firstyear.

Our cert and key checks look for cert7.db and cert8.db, as well as key3.db.

Fedora now ships a newer NSS, so when we start DS these checks fail, because NSS now defaults to key4.db and cert9.db. DS then touches key3.db and cert8.db, creating them as empty files. When you restart DS, your certificates magically vanish and SSL stops working.

This probably needs to be fixed for F24 asap as it does affect that version. F23 doesn't seem to be affected.


nss version?

Mine is nss-3.27.0-1.2.fc24.x86_64.

And I don't see the problem on my F24...

ls /etc/dirsrv/slapd-test/*.db

/etc/dirsrv/slapd-test/cert8.db
/etc/dirsrv/slapd-test/key3.db
/etc/dirsrv/slapd-test/secmod.db

Please see also this ticket.
https://fedorahosted.org/389/ticket/48760
NSS -- switching to the sql db

I don't think NSS will stop supporting the old BDB format any time soon...
Do you happen to have any info about it?

If you make a new database (which you have to, because we ship/create broken key3.db / cert8.db files) you end up with:

{{{
key4.db
cert9.db
pkcs11.txt
}}}

You put your certificates into these, and everything is great, so you start up Directory Server. NSS inits, reads the files and it's okay (SSL starts), but we also detect "ohh no, there is no key3.db and cert8.db!".

So Directory Server then touches these files.

When you restart Directory Server, NSS then attempts to read the key3.db and cert8.db, rather than key4.db and cert9.db. Your SSL fails to start, and you have a broken NSS database as well.

So this issue will only affect newly created instances on Fedora, and only after a restart. But it took me a while to work it out, and it was annoying to solve.

It's our bug for not detecting that we have a valid SQL format db in place; it has nothing to do with the BDB format database.
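For anyone trying to confirm they have hit this state, a minimal sketch of a check follows. It assumes the file names mentioned above and treats a zero-length key3.db or cert8.db sitting next to a key4.db/cert9.db pair as the symptom. The function name `has_stale_legacy_db` is made up for illustration and is not part of Directory Server.

```shell
# Hypothetical helper, not part of 389-ds-base: succeeds when DIR holds a
# SQL-format NSS database (key4.db and cert9.db) alongside an empty legacy
# key3.db or cert8.db, which is the state described in this ticket.
has_stale_legacy_db() {
    dir=$1
    # No SQL-format database: this particular symptom does not apply.
    if [ ! -f "$dir/key4.db" ] || [ ! -f "$dir/cert9.db" ]; then
        return 1
    fi
    for f in key3.db cert8.db; do
        # -f: the file exists; ! -s: it has zero length (it was only touched).
        if [ -f "$dir/$f" ] && [ ! -s "$dir/$f" ]; then
            return 0
        fi
    done
    return 1
}
```

Something like `has_stale_legacy_db /etc/dirsrv/slapd-instance && echo "empty legacy db files present"` would then flag an affected instance directory.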

Ok. I understood your concern...

There's no plan to change the NSS default db format yet (I got word from Bob). So, it's not urgent. But preparing for that time would be a good thing, I guess.

Instead of bumping the number, can we have key[0-9].db and cert[0-9].db checked in warn_if_no_key_file and warn_if_no_cert_file, respectively, then? I think hardcoding such a value would not be a good idea...?
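To make the pattern-based idea concrete, here is a rough shell sketch of a check that accepts any single-digit version of the key and cert files. The real checks live in C in warn_if_no_key_file and warn_if_no_cert_file; the function names below are hypothetical and only illustrate the matching rule.

```shell
# Sketch only: has_nss_file DIR PREFIX succeeds if DIR contains a file named
# PREFIX<digit>.db, e.g. key3.db or key4.db for PREFIX=key.
has_nss_file() {
    for f in "$1/$2"[0-9].db; do
        # When nothing matches, the shell leaves the pattern unexpanded,
        # so confirm that the candidate really is a file.
        if [ -f "$f" ]; then
            return 0
        fi
    done
    return 1
}

# Succeeds when both a key file and a cert file are present, whatever version.
has_nss_db() {
    has_nss_file "$1" key && has_nss_file "$1" cert
}
```

Note that this only checks that the files exist. It does not verify that the key and cert versions belong together, or that the files are valid databases.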

Replying to [comment:4 nhosoi]:

> Ok. I understood your concern...

This isn't a "concern"; it is actually happening now on Fedora 24. I found it yesterday while testing during feature development.

> There's no plan to change the NSS default db format yet (I got word from Bob). So, it's not urgent. But preparing for that time would be a good thing, I guess.
>
> Instead of bumping the number, can we have key[0-9].db and cert[0-9].db checked in warn_if_no_key_file and warn_if_no_cert_file, respectively, then? I think hardcoding such a value would not be a good idea...?

I think we shouldn't check them at all; it's silly. We should leave that to NSS. We should ask "NSS, do you have cert / key files?" and it says yes or no.

Let's stop messing about where we don't belong, and getting it wrong.

This patch is a temporary fix for this.

I still don't get it.

The default format of NSS database files is still BDB so certutil -N will generate key3/cert8 db files, not key4/cert9. How did you generate those?

If the shipped files are broken, why not just fix the broken files rather than making code changes? As Noriko points out, you'll accept the sqlite database formats without actually supporting them. Why detect something you don't yet support? Though you seem to suggest that it actually does start up OK, which is also confusing.

As for the detection code, that likely traces back to a support case sometime in the very distant past, given the cert7 reference.

I still don't understand, either...

The NSS key/cert db files are NOT wiped out due to the create_certdb flag change... The flag is used just for updating the db file's mode to 0600.

If they were wiped out, was it done in the NSS API, since DS is expecting the BDB format but the existing files are not in that format?
{{{
secStatus = NSS_Initialize(certdir, NULL, NULL, "secmod.db", nssFlags);
}}}

Right, you make a new server. You get:

{{{
ls /etc/dirsrv/slapd-instance/
dse.ldif
}}}

So you start that up, with nsslapd-secure: off

{{{
ls /etc/dirsrv/slapd-instance/
dse.ldif
key3.db <<-- BROKEN
cert8.db <<-- BROKEN
}}}

Still all good, even though you have broken files. Why? DS touches them in place, and they are blank.

Anyway, as any good sysadmin does, you remove the broken files.

{{{
rm key3.db cert8.db
}}}

And you make a new database.

{{{
certutil -N -d . .....
}}}

{{{
ls /etc/dirsrv/slapd-instance/
dse.ldif
key4.db <<-- Good!
cert9.db <<-- Good!
}}}

You add your certs with whatever magic you need.

{{{
ls /etc/dirsrv/slapd-instance/
dse.ldif
key4.db <<-- Good with certs!
cert9.db <<-- Good with certs!
}}}

Cool, so now you start DS, and it reads from NSS and starts with SSL which is good. BUT!!!!!!!

Because it cannot find key3 and cert8 it touches them again ....

{{{
ls /etc/dirsrv/slapd-instance/
dse.ldif
key4.db <<-- Good with certs!
cert9.db <<-- Good with certs!
key3.db <<-- BROKEN
cert8.db <<-- BROKEN
}}}

Now you restart DS and it reads from key3.db FIRST which is broken so SSL fails to start. It NEVER ATTEMPTS to use key4.db.

The issue is NOT with NSS.
The issue is DS should NOT be touching these files, or messing with them, because we get it wrong.

My fix checks if one or the other database exists and prevents the touch if one of them is true.

The proper fix is that we should not touch any of these files at all, and just let NSS do its job.
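The interim check described above (skip the touch when either database format is present) could be sketched in shell roughly as follows. This is not the code from the patch, which is C inside the server. It assumes the "touch" amounts to creating empty key3.db and cert8.db files, and both function names are hypothetical.

```shell
# Hypothetical sketch of the guard; not the code from the patch.
# Succeeds if either the legacy (BDB) pair or the SQL-format pair exists.
nss_db_exists() {
    if [ -f "$1/key3.db" ] && [ -f "$1/cert8.db" ]; then
        return 0
    fi
    if [ -f "$1/key4.db" ] && [ -f "$1/cert9.db" ]; then
        return 0
    fi
    return 1
}

# Create the empty legacy files only when no database of either format exists.
touch_legacy_db_if_missing() {
    if nss_db_exists "$1"; then
        return 0
    fi
    touch "$1/key3.db" "$1/cert8.db"
}
```

With this guard, a directory that already holds key4.db and cert9.db is left alone, so a restart no longer finds empty key3.db and cert8.db files next to the real database.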

Per weekly triage: 1.3.7

Would it be possible to provide a reproducer (or steps to reproduce)?

Metadata Update from @nhosoi:
- Issue assigned to firstyear
- Issue set to the milestone: 1.3.7 backlog

7 years ago

Set NSS_DEFAULT_DB_TYPE=sql in your environment, and run any test :)

Metadata Update from @firstyear:
- Custom field reviewstatus adjusted to new (was: review?)
- Issue close_status updated to: None

7 years ago

Metadata Update from @firstyear:
- Custom field reviewstatus adjusted to review (was: new)

7 years ago

Metadata Update from @mreynolds:
- Custom field reviewstatus adjusted to ack (was: review)

7 years ago

commit f6ec67e
To ssh://git@pagure.io/389-ds-base.git
f56a927..f6ec67e master -> master

Metadata Update from @firstyear:
- Issue close_status updated to: fixed
- Issue status updated to: Closed (was: Open)

7 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/2100

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: fixed)

4 years ago
