#9852 DNS Mini-initiative
Opened 3 months ago by smooge. Modified 2 months ago

Describe what you would like us to do:

The DNS system needs some restructuring because of limitations found in the versions currently in use.

  1. The version of BIND in EL7 and EL8 does not allow a large number of TCP connections. However, Fedora's versions of various tools try TCP by default and use UDP only on request, so dig and other commands fail against our servers. The recommendation is to move the DNS servers to Fedora 33 and keep them updated as new releases come out.
  2. The default TTL for the main fedoraproject.org zone is short, which means clients re-query more often and want more TCP connections over time. The various DNS maintainers say we should look at raising $TTL and setting short TTLs only on the records we expect to be short-lived (wildcard and NS records).
  3. The DNS git repo is 4.4 GB in size for 60 MB of files. We need to look at cleaning up the history, as most of those 4.4 GB are signed DNS records whose history we do not really need to keep.
    [External recommendation: set up a hidden master DNS server in IAD2 that holds the keys, and use zone transfers to the DNS servers that need the data.]
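Before rewriting any history for item 3, it may help to confirm where the space goes. The following is only a sketch: the function name dns_repo_sizes is hypothetical, and the default path ~/dns is taken from the dnscommit function quoted later in this ticket.

```shell
# Sketch only: report how much of the checkout is git history.
# dns_repo_sizes is a hypothetical helper name; ~/dns is an assumed default path.
dns_repo_sizes() {
    local repo="${1:-$HOME/dns}"
    echo "whole checkout:"
    du -sh "$repo"
    echo "history only (.git):"
    du -sh "$repo/.git"
    echo "object store details:"
    # -v prints pack statistics, -H makes the sizes human-readable
    git -C "$repo" count-objects -vH
}
```

Usage would be something like `dns_repo_sizes ~/dns`. If most of the size sits in `.git`, that supports the idea that the committed signed output is what inflates the repository.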

When do you need this to be done by? (YYYY/MM/DD)

To be clear, TCP will become less of an issue once https://pagure.io/fedora-infrastructure/issue/9422 is done. The new algorithm has even smaller signatures than the old one.

The problem is getting there because the transition state has two signatures, making replies much bigger and requiring TCP.

As for TTLs, https://00f.net/2019/11/03/stop-using-low-dns-ttls/ sums up the problem. Obviously it's always a compromise.

Maybe even some of the potentially dynamic records can do with longer TTLs. E.g. NS records you mentioned are used exclusively by DNS resolvers, and resolvers are well equipped to handle non-responsive servers and do fallback as needed.

This is not to say stale data should be kept indefinitely, I'm just trying to find a middle ground. I would have to know more about your load balancing/proxying to provide a more useful recommendation. Ping me if you are interested.
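To find a middle ground it helps to see which records already carry short TTLs. Here is a rough sketch of such an audit. It assumes every record line is written as "name ttl IN type rdata" with the TTL in plain seconds, which may not match the real zone templates. The function name short_ttl_records and the 300-second default are made up for illustration.

```shell
# Sketch only: print the zone default TTL and every record whose explicit
# TTL is below a limit. Assumes lines look like "name ttl IN type rdata"
# with the TTL in plain seconds; other layouts are silently skipped.
short_ttl_records() {
    local zone="$1"
    local limit="${2:-300}"    # hypothetical default threshold, in seconds
    awk -v limit="$limit" '
        /^[ \t]*;/                         { next }                    # comment line
        $1 == "$TTL"                       { print "default", $2; next }
        $2 ~ /^[0-9]+$/ && $2 + 0 < limit  { print $1, $2, $4 }        # name ttl type
    ' "$zone"
}
```

Running `short_ttl_records fedoraproject.org.zone 300` (the file name is a placeholder) would list the default followed by one "name ttl type" line per short-lived record.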

Metadata Update from @mohanboddu:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: low-gain, medium-trouble, mini-initiative, ops

3 months ago

It seems pointless to me to keep signed zone files in history. Make sure the unsigned zone data is stored along with all keys in use, but not the constantly changing signatures themselves. ldns-read-zone -s can be used to drop DNSSEC data from a zone. Inline signing could be used on the server to create the signatures; just make sure it has enough entropy. Alternatively, keep a deployment script that creates the signed zone from the unsigned zone and the keys.
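As a minimal sketch of the ldns-read-zone suggestion, a small wrapper could write an unsigned copy of a signed zone so that only the unsigned data is committed. The function name strip_signatures and both file arguments are hypothetical; only the `-s` option comes from the comment above.

```shell
# Sketch only: produce an unsigned copy of a signed zone file.
# strip_signatures is a hypothetical helper; requires ldns-read-zone from ldns.
strip_signatures() {
    local signed="$1"
    local unsigned="$2"
    if ! command -v ldns-read-zone >/dev/null 2>&1; then
        echo "ldns-read-zone not found" >&2
        return 1
    fi
    # -s strips DNSSEC data from the zone that is read
    ldns-read-zone -s "$signed" > "$unsigned"
}
```

A call such as `strip_signatures built/example.zone.signed example.zone` (placeholder paths) would then leave the stripped file as the one worth tracking in git.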

The current infrastructure for making DNS work is the following:

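An admin makes changes to the files in the DNS repository, then runs a shell function like the following:

dnscommit ()
{
    # Commit the admin's changes, using the first argument as the message.
    local args="$1";
    cd ~/dns || return 1;
    git commit -a -m "${args}";
    # Rebuild and sign the zones, then commit and push the result.
    git pull --rebase && ./do-domains && git add built && git commit -a -m "Signed DNS" && git push
}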

The do-domains script does some checking, runs the jinja2 templates on the various files, and then generates all signatures (though not correctly with 2 sets of keys). Private keys are stored on batcave in a limited-use location.

Once named-checkzone says it's OK, the commits happen, and there are git triggers which do some further work.

Each of the Fedora DNS servers does a regular checkout of the git repository on batcave to get an updated tree. If there was a git update, named reloads to pick up the changes. A force push can be done from batcave, which forces the git update on each client. Changing this will need either parallel tooling or a refactoring of the current system.
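For anyone refactoring this, the per-server behaviour described above could be approximated roughly as follows. This is not the actual tooling: the function name sync_zones and the /srv/dns default path are assumptions, and the real servers may trigger the reload differently.

```shell
# Sketch only: pull the zone repository and reload named if HEAD moved.
# sync_zones and the /srv/dns default are hypothetical, not the real setup.
sync_zones() {
    local repo="${1:-/srv/dns}"
    local before after
    before=$(git -C "$repo" rev-parse HEAD) || return 1
    git -C "$repo" pull --quiet --ff-only || return 1
    after=$(git -C "$repo" rev-parse HEAD) || return 1
    if [ "$before" != "$after" ]; then
        rndc reload          # ask named to re-read its zone files
    else
        echo "no change"
    fi
}
```

Run from cron or a timer, this skips the reload when nothing changed. A hidden-master design with zone transfers would replace this pull step entirely.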


Boards 2
ops Status: Backlog
mini-initative Status: Backlog