systemd-networkd, systemd-timesyncd, and systemd-resolved are daemons which run as non-root users for safety. They are started in early boot, and are slated to be started in the initramfs. They create state files on disk, so the same uid/gid have to be used in the initramfs and on the real system. To make this work with host-independent initramfs images, the uid and gid have to be allocated statically, so that the values baked into the image are the same as those on every host system.
systemd-timesyncd, systemd-networkd, systemd-resolved are not yet ready for general use cases, but they are packaged as disabled by default in F21+.
Please approve systemd-timesync, systemd.network, systemd-resolve users and groups. (Note: without the "d" at the end.)
See also: https://bugzilla.redhat.com/show_bug.cgi?id=1102002
We discussed this at today's meeting (http://meetbot.fedoraproject.org/fedora-meeting-1/2014-12-18/fpc.2014-12-18-17.01.txt):
"host-independent initramfs images" means generic initramfs images, which carry no host specific information to be used universally. For example a generic PXE initramfs for multiple hosts. So, if multiple hosts are started via a PXE initramfs, the uid/gid of initramfs started services/tools must be fixed, if these tools create files in /run, /dev or /dev/shm.
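The mismatch described here can be illustrated with a minimal sketch. It assumes both the initramfs and the installed system carry passwd(5)-formatted account data; the function names and any uid values fed to it are hypothetical and are not taken from this ticket.

```python
def parse_passwd(text):
    """Map user name -> (uid, gid) from passwd(5)-formatted text."""
    ids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 4:
            continue  # malformed entry, skip it
        ids[fields[0]] = (int(fields[2]), int(fields[3]))
    return ids


def find_mismatches(initramfs_passwd, host_passwd, names):
    """Return {name: (initramfs_ids, host_ids)} for accounts whose ids differ.

    A missing account is reported as None on the side where it is absent.
    """
    initramfs = parse_passwd(initramfs_passwd)
    host = parse_passwd(host_passwd)
    mismatches = {}
    for name in names:
        in_image = initramfs.get(name)
        on_host = host.get(name)
        if in_image != on_host:
            mismatches[name] = (in_image, on_host)
    return mismatches
```

With dynamically allocated system accounts, a generic image and a given host can easily disagree, and files created under /run by the initramfs instance would then belong to the wrong (or to no) account after switchroot. A static allocation makes the result of `find_mismatches` empty by construction.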
Hmmm. I'm trying to think of a workaround but I can't. Even if the installed OS gets the UIDs from whatever the initramfs uses, that is basically the same thing as simply allocating them.
I really, really hope that this kind of thing doesn't proliferate, though.
Also, could someone clear up whether it's "systemd.network" or "systemd-network" and if it's the former, please change whatever needs to be changed to make it the latter.
Replying to [comment:4 tibbs]:
> Also, could someone clear up whether it's "systemd.network" or "systemd-network"
It's systemd-network. Sorry for the typo.
> I really, really hope that this kind of thing doesn't proliferate, though.
There's only so many daemons you want to start in the initramfs... We don't have any further plans at the moment, fwiw.
OK, that's all of the committee's questions answered. Moving this back to meeting. Due to the holidays, our next meeting will be January 8th, 17:00Z. We can still have a discussion in this ticket, of course, instead of waiting for the meeting.
For the record, I'm tending towards a reluctant +1 here, but I'll wait to see if others have anything to say that would change my mind.
Note that I think it'd be possible to chown() the shared state when transitioning from the initramfs. systemd would just need to do it very early. It wouldn't be beautiful of course, so I'm not really objecting to this.
Also worth noting that rpm-ostree/Atomic forces host-independent initramfs images always, not just for PXE.
We discussed this at this week's meeting (http://meetbot.fedoraproject.org/fedora-meeting-1/2015-01-08/fpc.2015-01-08-17.01.txt), summary is:
I gave this some thought, and discussed things with other systemd developers, especially Tom Gundersen, the person responsible for systemd-networkd. Our agreement is that static uids are the right solution. Doing a chown is technically feasible, but ugly and unpleasant, as described in detail below. systemd-timesyncd does not currently store state in the filesystem, so if the FPC wants to conserve one static uid, it might want to drop it from the list. Nevertheless, I like consistency and would prefer to have all three.
Generally, the way that those daemons would be started in the initramfs, and then transition to the real system, follows the strategy of systemd-journald. They are started in the initramfs as systemd services, run until the switchroot happens, and are restarted by the main systemd instance. This strategy works very nicely with journald, and we want to follow the same pattern.
Why can't you chown in the initramfs? First, it would be necessary to read the right uid.gid from the /sysroot/etc/passwd, and then change the ownership of the files. This is possible, but since the uids might be assigned to a different name in the initramfs, there would be a time when those files would be owned by an unrelated user. Not an issue, but not beautiful.
A bigger problem is that the services must be stopped before doing this operation. If something goes wrong, and the user is thrown into the debug shell, they will see a state where those services are not running, and cannot be started until the uids are manually restored to their old values. For example, the user might want to run systemd-networkd to have a working DHCP client in order to read some documentation.
Why not chown in the real system?
Considerations are similar. There's a time where files on disk are owned by an unrelated user and group. This means that the files should be chown'ed before any service which might use the same uids is started. So not only the services in question are affected, but basically any service started by systemd. Changing uids on disk is not something that we want PID 1 to do; it would be done by systemd-tmpfiles, but this means that initialization of other services has to be serialized after the chown operation. Possible, but brittle and ugly.
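For illustration only, the chown step being argued against might look roughly like the sketch below. This is not how systemd or systemd-tmpfiles does it; `chown_tree` is a hypothetical helper, and the target uid/gid would have to be looked up in the real system's /etc/passwd first. The `chown` parameter is injectable so the walk can be exercised without root.

```python
import os


def chown_tree(root, uid, gid, chown=os.chown):
    """Re-own `root` and everything below it; return the paths touched.

    Hypothetical sketch: symlinked directories are not descended into,
    and symlinks themselves are re-owned rather than their targets.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # os.walk visits every real directory once as `dirpath`,
        # so the directory plus its files covers the whole tree.
        paths = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in paths:
            chown(path, uid, gid, follow_symlinks=False)
            changed.append(path)
    return changed
```

The walk is not atomic: until the loop finishes, part of the tree belongs to the old uid and part to the new one, and the old uid may already map to an unrelated account. That is the window discussed in this comment, and it is why the owning services would have to be stopped and everything else ordered after the operation.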
> What kind of state/data is in the initramfs, or is it just the uid security?
systemd-networkd currently stores DHCP lease information in /run/systemd/netif/lease. Information about links as seen by networkd is stored in /run/systemd/netif/links. It is used by other systemd software to query network status. systemd-resolved will store information about servers and keep a cache of resolved queries. systemd-timesyncd so far does not store any runtime data, only a timestamp in /var, which probably would not be touched in the initramfs.
> Why do they need to be started before the system has booted?
systemd-networkd → is intended to be a general purpose network configuration tool. So the usual reasons apply: boot over NFS, iSCSI, etc. Logging of early boot over the network. Troubleshooting.
systemd-timesyncd → configuration of time for systems without RTC so they can switch into the real system with correct time, joining of kerberos domains, certificate and dnssec resolution, and other things which require precise time.
systemd-resolved → name resolution, including dnssec support.
Those three work together, and both timesyncd and resolved are "clients" of networkd and can use it to query information about the network, for example dns servers configured per link, for example if more than one dhcp lease is established, or there's a vpn connection or whatever. Having them in the initramfs is useful because together they provide full network support.
We discussed this at today's meeting (http://meetbot.fedoraproject.org/fedora-meeting-1/2015-01-15/fpc.2015-01-15-17.00.txt):
> Doing a chown is technically feasible, but ugly and unpleasant, as described in detail below. systemd-timesyncd does not currently store state in the filesystem, so if the FPC wants to conserve one static uid, it might want to drop it from the list
> Generally, the way that those daemons would be started in the initramfs, and then transition to the real system, follows the strategy of systemd-journald
> This is possible, but since the uids might be assigned to a different name in the initramfs, there would be a time when those files would be owned by an unrelated user. Not an issue, but not beautiful.
Right, this is annoying ... nobody is arguing with that. But to some extent this is the cost of doing work before you have a real root. You should also understand that the cost of allocating a static uid is very high. Having journald use one is far from perfect, but at least everyone will be using that and it's a single service so there's little problem with more and more services requiring more and more static uids.
> They are started in the initramfs as systemd services, run until the switchroot happens, and are restarted by the main systemd instance. This strategy works very nicely with journald, and we want to follow the same pattern
When you restart at the point of switch root, isn't that an ideal time to merge the uids with the host system? Also I assume you can just delete all the pre-switchroot data for resolving, and just boot a new version which will get its own copy as the correct uid?
> systemd-networkd → is intended to be a general purpose network configuration tool. So the usual reasons apply: boot over NFS, iSCSI, etc. Logging of early boot over the network. Troubleshooting
So everything flows from this, and it's hard to imagine that boot over NFS/iSCSI is such a huge feature that we need to pull out all the stops to make it as easy as possible. Both of the other features are to allow network logging, and it seems like journald should be buffering things until they can be sent across the network/whatever ... or someone needs to set up a serial console.
Again, the main worry isn't so much adding 2-3 random new static uids ... it's that if we are adding these so that boot over NFS is a little bit easier to implement, then what about when someone wants to boot over SMFS or cephfs or have avahi running so they can find some services dynamically.
Let's postpone this for 3-4 weeks. I'd like to get this resolved before F22, but it doesn't have to be now, and I have other packaging stuff.
Somehow the bar for allocating a static uid has been raised from "give one to anyone who asks" to "prove that there's no other way". In 2014 one uid was added, and one removed. I think that the worries about a flood of requests are overstated.
There are two major sources of requests for static uids: initramfs tools, and things which export stuff over the network. While the second one can grow, there's a natural bound on the first one.
Replying to [comment:12 james]:
> Right, this is annoying ... nobody is arguing with that.
No, this is not annoying, it is technically wrong. Sometimes ugly things are unavoidable, but why build a system from scratch with a race?
> When you restart at the point of switch root, isn't that an ideal time to merge the uids with the host system?
I think I answered that adequately in #comment:10 (paragraphs starting with "Why can't you chown in the initramfs?" and "Why not chown in the real system?"). If something is unclear, let me know.
> Also I assume you can just delete all the pre switchroot data for resolving, and just boot a new version which will get its own copy as the correct uid?
Yes, but this comes with a price. Obtaining a DHCP lease takes on the order of a second on a real network, and DNS queries, especially with DNSSEC enabled, take a bit of time too. On fast systems, things like bringing up the network are becoming one of the slowest operations, so it is nice to cache this if possible.
Also, if some filesystem is mounted over the network using the DHCP address, the address should not be touched. If a DHCP lease is re-requested from the server, some DHCP servers will give out a different address. This is a bug on their side, but very hard to fix. So re-requesting a DHCP lease sometimes must be avoided.
> > systemd-networkd → is intended to be a general purpose network configuration tool. So the usual reasons apply: boot over NFS, iSCSI, etc. Logging of early boot over the network. Troubleshooting
> So everything flows from this, and it's hard to imagine that boot over NFS/iscsi is such a huge feature we need to pull out all the stops to make it as easy as possible to happen. Both of the other features are to allow network logging and seem like journald should be buffering things until they can be sent across the network/whatever ... or someone needs to setup a serial console.
> Again, the main worry isn't so much adding 2-3 random new static uids ... it's that if we are adding these so that boot over NFS is a little bit easier to implement, then what about when someone wants to boot over SMFS or cephfs or have avahi running so they can find some services dynamically.
I wouldn't consider avahi a crucial part of the system. But anyway, I would think that each request should be evaluated on its own merits.
I'll try to make it to the meeting tomorrow, since I see that this ticket is on the agenda.
Well, it was not really "give one to anyone who asks": I rejected quite a few when it was my responsibility. However, I agree that the bar was much lower; basically, give it where it makes at least some sense. Personally, I think static ids are the better way (e.g. because of containers nowadays). With the soft-static ids approach, there is no drawback from the id allocation, and we now have 1000 places for system ids. On my system, I have just 16 dynamic system accounts, which is not too many.
We discussed this at the meeting today (http://meetbot.fedoraproject.org/fedora-meeting-1/2015-01-29/fpc.2015-01-29-17.01.txt):
Just to be sure: does that mean systemd-network and systemd-resolve were accepted and systemd-timesync was rejected?
Well, timesync didn't get enough votes. It could conceivably get the two additional votes as there were some committee members who were not at the meeting. I won't speculate on how likely that would be, and I won't try to summarize the various arguments. If you want us to keep it open for voting, I can do that.
No, let's keep it closed. I'll reopen if need arises.
Metadata Update from @james: - Issue assigned to tibbs