#7400 Rethink how we handle networking config with createImage (oz / ImageFactory)
Closed: Fixed 5 months ago by humaton. Opened 6 years ago by adamwill.

I spent a bit of time digging into the issues around https://pagure.io/fedora-kickstarts/pull-request/366 this morning, and I figure we can improve this area more generally, but I don't think the optimal path is completely obvious, so I thought I'd file an issue for discussion before filing any PRs.

As things stand, I believe, for all createImage tasks, we wind up with oz running the installer in a VM such that a single network interface is brought up, using a udev 'persistent' name. anaconda then writes out an ifcfg file for that 'persistent' name - e.g. ifcfg-ens3 or ifcfg-ens0p3 or something - to the installed system.

If you poke about in fedora-kickstarts a bit, you'll find that five kickstarts then do stuff to try and 'clean this up'. We can ignore fedora-cloud-bigdata and fedora-cloud-experimental, I think, so we're left with fedora-atomic, which does this:

bootloader --timeout=1 --append="no_timer_check console=tty1 console=ttyS0,115200n8 console=ttyAMA0 console=hvc0 net.ifnames=0"
network --bootproto=dhcp --device=link --activate --onboot=on

....

# Remove any persistent NIC rules generated by udev
rm -vf /etc/udev/rules.d/*persistent-net*.rules
# And ensure that we will do DHCP on eth0 on startup
cat > /etc/sysconfig/network-scripts/ifcfg-eth0 << EOF
DEVICE="eth0"
BOOTPROTO="dhcp"
ONBOOT="yes"
TYPE="Ethernet"
PERSISTENT_DHCLIENT="yes"
EOF

....

# For trac ticket https://pagure.io/atomic-wg/issue/128
rm -f /etc/sysconfig/network-scripts/ifcfg-ens3

fedora-cloud-base, which does this:

bootloader --timeout=1 --append="no_timer_check net.ifnames=0 console=tty1 console=ttyS0,115200n8"
network --bootproto=dhcp --device=link --activate --onboot=on

....

# simple eth0 config, again not hard-coded to the build hardware
cat > /etc/sysconfig/network-scripts/ifcfg-eth0 << EOF
DEVICE="eth0"
BOOTPROTO="dhcp"
ONBOOT="yes"
TYPE="Ethernet"
PERSISTENT_DHCLIENT="yes"
EOF

....

# When we build the image with oz, dracut is used 
# and sets up a ifcfg-en<whatever> for the device. We don't 
# want to use this, we use eth0 so it is always the same. 
# So we remove all these ifcfg-en<whatever> devices so 
# The 'network' service can come up cleanly.
rm -f /etc/sysconfig/network-scripts/ifcfg-en*

And fedora-disk-base, which does this:

network --bootproto=dhcp --device=link --activate
bootloader --timeout=1

....

# The enp1s0 interface is a left over from the imagefactory install, clean this up
rm -f /etc/sysconfig/network-scripts/ifcfg-enp1s0

Note that each of them removes the 'predictably'-named interface configuration produced by the installer in a different way. We touched the cloud-base version most recently, and its wildcard removal (ifcfg-en*) is most probably the best approach.

disk-base doesn't try to disable 'persistent' naming for the installed system, which I believe is intentional and correct, as that kickstart is for the ARM disk images which will be booted on real hardware with potentially varying network adapters and should probably respect the distro default to use 'persistent' naming on real hardware. I don't know how this results in the network coming up on first boot of these images, if it even does, but hey, that seems a bit out of scope.

atomic and cloud-base both try to disable 'persistent' naming for the installed system. cloud-base used to try and do it just by removing udev rules, which was dumb, so @kevin recently changed that to add net.ifnames=0 to the kickstart bootloader line, which basically disables udev 'predictable' naming (the udev rules will see the param and not rename the device). atomic seems to do a belt-and-braces approach ATM: it both sets net.ifnames=0 in the bootloader line and tries to remove udev rules, badly (they don't live in /etc/udev/rules.d any more).
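One way to sanity-check the net.ifnames=0 approach is to test whether a booted image actually has the parameter on its kernel command line. This is only a minimal sketch: the function name has_kernel_param is made up for illustration, and it looks for an exact space-separated match only.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the exact parameter $1 appears on the kernel
# command line. $2 optionally supplies the command line (useful for testing);
# by default it is read from /proc/cmdline.
has_kernel_param() {
    param="$1"
    if [ "$#" -ge 2 ]; then
        cmdline="$2"
    else
        cmdline="$(cat /proc/cmdline)"
    fi
    # Pad with spaces so the match is on whole space-separated words only.
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example using the cloud-base bootloader arguments quoted earlier.
if has_kernel_param net.ifnames=0 "no_timer_check net.ifnames=0 console=tty1 console=ttyS0,115200n8"; then
    echo "predictable naming disabled; expect eth0"
fi
```

Called with a single argument inside the booted image, it checks the live /proc/cmdline instead of a supplied string.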

One obvious cleanup would be to make all three use the rm -f /etc/sysconfig/network-scripts/ifcfg-en* strategy for removing the 'predictably'-named interface config file. Another obvious cleanup would be to take the udev rule deletion bits out of fedora-atomic, since they're not doing anyone any good there. Potentially, since at least Atomic and Cloud probably always want to do the same thing here, we could break their bits out into a shared file they'd both %include, to keep them consistent in future.
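The shared cleanup could look roughly like the sketch below, which just wraps the cloud-base rm line in a function. The function name and the optional directory argument are assumptions added so the snippet can be tried against a scratch directory; they are not part of any existing kickstart.

```shell
#!/bin/sh
# Hypothetical helper: remove installer-written ifcfg files for
# 'predictably'-named interfaces (ifcfg-en*). Files such as ifcfg-eth0
# do not match the pattern and are left alone.
# $1: directory to clean (defaults to the real network-scripts directory).
remove_predictable_ifcfg() {
    dir="${1:-/etc/sysconfig/network-scripts}"
    rm -f "$dir"/ifcfg-en*
}

# Try it out on a scratch directory rather than the live system.
demo="$(mktemp -d)"
touch "$demo/ifcfg-ens3" "$demo/ifcfg-enp1s0" "$demo/ifcfg-eth0"
remove_predictable_ifcfg "$demo"
ls "$demo"
rm -rf "$demo"
```

In a kickstart %post section the function would be called with no argument, which is equivalent to the single rm -f line that cloud-base uses today.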

There is another thing we can do, though. oz actually permits (since https://github.com/clalancette/oz/commit/14ad1922aa8c0922aaa2a3f9e52daa2692939e64 , I think that's version 0.13.0) the passing of arbitrary extra kernel parameters to the installer, at least for the RHEL/Fedora 'url' install type, which is what createImage uses. You just have to add a kernelparam entry to the template.

We could send a PR for Koji to use this to disable 'predictable' interface naming during the image creation; either to just always include a kernelparam value of biosdevname=0 net.ifnames=0 for createImage tasks, or to make this configurable via the command line somehow and then have pungi / pungi-fedora pass it in for the images we want to use it for (handwave handwave).
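To make the idea concrete, here is a rough sketch of what a generated template might contain. It is an illustration, not what Koji currently emits: the helper name write_tdl, the template name, the Fedora version and the install URL are all placeholders, and the placement of the kernelparam element directly under os is my assumption and should be checked against the oz TDL schema.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal oz template (TDL) to the file named by $1.
# Assumptions: <kernelparam> sits directly under <os>; the template name,
# Fedora version, and install URL are placeholders, not real build inputs.
write_tdl() {
    cat > "$1" << 'EOF'
<template>
  <name>example-image</name>
  <os>
    <name>Fedora</name>
    <version>30</version>
    <arch>x86_64</arch>
    <install type='url'>
      <url>https://example.com/fedora/os/</url>
    </install>
    <kernelparam>biosdevname=0 net.ifnames=0</kernelparam>
  </os>
  <description>createImage template with predictable naming disabled</description>
</template>
EOF
}

# Write the template to a scratch file and show the relevant line.
tdl="$(mktemp)"
write_tdl "$tdl"
grep kernelparam "$tdl"
rm -f "$tdl"
```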

This should mean anaconda would write an ifcfg-eth0 into the installed system, and for Atomic and Cloud images, in theory we wouldn't have to munge anything in the kickstart. @kevin points out that the file might in fact specify a MAC address, which we would have to filter out, but we might be able to tweak the kickstart to avoid that; I'll look into it.
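If the MAC address does end up in the file, filtering it out in %post could be as simple as the sketch below. It assumes the address is recorded as an HWADDR= line, which is the usual ifcfg key for it; the function name strip_hwaddr is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: delete any HWADDR= line from the ifcfg file named by $1,
# so the config is not tied to the build VM's MAC address.
strip_hwaddr() {
    file="$1"
    [ -f "$file" ] || return 0
    tmp="$(mktemp)" || return 1
    # Write the filtered copy back in place (cat > keeps the file's permissions).
    sed '/^HWADDR=/d' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# In a kickstart %post this might be called as:
#   strip_hwaddr /etc/sysconfig/network-scripts/ifcfg-eth0
```

A missing file is treated as success, so the call is harmless on images where anaconda did not write ifcfg-eth0 at all.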

For ARM images, we'd just want to tweak the kickstart to remove that ifcfg-eth0 file instead of the name it currently tries to remove - but this is slightly better than the current situation, where we've had at least one case where the 'predictable' interface name suddenly became unpredictable. At least in this particular workflow, the name eth0 for the sole interface used during the image build should be truly predictable.

Anyway, that's where I'm up to. Thoughts / comments / ideas on this?

@kevin @mohanboddu @dustymabe @walters @mikem @lsedlar


Metadata Update from @mohanboddu:
- Issue tagged with: meeting

6 years ago

i'm +1 to the patch oz approach. I'd prefer not to break out and make the atomic/cloud kickstarts use a common other kickstart (mostly because the relationship between the kickstarts is already confusing enough).

side note.. we could change the structure of our kickstarts repo to make subdirectories for each deliverable and then symlink included files in each subdir. Something like:

include/
├── fedora-repo.ks
├── fedora-repo-not-rawhide.ks
└── fedora-repo-rawhide.ks
fedora-cloud-base/
├── fedora-cloud-base.ks
├── fedora-repo.ks -> ../include/fedora-repo.ks
├── fedora-repo-not-rawhide.ks -> ../include/fedora-repo-not-rawhide.ks
└── fedora-repo-rawhide.ks -> ../include/fedora-repo-rawhide.ks
fedora-cloud-base-vagrant/
├── fedora-cloud-base.ks -> ../fedora-cloud-base/fedora-cloud-base.ks
├── fedora-cloud-base-vagrant.ks
├── fedora-repo.ks -> ../include/fedora-repo.ks
├── fedora-repo-not-rawhide.ks -> ../include/fedora-repo-not-rawhide.ks
└── fedora-repo-rawhide.ks -> ../include/fedora-repo-rawhide.ks

It might seem convoluted, but at least it's very clear what kickstarts depend on what other kickstarts.

Metadata Update from @dustymabe:
- Issue untagged with: meeting

6 years ago

@dustymabe I guess you mean 'patch Koji'? We don't need to patch oz, oz already has the capability we need.

> @dustymabe I guess you mean 'patch Koji'?

yes, correct

@mohanboddu plans to bring this up in the Mar 13 releng meeting.

Metadata Update from @syeghiay:
- Issue tagged with: meeting

5 years ago

This is a great idea and we need changes to koji.

From RelEng meeting on May 29th 2019:

[12:10:38] <nirik> so right now, the oz images install with a ensa263152761 whatever interface
[12:10:56] <nirik> which gets written to a ifcfg file by NM
[12:11:15] <nirik> so we have to nuke it in the kickstart to allow for someone using that image to get their own interface
[12:11:42] <nirik> if we just set everything to use eth0 it's easier to nuke
[12:13:20] <nirik> so, I'm in favor of making these changes...
[12:14:17] <mboddu> nirik: So, set to use eth0 and rm -f /etc/sysconfig/network-scripts/ifcfg-eth0?
[12:14:27] <nirik> yeah

pinging @mikem (or @adamwill if he can send the PRs)

From RelEng meeting on Jul 3rd 2019:

[12:37:23] <nirik> no, I don't think thats done.
[12:37:28] <nirik> we need the actual koji patch. ;)
[12:38:07] <nirik> "We could send a PR for Koji to use this to disable 'predictable' interface naming during the image creation; either to just always include a kernelparam value of biosdevname=0 net.ifnames=0 for createImage tasks, or to make this configurable via the command line somehow and then have pungi / pungi-fedora pass it in for the images we want to use it for (handwave handwave)."
[12:39:15] <nirik> so, I guess we should file a koji bug asking... and/or find someone to write the patch

@mohanboddu or @lsedlar , did this koji PR get filed? What's the status?

Is this something we want to move forward with?

No, I think we should be spending the effort on moving to osbuild/ImageBuilder instead. We now have the initial pieces in infrastructure and are working to make it more usable.

We are closing this in favor of moving to osbuild/ImageBuilder. If necessary, releng can reevaluate this request in the future.

Metadata Update from @humaton:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

5 months ago
