#9569 please allocate 40 IP addresses to copr hypervisor
Closed: Fixed 10 days ago by smooge. Opened 3 months ago by praiskup.

Hypervisor hostname: vmhost-x86-copr01.rdu-cc.fedoraproject.org

We should be OK to run ~32 copr builders on that machine (and a few more
smaller VMs for the devel copr instance), but due to the nature of copr
(copr-backend controls the builders) we need to have access to the
builders from the copr-backend machine (currently hosted in AWS).

We don't need DNS records for the builders, nor DHCP, even though
being able to use DHCP would be awesome.


We already had static IP allocation on copr builders historically, when
virthost-aarch64-os01.fedorainfracloud.org and virthost-aarch64-os02.fedorainfracloud.org were up and running (old lab).

We don't run the IP allocation space for the OSPO cage where this system is. I also do not think they have 40 free IP addresses in this space. You will need to contact @misc's team for this.

We also do not have dhcp on that network as the GNOME group has an open DHCP server which conflicts with any other server.

The only way I could see this working is that the vmhost allocates its own private network and talks to the copr systems that way.

Hmpf. That will non-trivially complicate the configuration, at least I don't think
it is wise to allocate VPN for each vmhost that might be connected to copr in
the future :-/

Could we use some pre-existing VPN in Fedora infrastructure? (which would
mean that builders would have some IP range there, and copr-backend too)

I don't think we want to mix builders doing untrusted 3rd party builds with our database and application backend servers. ;(

We could of course setup another one for copr, but then it's just like you were doing it directly, not much advantage in having it central?

Metadata Update from @mohanboddu:
- Issue tagged with: medium-gain, medium-trouble, ops

3 months ago

Metadata Update from @kevin:
- Issue priority set to: Waiting on Assignee (was: Needs Review)

3 months ago

I guess that ipv6 wouldn't make any difference?

> I don't think we want to mix builders doing untrusted 3rd party builds with our database and application backend servers. ;(

understood

> We could of course setup another one for copr, but then it's just like you were doing it directly, not much advantage in having it central?

Yes, our team would take care of this...

I talked with misc and the site has plenty of ipv6 space. If the systems can be ipv6 only then you can have space.

We can not disable ipv4 stack entirely - but I suppose it can be IPv4 behind NAT,
and accessible via ipv6 from the outside?

I'll try to chat with our team .. the thing is that we don't have ipv6 on copr-backend yet, which would require us to configure the ipv6 networking in us-east-1c AWS availability zone.

Ok, IPv6 sounds better to us. It should be easier to configure, and there
won't be a networking bottleneck (no need to route any traffic through a single
box, either in AWS or in our lab).

  • So we'll need to configure copr-backend to have some ipv6 first, which requires
    us to configure us-east-1c - can that be done please? Without this we can not
    assign any IPv6 to the host in the AWS UI

  • then we need to allocate some IPv6 range that we'll assign manually later on

Should I split this ticket to two?

@praiskup I have added an ipv6 cidr block to vpc-0afefac8bae905972 which is where your instances appear to be running.

You will likely need some on-host editing to get it to work. I know that normally Fedora cloud images and AWS ipv6 don't work together out of the box. I had an issue with this before and had to set static ips on the hosts.

Thanks! I tried to start new VM in subnet-0995f6a466849f4c3, and it says:

Subnet does not contain any IPv6 CIDR block ranges

Which is weird. Perhaps I'm doing something wrong? I thought that - when the subnet is configured - I'll be able to assign ipv6 to an existing instance (== no need to re-spawn the copr backend machine from scratch?).

> You will likely need some on-host editing to get it to work.

We should be OK to test the process on devel machine, so overall it should be non-risk.

The network vpc-0afefac8bae905972 is newly listed as vpc-0afefac8bae905972 | copr which IMO isn't entirely correct as other teams probably use this network as well (but I didn't check, historically we used this one because it was the only one unassigned to a particular interest group).

Sorry, my bad. I can unassign that

> Thanks! I tried to start new VM in subnet-0995f6a466849f4c3, and it says:
>
> Subnet does not contain any IPv6 CIDR block ranges
>
> Which is weird. Perhaps I'm doing something wrong? I thought that - when the subnet is configured - I'll be able to assign ipv6 to an existing instance (== no need to re-spawn the copr backend machine from scratch?).

This should work now, I had the VPC configured but not all the subnets

OK, I was able to assign IPv6 to the existing VM, and to a new one. I don't seem to get routed to the external world (ping6 ipv6.google.com doesn't work) but I suppose I have to fix something on the VMs, not in the cloud. Thank you! Lemme experiment with that the day after tomorrow.

Can we please allocate the range of ipv6 addresses to the vmhost I mentioned above?

There is an ip6 entry in the route table so ping6 should work. May be a host issue

Ok, the new box was routing just fine from the beginning. The old box (current copr-backend-devel machine) had to be restarted through AWS (so aws knew it was restarted) and then it started routing. Those two boxes I talk about can not communicate with each other over ipv6.

Could we please allocate the ipv6 range for the vmhost-x86-copr01.rdu-cc.fedoraproject.org?

@smooge / @misc whats the process for this? pick a /64 and put it in... a doc?

So I may be wrong, but we usually use EUI-64 to assign IPs (i.e., the address is derived from the MAC; but as that's not how we documented it on https://osci.io/infra_procedures/host_network_setup/, maybe I am wrong).
Each server has its own /64 range. AFAIK, only osci.io servers use IPv6 for the moment in the cage, so that's still a bit rough.
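
For readers unfamiliar with EUI-64, here is a minimal Python sketch of how such an address is derived from a MAC: the universal/local bit of the first octet is flipped and ff:fe is inserted in the middle. The prefix and MAC below are placeholder values (assumptions for illustration), not addresses from the cage.

```python
import ipaddress


def eui64_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a modified EUI-64 interface ID built from a MAC."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("EUI-64 addressing needs a /64 prefix")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return network[int.from_bytes(interface_id, "big")]


# Placeholder prefix (documentation range) and a made-up MAC.
print(eui64_address("2001:db8:0:1::/64", "52:54:00:12:34:56"))
```

Because the interface ID depends only on the MAC, any system announcing a given MAC ends up with the same address.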

@duck has the details. And while we can't add a dhcp server, as pointed out by smooge, we can surely add a dhcpv6 one or start to announce an IPv6 prefix on vlan 190. Or start to get a shared dhcpv4/v6 server and remove the one from gnome?

Now, the biggest problem I see is that we have IP v6 on the public vlan (vlan 190), but there is no NAT for IP v4 on that vlan, and that's a requirement it seems ?

@misc when we discussed the ipv6 layout a couple of years ago with various tenants, EUI-64 was going to be problematic for a reason similar to why we weren't allowing everyone on the same mgmt network: any system can say what its mac address is, and so could take over for someone else (or, depending on the 'random' created mac address, do so by accident). So it was decided that ipv6 ranges would be given out by the OSAS team for clients to use, so that they didn't have to worry about this. I got a set of ips from jbrooks to use:

download-cc-rdu01 IN    AAAA  2620:52:3:1:dead:beef:cafe:fed1
pagure01        IN    AAAA  2620:52:3:1:dead:beef:cafe:fed5
pagure02        IN    AAAA  2620:52:3:1:dead:beef:cafe:fed8
pagure-stg01    IN    AAAA  2620:52:3:1:dead:beef:cafe:fed3
proxy03         IN    AAAA  2620:52:3:1:dead:beef:cafe:fed6
proxy14         IN    AAAA  2620:52:3:1:dead:beef:cafe:fed7

At some point that got forgotten and a different method was being used. I don't think the ips we are using will match an address derived from some system's mac address, but I am not sure. If we are using the self-assigned ip addresses then we need to redo the Fedora ones.

Does what I say sound familiar?

yeah, looking at the internal list, I see that's not how we gave IP v6 to GNOME, so osci.io docs (derived from existing practice) are maybe misleading.

I guess it might be time to call @jasonbrooks on this ticket so he can assign something like he did for Fedora.

For the OSCI, and the tenants we manage, we are using the automatically generated suffix. It is no longer a simple MAC-based value, but uses a method defined in RFC7217. Since I'm not going to reimplement how it works for our process, the doc simply says to use the suffix that comes automatically on the link-local address. This method ensures we have no conflict whatever the MAC address is.
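
To illustrate the idea behind RFC 7217 (not the exact computation any operating system performs), here is a rough Python sketch. The RFC derives the interface ID from a pseudorandom function over the prefix, an interface identifier, an optional network ID, a DAD counter and a secret key, and takes the low-order bits of the result. The choice of SHA-256, the byte layout and all parameter values below are assumptions; real implementations differ and also skip reserved identifiers.

```python
import hashlib
import ipaddress


def stable_private_address(prefix: str, iface: str, secret_key: bytes,
                           network_id: bytes = b"",
                           dad_counter: int = 0) -> ipaddress.IPv6Address:
    """Sketch of an RFC 7217 style stable address (assumed hash and layout)."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    material = b"".join([
        network.network_address.packed[:8],  # the 64-bit prefix
        iface.encode(),                      # interface name, not the MAC
        network_id,
        dad_counter.to_bytes(1, "big"),      # bumped if a duplicate is detected
        secret_key,
    ])
    digest = hashlib.sha256(material).digest()
    # Take the least significant 64 bits as the interface ID.
    interface_id = int.from_bytes(digest[-8:], "big")
    return network[interface_id]


# Placeholder values: the same inputs always give the same address,
# and the suffix changes whenever the prefix changes.
print(stable_private_address("2001:db8:0:1::/64", "eth0", b"example-secret"))
```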

So yes at some later point it was discussed about assigning ranges but we already had many systems deployed with IPv6 and I never changed our process. Uniformizing would be best but that means work and disruption of service. I guess changing on our side could be associated with a reboot campaign. We would need to adapt our Ansible rules to setup the right IP in the network_setup role too.

> but there is no NAT for IP v4 on that vlan, and that's a requirement it seems ?

We need the IPv4 for ephemeral VMs started on the vmhost... Initially I thought that
libvirt would do the NAT translation there for the VMs, and that we'd be able to assign
public ipv6 to those VMs somehow using some script (I don't know if there's some
mechanism in libvirt's dhcp6 actually, or if we could use dhcp6 from the network
and dhcp4 from host).

Of course if we had NAT on the network, it would be much easier because the
VMs in libvirt would use bridged networking and we wouldn't have to take "manual"
care of networking at all.

Sorry, we don't have a concrete plan. So depending on what is possible, we'll have
to adapt.

To get NAT we would need to move the host to a private VLAN. That's possible, but I do not have the full picture to see if that would be appropriate.

Misc?

Do you even need ipv4 addresses? Can the vm's just be ipv6 only?

I would think that would work... (but I haven't tried it)

People are used to building their packages against custom external repositories that are usually ipv4 only :-/

Hum, yeah. :(

How about 2 interfaces on the vm's... one ipv4 only and on a nat libvirt network using 192.168 or whatever, and the second one ipv6 only using a bridge... ?

That could work, good idea. We wouldn't have to care about manual IP assignment at all.
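
A small Python sketch of what the two-interface idea could look like in libvirt domain XML: one interface attached to a NATed libvirt network and one attached to a host bridge. The network name "default" and the bridge name "br0" are assumptions, and nothing here restricts each interface to a single address family; that would still need to be handled inside the guest or on the host.

```python
import xml.etree.ElementTree as ET


def interface_xml(kind: str, source: str) -> str:
    """Return a libvirt <interface> element for a 'network' or 'bridge' attachment."""
    if kind not in ("network", "bridge"):
        raise ValueError("kind must be 'network' or 'bridge'")
    iface = ET.Element("interface", {"type": kind})
    # The <source> attribute name matches the interface type.
    ET.SubElement(iface, "source", {kind: source})
    ET.SubElement(iface, "model", {"type": "virtio"})
    return ET.tostring(iface, encoding="unicode")


# First NIC: libvirt NAT network for IPv4 (assumed name "default").
print(interface_xml("network", "default"))
# Second NIC: host bridge carrying the public IPv6 (assumed name "br0").
print(interface_xml("bridge", "br0"))
```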

If that's a private vlan, you can set up your own dhcp.

Another solution is to get a nat64 gateway (tayga is packaged).
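
As background on how a NAT64 gateway lets IPv6-only VMs reach IPv4-only repositories: the IPv4 address is embedded in the low 32 bits of an IPv6 prefix. The Python sketch below uses the well-known prefix 64:ff9b::/96 from RFC 6052; whether a tayga deployment here would use that prefix or a site-specific one is an assumption.

```python
import ipaddress

# Well-known NAT64 prefix from RFC 6052.
WELL_KNOWN_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def nat64_address(ipv4: str,
                  prefix: ipaddress.IPv6Network = WELL_KNOWN_PREFIX) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the last 32 bits of a /96 NAT64 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    return prefix[int(ipaddress.IPv4Address(ipv4))]


# 192.0.2.33 is a documentation address used as a stand-in for an IPv4-only repo.
print(nat64_address("192.0.2.33"))
```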

I thought there was already some DHCP server that is able to do this for us.
We didn't expect the lab to be so limited, so it needs to be rethought next
time :(

I think we'll prefer to configure libvirt for doing IPv4 NAT and somehow
assign IPv6. I hope it will be possible. Should be easier than
dedicating one VM for doing DHCP (or OpenVPN as proposed previously).
But thanks for the idea, we'll fall-back there when needed.

So is there some IPv6 range that we could start experimenting with?

I mean - I think we need to get ipv6 "prefix" assigned?

@misc @smooge @jasonbrooks Can one of you allocate a /56 or whatever for this?

We are still discussing how we want to assign them.

Ok so since copr would be in the fedora range (from the network point of view), and Fedora is already using 2620:52:3:1:dead:beef:cafe:0/112, I think it would be simpler to just use that range for copr as well (if I am not wrong, that's a range of 65,536 IPs, so there is surely enough for 40 there).
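
The size of that range can be checked with Python's ipaddress module. This is only a sanity check; the network is written here with an explicit trailing :0 hextet so that it parses as a /112.

```python
import ipaddress

# The Fedora range, written with the final hextet spelled out.
fedora_range = ipaddress.IPv6Network("2620:52:3:1:dead:beef:cafe:0/112")

# A /112 leaves 128 - 112 = 16 host bits.
print(fedora_range.num_addresses)

# Existing and proposed addresses fall inside the same range.
print(ipaddress.IPv6Address("2620:52:3:1:dead:beef:cafe:fed1") in fedora_range)
print(ipaddress.IPv6Address("2620:52:3:1:dead:beef:cafe:c101") in fedora_range)
```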

I am still trying to figure a scheme to split the range that wouldn't create headaches for others, but at least, this ticket shouldn't be blocked on that.

I have reassigned various public ips which would have been used for cloud.

copr-builder-01      IN  AAAA  2620:52:3:1:dead:beef:cafe:c101
copr-builder-01      IN  A     8.43.85.40
copr-builder-02      IN  AAAA  2620:52:3:1:dead:beef:cafe:c102
copr-builder-02      IN  A     8.43.85.41
copr-builder-03      IN  AAAA  2620:52:3:1:dead:beef:cafe:c103
copr-builder-03      IN  A     8.43.85.42
copr-builder-04      IN  AAAA  2620:52:3:1:dead:beef:cafe:c104
copr-builder-04      IN  A     8.43.85.43
copr-builder-05      IN  AAAA  2620:52:3:1:dead:beef:cafe:c105
copr-builder-05      IN  A     8.43.85.44
copr-builder-06      IN  AAAA  2620:52:3:1:dead:beef:cafe:c106
copr-builder-06      IN  A     8.43.85.45
copr-builder-07      IN  AAAA  2620:52:3:1:dead:beef:cafe:c107
copr-builder-07      IN  A     8.43.85.46
copr-builder-08      IN  AAAA  2620:52:3:1:dead:beef:cafe:c108
copr-builder-08      IN  A     8.43.85.47
copr-builder-09      IN  AAAA  2620:52:3:1:dead:beef:cafe:c109
copr-builder-09      IN  A     8.43.85.48
copr-builder-10      IN  AAAA  2620:52:3:1:dead:beef:cafe:c110
copr-builder-10      IN  A     8.43.85.51
vmhost-x86-copr01    IN  A     8.43.85.57
vmhost-x86-copr01    IN  AAAA  2620:52:3:1:dead:beef:cafe:c001
vmhost-x86-copr02    IN  A     8.43.85.58
vmhost-x86-copr02    IN  AAAA  2620:52:3:1:dead:beef:cafe:c002
vmhost-x86-copr03    IN  A     8.43.85.59
vmhost-x86-copr03    IN  AAAA  2620:52:3:1:dead:beef:cafe:c003
vmhost-x86-copr04    IN  A     8.43.85.60
vmhost-x86-copr04    IN  AAAA  2620:52:3:1:dead:beef:cafe:c004

For the other 30 IPv6 addresses you can use :c111 to :c140. The vmhost-x86-copr boxes should be ready for you to use also.
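
A short Python sketch that enumerates the remaining 30 addresses. It assumes the suffixes continue the decimal-looking pattern of the records above (c101 to c110 for builders 01 to 10), so :c111 to :c140 is read as builders 11 to 40 rather than as a hexadecimal range.

```python
import ipaddress

# Shared prefix taken from the records above.
PREFIX = "2620:52:3:1:dead:beef:cafe"


def builder_addresses(first: int = 11, last: int = 40) -> list:
    """List builder addresses with suffixes c1NN for NN in first..last (decimal digits)."""
    return [ipaddress.IPv6Address(f"{PREFIX}:c1{n:02d}")
            for n in range(first, last + 1)]


for address in builder_addresses():
    print(address)
```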

Metadata Update from @smooge:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

10 days ago

Nice, thank you! Btw., do we have to use the copr-builder-XX hostnames?
The VMs for copr builders will be removed and started from scratch all
the time... (and 10 pcs wouldn't be enough). Can we drop those records
ourselves?

These are just the names I put in the rdu-cc.fedoraproject.org dns file in the Fedora Infrastructure DNS files. Change those to what you need them to be, but I only have 10 ipv4 addresses I can give you.

Those ips and such will change in about 4 or 5 months because red hat will be moving the ipv4 and ipv6

Metadata
Boards 1
ops Status: Backlog