There is a plan to support the RISC-V architecture in Fedora. For this we need to set up VM(s) and prepare the infrastructure/releng to support this architecture as well.
No specific date yet
This is waiting for hardware to be available.
Metadata Update from @kevin: - Issue tagged with: blocked
Metadata Update from @t0xic0der: - Issue assigned to t0xic0der
First time I've seen this bug ...
One thing we really do need, and which is not dependent on hardware availability, is a unified Koji instance, hosted by Fedora and connected to FAS. We currently have two externally hosted Koji instances which are not connected to FAS.
http://fedora.riscv.rocks/koji/
http://openkoji.iscas.ac.cn/koji/
We already have plenty of RISC-V builders (mix of VF2, HiFive Unmatched, and qemu) which could be connected to this.
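For a rough sense of what "connecting" an existing builder to a Fedora-hosted hub involves on the hub side, here is a minimal sketch using the koji admin CLI; the hostname and channel are placeholders, not any of the actual machines mentioned above:

```
# Hedged sketch: register an external riscv64 builder with the hub and put it
# in the default build channel. Hostname and channel name are illustrative only.
koji add-host buildvm-riscv-example01.fedoraproject.org riscv64
koji add-host-to-channel buildvm-riscv-example01.fedoraproject.org default
```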
So, I think there is a lot of confusion around people using 'hardware' without saying what exactly they are talking about. ;)
My understanding of things:
Once the (x86_64 and storage) hardware shows up we can stand up a hub/db/composer. I guess we should revisit the builder plans then.
CC: @smilner
The x86_64 hardware also might include space to run risc-v vm's for builders (but perhaps we now don't want to do that in the end?)
qemu is really slow so I wouldn't bother with this one. Between David and the folks in China we have a huge pile of real RISC-V machines we can connect, and we'll get even more in the next few months.
@kevin @rjones do we foresee any issues with builders being far away from a dedicated Koji instance/scheduler?
In the past the Fedora/RISCV Koji used to be in Fremont, US, while the majority of builders were in Europe. I have never found any major issue with the distance. To my knowledge there shouldn't be anything latency sensitive (in milliseconds). The only limit is basically your bandwidth / line for external users. That could also be improved by using a local cache. Richard did something like that some time ago for his boards, IIRC.
What David says basically. It's kind of amazing that it works to be honest as I don't think Koji was designed with this in mind.
Mock passes the http_proxy, ftp_proxy, https_proxy, and no_proxy variables from the user environment. Thus it's designed to do this.
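As a minimal sketch of what that looks like in practice for a builder sitting behind a caching proxy (the proxy URL and the mock config name below are placeholders, not real Fedora infra values):

```
# Hedged sketch: mock inherits these from the calling user's environment.
export http_proxy=http://proxy.example.org:3128
export https_proxy=http://proxy.example.org:3128
export no_proxy=localhost,127.0.0.1

# The chroot config name is illustrative; use whatever riscv64 config is in place.
mock -r fedora-rawhide-riscv64 --rebuild example-1.0-1.fc40.src.rpm
```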
I am not sure that the problems are with koji and building, but rather with the parts of the build system which need access to a central NFS directory (I forget exactly what, but I know they are important) and which require the same arch. This is where things have seen the most problems with s390x. At first various VPNs were tried, but in the end the only reliable system was sshfs (FUSE), because it can deal with the very high latencies, misordered packets and other things which can happen with long-distance writes.
For the s390x it has been something like:
[fedora NFS netapp] <-> [site network eqt] <-> [internal firewall] <-> [internal long haul network connection] <-> [ internal firewall] <-> [s390x network eqt] <-> [s390x dedicated boxes]
Any of those can cause problems (latency, bandwidth blockage, packet problems, etc) with transmission or may need additional IT resources to debug.
While I think that CN or EU may not be a problem for builds, the NFS sections are probably the parts that would be best kept close to the main server.
I think we may be talking about different issues, since we've been running Koji in this configuration for years with relatively few issues. In our setup builders don't need access to NFS. AIUI they upload the finished artifacts over HTTPS back to the kojihub.
Of course they are then subject to network slowness/issues, but as noted that has not been too much of a problem to date.
There are 2 reasons (at least that I can think of off the top of my head) that builders need a direct koji mount:
1. Builders doing createrepos/newRepos need to have a read-only mount of the koji volume in order to do those. This could be accomplished with a local x86_64 vm or two that does those. It doesn't normally need to be the same arch as the repos it's making.
2. Builders doing runroot tasks need to be able to mount the koji volume (rw) because they write results directly to the koji volume. This can be an issue here, but only when we start doing composes. These typically do need to be the same arch as the thing they are making. With s390x we use a sshfs mount (see the rough mount sketch after this list). It's slow, but functional. Typically in primary koji we set builders with the rw mount to be 'compose' channel only, that is, they don't run normal jobs, only compose jobs (to avoid any chance of a build doing something with the koji volume). So, once we are doing composes we could set up a builder or two with an sshfs mount. ;( Or we could possibly emulate in an x86_64 vm for this part, or we could look at adding some small number of riscv SOCs at the datacenter just for this.
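For illustration only, a sketch of what those two mount flavours can look like on a builder; the storage hostname and export path are made up, not the actual Fedora infra values:

```
# Read-only NFS mount of the koji volume for createrepo/newRepo builders
# (fstab-style line; server name and export path are placeholders):
#   storage.example.org:/vol/koji  /mnt/koji  nfs  ro,nosuid,nodev  0 0

# Read-write sshfs mount for compose/runroot builders, similar to what is
# used for s390x; the reconnect/keepalive options help with long-distance links:
sshfs koji@storage.example.org:/vol/koji /mnt/koji \
    -o reconnect,ServerAliveInterval=15,ServerAliveCountMax=3
```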
Do note that the current setup discussion is entirely about a 'secondary' hub to help focus and coordinate efforts. Once we try to move the arch into primary, things are different. There we definitely do want to control all hosts that do builds, ideally have them local to avoid network issues and such, etc.
One final note... koji upstream changed the scheduler in 1.34.0. It used to be that builders connected to the hub and asked for tasks. Now in 1.34.0, the hub assigns things more directly. I am not sure how this might affect a deployment with builders across the network, but our s390x resources have been fine with it.
OK. I will try to make a short description of what we do now:
x86_64: maxjobs (in kojid.conf), newRepo, createrepo, build
riscv64: maxjobs=1, buildSRPMFromSCM, createImage, rpm (binfmt_misc)
TL;DR: there will be some x86_64 machines and/or libvirt VMs (riscv64) in this.
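Since kojid.conf and maxjobs come up above, here is a minimal sketch of what a riscv64 builder's kojid.conf could look like; the hub URL and paths are assumptions, not the real deployment values:

```
; Hedged sketch of /etc/kojid/kojid.conf on a riscv64 builder.
; The hub/topurl hostnames below are placeholders.
[kojid]
maxjobs=1
server=https://riscv-koji.example.fedoraproject.org/kojihub
topurl=https://riscv-koji.example.fedoraproject.org/kojifiles
workdir=/var/tmp/koji
```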
Just an update here:
We got a new x86_64 server (finally), and it's delivered and at the datacenter. Unfortunately, on-site work was delayed for various reasons. The current plan (as far as I know it) is for people to be there next week. Hopefully we can get the server racked and setup on the network and installed soon after. As soon as it's installed I can work on setting up a new hub/db/composer/builders. We can then sort out credentials for builders and work on migration. Sorry this entire thing has taken so long. ;(
Another update:
Last week we got the server all racked up and I can reach its mgmt interface now.
I just filed a ticket asking for networking/port assignments. As soon as that's sorted we can install it and start creating VMs on it. We probably want to write up a plan / outline of next steps then.
Metadata Update from @zlopez: - Issue untagged with: blocked
Update this week:
server networking is set up and I did an initial install.
Update:
vmhost-x86-riscv01 is all ansibled and working.
I now need to add ansible config for:
riscv-koji01
db-riscv-koji01
compose-riscv01
buildvm-x86-riscv01
buildvm-x86-riscv02
Hope to do that in the coming week. Also plan to add a sysadmin-riscv group and add some folks to it so they have access to these machines and the playbooks for them. After those are up, we will need to add config to reach them from the proxies. Then we need to figure out auth for builders (hopefully keytabs). Then we will need to figure out what to import and start building.
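To make the scope of that ansible work a bit more concrete, a rough sketch of what an inventory group for these guests might look like; the group name and domain are assumptions, not the actual Fedora infra inventory:

```
# Hedged sketch of an ansible inventory group for the new guests;
# the group name and fully-qualified domain are illustrative only.
[riscv_koji]
riscv-koji01.example.fedoraproject.org
db-riscv-koji01.example.fedoraproject.org
compose-riscv01.example.fedoraproject.org
buildvm-x86-riscv01.example.fedoraproject.org
buildvm-x86-riscv02.example.fedoraproject.org
```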
@kevin What is the update on this?
I'm planning to work on this very soon (this week hopefully).
I've already set up the db server. Next I need to do the hub, which will require some config changes in ansible to handle secondary again...
Metadata Update from @t0xic0der: - Assignee reset
I am dropping my assignment on this as I am not working on it.
Let me know if I can help you in any way though, @kevin.