#803 Performance of the aarch64 CentOS CI machines
Closed: Fixed 2 years ago by arrfab. Opened 2 years ago by mrc0mmand.

FWIW, this is more of a question than an actual issue.

Recently I've started playing with the alt-arch machines in the CentOS CI pool. The ppc64le machines work flawlessly and I was able to utilize them in one of our upstream systemd jobs. While trying to do the same with the aarch64 ones, I noticed that they're considerably slower, to such a degree that some of our tests keep timing out even if I bump their timeouts by 10x.

I'm not sure whether this is expected because the HW behind the aarch64 nodes is simply much slower, or whether there's some other issue. Since I can't check the CPU frequencies directly (neither the ppc64le nodes nor the aarch64 ones provide that information in /proc/cpuinfo or under /sys/), I at least ran sysbench, which confirms my suspicions:

# uname -a
Linux n2.p8h1.ci.centos.org 4.18.0-394.el8.ppc64le #1 SMP Tue May 31 16:42:06 UTC 2022 ppc64le ppc64le ppc64le GNU/Linux
# systemd-detect-virt
kvm
# sysbench cpu run
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time


Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:  3042.73

General statistics:
    total time:                          10.0004s
    total number of events:              30436

Latency (ms):
         min:                                    0.33
         avg:                                    0.33
         max:                                    0.36
         95th percentile:                        0.33
         sum:                                 9995.23

Threads fairness:
    events (avg/stddev):           30436.0000/0.00
    execution time (avg/stddev):   9.9952/0.00

# uname -a
Linux n6.aah3.ci.centos.org 4.18.0-383.el8.aarch64 #1 SMP Wed Apr 20 15:39:57 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
# systemd-detect-virt
kvm
# sysbench cpu run
sysbench 1.0.20 (using system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 1
Initializing random number generator from current time


Prime numbers limit: 10000

Initializing worker threads...

Threads started!

CPU speed:
    events per second:   643.14

General statistics:
    total time:                          10.0008s
    total number of events:              6436

Latency (ms):
         min:                                    1.54
         avg:                                    1.55
         max:                                    1.73
         95th percentile:                        1.58
         sum:                                 9992.88

Threads fairness:
    events (avg/stddev):           6436.0000/0.00
    execution time (avg/stddev):   9.9929/0.00

The sysbench results show that the aarch64 machine is roughly 4.7x slower than the ppc64le one in this single-threaded test (643.14 vs. 3042.73 events per second). So, to reiterate the original question: is this expected, or are the aarch64 machines somehow misconfigured?
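For anyone who wants to repeat the comparison on other nodes, here is a minimal sketch of how the ratio can be derived from the "events per second" lines. The helper name `events_per_second` is hypothetical, and the two strings are just excerpts of the sysbench output pasted above, not a full capture.

```python
import re


def events_per_second(sysbench_output: str) -> float:
    """Extract the 'events per second' figure from `sysbench cpu run` output."""
    match = re.search(r"events per second:\s*([0-9.]+)", sysbench_output)
    if match is None:
        raise ValueError("no 'events per second' line found")
    return float(match.group(1))


# Excerpts of the two runs from this ticket (assumed representative).
ppc64le_run = "CPU speed:\n    events per second:  3042.73\n"
aarch64_run = "CPU speed:\n    events per second:   643.14\n"

# Higher events/s means a faster CPU, so this is the ppc64le speed-up factor.
ratio = events_per_second(ppc64le_run) / events_per_second(aarch64_run)
print(f"ppc64le / aarch64 single-thread throughput: {ratio:.2f}x")
```

The average latencies (0.33 ms vs. 1.55 ms) give about the same factor, so the slowdown is consistent across both metrics.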


Metadata Update from @arrfab:
- Issue tagged with: centos-ci-infra

2 years ago

We are aware that the current aarch64 situation is far from ideal, as it's running VMs on top of overused ThunderX hardware (Gen 1, so the first generation).
We have the replacement infra, but we still need to announce it, so closing for now.

Metadata Update from @arrfab:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

2 years ago

