#2115 Failing to release @rubygems/rubygems builders
Closed: Fixed 2 years ago by praiskup. Opened 2 years ago by frostyx.

There are a lot of pending @rubygems/rubygems builds, but they are not being processed. While investigating the issue I found the following (it may or may not be related, but either way it should be fixed):

bash-5.1$ resalloc-maint resource-info 523810
{
    "id": 523810,
    "name": "copr_hv_x86_64_04_prod_00523810_20220306_062146",
    "user": null,
    "data": "2620:52:3:1:dead:beef:cafe:c1ce",
    "pool": "copr_hv_x86_64_04_prod",
    "state": "RELEASING",
    "check_last_time": 1646549008.0361989,
    "check_failed_count": 0,
    "ticket_id": null,
    "sandbox": "@rubygems/rubygems--frostyx",
    "sandboxed_since": 1646548776.0429494,
    "released_at": 1646549081.2394304,
    "releases_counter": 1
}
bash-5.1$ resalloc-maint resource-info 523811
{
    "id": 523811,
    "name": "copr_hv_x86_64_04_prod_00523811_20220306_062218",
    "user": null,
    "data": "2620:52:3:1:dead:beef:cafe:c1d2",
    "pool": "copr_hv_x86_64_04_prod",
    "state": "RELEASING",
    "check_last_time": 1646549030.4367988,
    "check_failed_count": 0,
    "ticket_id": null,
    "sandbox": "@rubygems/rubygems--frostyx",
    "sandboxed_since": 1646548774.8754423,
    "released_at": 1646549082.7026625,
    "releases_counter": 1
}
bash-5.1$ resalloc-maint resource-info 559630
{
    "id": 559630,
    "name": "copr_hv_x86_64_01_prod_00559630_20220310_070245",
    "user": null,
    "data": "2620:52:3:1:dead:beef:cafe:c112",
    "pool": "copr_hv_x86_64_01_prod",
    "state": "RELEASING",
    "check_last_time": 1646896060.4108584,
    "check_failed_count": 0,
    "ticket_id": null,
    "sandbox": "@rubygems/rubygems--frostyx",
    "sandboxed_since": 1646896066.0262077,
    "released_at": 1646896075.105835,
    "releases_counter": 1
}

I found them with:

for i in $(resalloc-maint resource-list | cut -d' ' -f1); do resalloc-maint resource-info "$i"; done | grep -C 10 rubygems
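
The grep above matches on plain text and relies on a fixed context window. As a rough alternative, the following Python sketch parses the concatenated output of the loop and filters on the sandbox field directly. It assumes that resource-info prints one JSON object per resource, as in the output shown earlier. The function names and the sample data (trimmed from that output) are for illustration only and are not part of resalloc.

```python
import json


def iter_json_objects(text):
    """Yield each object from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode() does not skip leading whitespace, so do it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def resources_in_sandbox(text, needle):
    """Return the resources whose sandbox name contains `needle`."""
    return [
        res for res in iter_json_objects(text)
        # "sandbox" may be null for unassigned resources.
        if needle in (res.get("sandbox") or "")
    ]


# Two objects trimmed from the resource-info output in this ticket,
# concatenated the way the shell loop would print them.
sample = """
{"id": 523810, "state": "RELEASING", "sandbox": "@rubygems/rubygems--frostyx"}
{"id": 523811, "state": "RELEASING", "sandbox": "@rubygems/rubygems--frostyx"}
"""

for res in resources_in_sandbox(sample, "rubygems"):
    print(res["id"], res["state"], res["sandbox"])
```

With the loop output saved to a file, the same function can be applied to the file's contents instead of the sample string.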

Something is wrong here: the resources are still in the RELEASING state even though their released_at timestamps are up to 6 days old. It is still possible to ssh to the machines:

$ ssh root@2620:52:3:1:dead:beef:cafe:c1ce
Warning: Permanently added '2620:52:3:1:dead:beef:cafe:c1ce' (ED25519) to the list of known hosts.
Last login: Sun Mar  6 06:23:59 2022 from 2600:1f18:8ee:ae00:d553:8ed5:d8b6:9f83
[systemd]
Failed Units: 1
  systemd-zram-setup@zram0.service

[root@copr-hv-x86-64-04-prod-00523810-20220306-062146 ~]# uptime
 10:07:09 up 6 days,  3:44,  1 user,  load average: 0.00, 0.00, 0.00
[root@copr-hv-x86-64-04-prod-00523810-20220306-062146 ~]# 
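
To put a number on how stale these resources are, the released_at epoch values can be converted to UTC and compared with the current time. A minimal Python sketch follows. The one-hour threshold, the function names, and the choice of "now" as six days after the first release are assumptions for illustration, not resalloc behavior.

```python
from datetime import datetime, timezone


def releasing_age(resource, now):
    """Return seconds spent in RELEASING, or None for any other state."""
    if resource.get("state") != "RELEASING":
        return None
    released_at = resource.get("released_at")
    if released_at is None:
        return None
    return now - released_at


def stuck_resources(resources, now, max_age=3600.0):
    """Return (id, age) pairs for resources in RELEASING longer than max_age.

    max_age is a hypothetical threshold in seconds.
    """
    stuck = []
    for res in resources:
        age = releasing_age(res, now)
        if age is not None and age > max_age:
            stuck.append((res["id"], age))
    return stuck


# Values copied from the resource-info output in this ticket.
resources = [
    {"id": 523810, "state": "RELEASING", "released_at": 1646549081.2394304},
    {"id": 523811, "state": "RELEASING", "released_at": 1646549082.7026625},
    {"id": 559630, "state": "RELEASING", "released_at": 1646896075.105835},
]

# Assumption: "now" is six days after the first release; on a live
# system time.time() would be used instead.
now = 1646549081.2394304 + 6 * 86400

for res_id, age in stuck_resources(resources, now):
    released = datetime.fromtimestamp(now - age, tz=timezone.utc)
    print(f"{res_id}: released {released:%Y-%m-%d %H:%M:%S} UTC, "
          f"{age / 86400:.1f} days in RELEASING")
```

A resource that is released normally should leave the RELEASING state quickly, so any threshold well below the ages seen here would flag all three machines.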

Metadata Update from @praiskup:
- Issue assigned to praiskup

2 years ago

Unfortunately, it is not clear what is happening. I'd like to install two new
patches into the Resalloc server so we can better analyze the cause.
I need a review: https://github.com/praiskup/resalloc/pull/83

Then we'll need to wait until this reproduces again (please don't close).

Metadata Update from @praiskup:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

2 years ago
