#9027 bodhi can't copy flatpaks to stable registry
Closed: Fixed 3 years ago by kevin. Opened 3 years ago by kevin.

Our updates flow is blocked by a flatpak stable push:

https://bodhi.fedoraproject.org/composes/F32F/stable

As far as I understand it, all bodhi does in this case is copy the flatpak from the candidate registry to the normal one and update any updates/etc.

The copy is failing, and when I manually run the command that bodhi runs, I get:

[root@bodhi-backend01 ~][PROD]# time sudo -u apache /usr/bin/bodhi-skopeo-lite copy docker://candidate-registry.fedoraproject.org/0ad:master-3220200604091941.1 docker://registry.fedoraproject.org/0ad:master-3220200604091941.1
INFO:skopeo-lite:candidate-registry.fedoraproject.org: Downloading /tmp/tmpwwd7l53_/blobs/sha256/5d42466e4948499efdb1268c6787e60d7932c68b97e32f22211b8dd62328567d (size=47611)
INFO:skopeo-lite:candidate-registry.fedoraproject.org: Downloading /tmp/tmpwwd7l53_/blobs/sha256/3744f18ab4c680b1164a15b1242dea36ee304efac61a3881dfae02eff0dcaa38 (size=937468406)
INFO:skopeo-lite:registry.fedoraproject.org: Storing manifest as sha256:ba3e5383d9e714d15da8193967bec116d5b2c50f71b487d9b52b2e74cf02ae7e
Traceback (most recent call last):                             
  File "/usr/bin/bodhi-skopeo-lite", line 11, in <module>      
    load_entry_point('bodhi-server==5.1.1', 'console_scripts', 'bodhi-skopeo-lite')()
  File "/usr/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)                          
  File "/usr/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)                                      
  File "/usr/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)                       
  File "/usr/lib/python3.8/site-packages/bodhi/server/scripts/skopeo_lite.py", line 779, in copy
    Copier(tmp, dest.get_endpoint()).copy()                                                                                    
  File "/usr/lib/python3.8/site-packages/bodhi/server/scripts/skopeo_lite.py", line 737, in copy
    self._copy_manifest(referenced)     
  File "/usr/lib/python3.8/site-packages/bodhi/server/scripts/skopeo_lite.py", line 725, in _copy_manifest                     
    self.dest.write_manifest(info, toplevel=toplevel)                                                                          
  File "/usr/lib/python3.8/site-packages/bodhi/server/scripts/skopeo_lite.py", line 656, in write_manifest
    response.raise_for_status()         
  File "/usr/lib/python3.8/site-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)                                                                             
requests.exceptions.HTTPError: 503 Server Error: Service Unavailable for url: https://registry.fedoraproject.org/v2/0ad/manifests/sha256:ba3e5383d9e714d15da8193967bec116d5b2c50f71b487d9b52b2e74cf02ae7e

real    2m22.593s                                                                                                              
user    0m4.199s                                               
sys     0m2.885s

The registry box is Fedora 32 now in iad2, where it was Fedora 30 in phx2. But the version of docker-distribution seems pretty much the same. :(

@cverna @mohanboddu @otaylor @kalev

Can any of you see what's going on here? We really need to unblock our flow of updates...


I'll note that if you go to the URL it gets the 503 on, it says:

"OCI manifest found, but accept header does not support OCI manifests"

old registry: docker-distribution-2.6.2-9.git48294d9.fc30.x86_64
new registry: docker-distribution-2.6.2-11.git48294d9.fc32.x86_64

That's because you need to send a request with an appropriate Accept header (see https://github.com/fedora-infra/bodhi/blob/ebb886e7392e64fd046fd638c62035e7dd21d956/bodhi/server/scripts/skopeo_lite.py#L333).
Doing so from my PC works fine, so maybe there's some DNS problem connecting Bodhi server to the registry box?
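
For reference, a minimal sketch of such a request, using only the Python standard library. The media types are the standard OCI and Docker manifest types; the names `MANIFEST_MEDIA_TYPES` and `manifest_request` are made up for illustration and are not taken from skopeo_lite.py.

```python
import urllib.request

# Standard manifest media types. The registry can only return a manifest
# whose type the client lists in its Accept header.
MANIFEST_MEDIA_TYPES = [
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.oci.image.index.v1+json",
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
]


def manifest_request(registry, repo, reference):
    """Build (but do not send) a GET request for a manifest.

    `reference` is a tag or a digest such as "sha256:...".
    Hypothetical helper, not part of bodhi.
    """
    url = f"https://{registry}/v2/{repo}/manifests/{reference}"
    headers = {"Accept": ", ".join(MANIFEST_MEDIA_TYPES)}
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the result to `urllib.request.urlopen` sends the request. A plain browser visit does not list the OCI types in Accept, which would explain the "accept header does not support OCI manifests" message.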

It appears that the problem is possibly that the image is already on the destination (public registry) and skopeo-lite doesn't handle that. I'm not sure why a 503 is being generated by the HTTPD proxy rather than passing back the actual status code / error message, which is probably more informative.
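
If an already-present image were the cause, one possible guard would be to probe the destination before writing. This is only a sketch under that assumption, not what skopeo-lite does; `manifest_exists` and its injectable `urlopen` parameter are hypothetical.

```python
import urllib.error
import urllib.request

ACCEPT = ", ".join([
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.docker.distribution.manifest.v2+json",
])


def manifest_exists(registry, repo, digest, urlopen=urllib.request.urlopen):
    """Return True if `digest` is already present in `repo` on `registry`.

    Sends a HEAD request and treats 404 as "absent". Any other HTTP
    error (for example a 503 from a proxy) is re-raised unchanged, so
    it is not mistaken for a missing manifest.
    """
    url = f"https://{registry}/v2/{repo}/manifests/{digest}"
    req = urllib.request.Request(url, headers={"Accept": ACCEPT}, method="HEAD")
    try:
        with urlopen(req, timeout=30) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

A caller could skip the manifest upload when this returns True. Re-raising on 503 keeps proxy failures visible instead of silently treating them as "not there".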

I don't have much of an idea why the image would already be there - maybe a previous container push failed because of relocation stuff? Worth looking in the logs to see what the first failure on this push was - it might be different.

I probably won't have time to investigate further or come up with a fix for skopeo-lite until Monday. Is it possible to unqueue this one update for 0ad and see what happens with the next?

ok, that was a saga. ;) After many hours it's working and I was able to complete the push.

Along the way:

  • The 503 errors were due to firewall / proxy issues in the new datacenter. First on candidate registry, then on the final one. Got all those worked around until we can properly fix them next week by routing over our vpn instead of trying to reach those hosts directly.

  • Then, I saw that the flatpak was already copied over, so I thought: why not delete it from the registry and let bodhi copy it again? First I had to also get oci-registry02 working, as all the deletes and writes go to it instead of 01. After that, I ran into a side problem with docker-distribution having delete enabled in its config but still refusing deletes. Finally I realized the config had been put in place, but docker-distribution was never restarted. I restarted it and was able to delete the image from the registry.

  • Then, on running bodhi, it could no longer find the content in the candidate registry. It turns out that somehow the candidate registry had /srv/oci_registry mounted. IT HAD THE SAME CONTENT AS PROD! I unmounted that and found that on our old candidate registry it was just local disk.

  • So, luckily I was still able to get into our old datacenter and copy all the old content off the old candidate registry onto the new one. As an aside, the instance had too small a disk, and I had to resize it to get the old content to fit.

  • Finally the update push worked as expected. Whew.
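
For anyone retracing the delete step, a rough sketch of how the response to a manifest delete could be read. It assumes the registry follows the Docker Registry HTTP API V2 status codes (202 on success, 404 for an unknown manifest, 405 when deletion is disabled). The 5xx hint is only my guess based on what happened here, and `explain_delete_status` is a hypothetical helper.

```python
def explain_delete_status(status):
    """Map the HTTP status of DELETE /v2/<name>/manifests/<digest> to a hint."""
    if status == 202:
        return "deleted"
    if status == 404:
        return "manifest not found"
    if status == 405:
        # Also what a registry answers if delete was enabled in the
        # config file but the service was never restarted to load it.
        return "deletes are disabled in the running registry"
    if status in (502, 503, 504):
        # Assumption: a gateway-type error points at a proxy or firewall
        # in front of the registry rather than at the registry itself.
        return "proxy or firewall problem between client and registry"
    return f"unexpected status {status}"
```

A 405 after the config has been changed is a hint to check whether the service was restarted.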

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago

Thanks, @kevin, for figuring this out!
