We run CI in VMs in the us-central1-a zone of GCE. Starting the week of June 28th and continuing through now, we've been seeing errors pulling from registry.fedoraproject.org/fedora-minimal:latest, whereas (in the same VM and context) we are able to pull from other registries. Here is a relevant section of output from our CI.
Unfortunately that "SHA doesn't match" error is the best we can get out of the job itself. But I have access to manually spin up one of these VMs to poke and prod at whatever lower-level bits would help with debugging.
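For background on what that error means: the client hashes every downloaded layer and compares the result against the digest listed in the manifest. A toy sketch of that comparison (the blob contents and digests here are made up, not from any real registry):

```shell
# Simulate a layer download: the manifest "claims" a digest for the blob.
blob=$(mktemp)
printf 'layer bytes' > "$blob"
claimed="sha256:$(sha256sum "$blob" | awk '{print $1}')"

# Simulate corruption in transit (e.g. a bad proxy or truncated body).
printf 'x' >> "$blob"
actual="sha256:$(sha256sum "$blob" | awk '{print $1}')"

# The pull fails when the recomputed digest no longer matches the claim.
if [ "$claimed" = "$actual" ]; then
    echo "digest OK"
else
    echo "digest mismatch: got $actual, want $claimed"
fi
rm -f "$blob"
```

Anything that alters the bytes between the registry and the client (a transparent proxy, a truncated transfer, a bad cache) produces exactly this class of failure.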
Could you define what GCE is? [There are several G.* cloud environments it could be.] Second, could you post a traceroute from there to our proxies?
We run jobs in both containers (GKE) and VMs (GCE, Google Compute Engine). I don't recall seeing failures in GKE, only from our GCE project. The failures only seem to happen when we're not looking.
I have a script that lets me create a 99% automation-identical VM, but (of course) I have no problem pulling registry.fedoraproject.org/fedora-minimal:latest from there :confounded: Yes, I can get you some traceroute data, and whatever else you need. I'm also trying my darnedest to eliminate our testing/software layers from the equation...
# mtr -4 --report --show-ips --report-cycles=50 registry.fedoraproject.org
Start: 2019-08-05T12:17:52-0400
HOST: cevich-fedora-30-libpod-547  Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 209.85.241.122              0.0%    50   10.9  11.0  10.7  12.8   0.5
  2.|-- 108.170.243.231             0.0%    50   12.7  12.2  10.4  37.8   4.8
  3.|-- ???                        100.0    50    0.0   0.0   0.0   0.0   0.0
  4.|-- ???                        100.0    50    0.0   0.0   0.0   0.0   0.0
  5.|-- ???                        100.0    50    0.0   0.0   0.0   0.0   0.0
  6.|-- ???                        100.0    50    0.0   0.0   0.0   0.0   0.0
  7.|-- ???                        100.0    50    0.0   0.0   0.0   0.0   0.0
  8.|-- ???                        100.0    50    0.0   0.0   0.0   0.0   0.0
  9.|-- et-3-3-0.582.rtsw.rale.ne   0.0%    50   32.1  32.2  31.9  34.5   0.6
 10.|-- 198.71.47.222               0.0%    50   32.9  32.8  32.4  34.7   0.4
 11.|-- 128.109.25.14               0.0%    50   33.4  33.8  33.2  48.0   2.2
 12.|-- 8.43.84.1                   0.0%    50   66.0  74.7  53.5 174.4  29.9
 13.|-- 8.43.84.3                   0.0%    50   33.5  33.7  33.4  38.4   0.7
 14.|-- 8.43.84.4                   0.0%    50   52.9  51.5  41.4 112.9  11.1
 15.|-- ip-8-43-87-254 (8.43.87.2   0.0%    50  156.4  42.2  34.1 156.4  17.5
 16.|-- proxy14.fedoraproject.org   0.0%    50   33.6  33.7  33.5  34.7   0.2

# mtr -4 --report --show-ips --report-cycles=50 registry.fedoraproject.org
Start: 2019-08-05T12:20:30-0400
HOST: cevich-fedora-30-libpod-547  Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 216.239.59.150              0.0%    50   10.9  10.8  10.6  11.9   0.2
  2.|-- 108.170.244.6               0.0%    50   10.5  10.9  10.4  23.0   1.8
  3.|-- ???                        100.0    50    0.0   0.0   0.0   0.0   0.0
  4.|-- ???                        100.0    50    0.0   0.0   0.0   0.0   0.0
  5.|-- ???                        100.0    50    0.0   0.0   0.0   0.0   0.0
  6.|-- ???                        100.0    50    0.0   0.0   0.0   0.0   0.0
  7.|-- ???                        100.0    50    0.0   0.0   0.0   0.0   0.0
  8.|-- ???                        100.0    50    0.0   0.0   0.0   0.0   0.0
  9.|-- et-3-3-0.582.rtsw.rale.ne   0.0%    50   32.0  32.1  31.8  34.2   0.5
 10.|-- 198.71.47.222               0.0%    50   32.7  33.1  32.3  50.6   2.5
 11.|-- 128.109.25.14               0.0%    50   33.3  33.5  33.2  34.2   0.3
 12.|-- 8.43.84.1                   0.0%    50   51.6  53.9  38.9 161.4  20.6
 13.|-- 8.43.84.3                   0.0%    50   33.5  33.4  33.4  33.7   0.1
 14.|-- 8.43.84.4                   0.0%    50   51.7  54.5  41.9 148.2  20.4
 15.|-- ip-8-43-87-254 (8.43.87.2   0.0%    50   92.5  55.2  36.0 157.0  27.2
 16.|-- proxy03.fedoraproject.org   0.0%    50   33.7  33.9  33.6  36.6   0.7

# mtr -4 --report --show-ips --report-cycles=50 registry.fedoraproject.org
Start: 2019-08-05T12:22:06-0400
HOST: cevich-fedora-30-libpod-547  Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 209.85.252.47               0.0%    50   25.2  25.5  25.2  33.8   1.2
  2.|-- 216.239.48.94               0.0%    50   25.1  25.8  25.1  38.4   2.4
  3.|-- 108.170.246.7               0.0%    50   25.4  25.8  25.2  40.6   2.2
  4.|-- 198.86.53.238               0.0%    50   31.7  32.2  31.6  43.7   1.7
  5.|-- ws-gw-to-rtp-gw.ncren.net   0.0%    50   32.7  32.7  32.4  33.3   0.2
  6.|-- uncphillips-to-ws-gw.ncre   0.0%    50   36.8  36.5  36.4  36.9   0.1
  7.|-- core-p-v1213.net.unc.edu    0.0%    50   36.7  36.7  36.5  37.0   0.1
  8.|-- 152.2.255.166               0.0%    50   36.3  36.3  36.2  36.4   0.0
  9.|-- vm18.fedora.ibiblio.org (   0.0%    50   36.3  36.4  36.2  39.1   0.4
Hmmm, I see there are quite a few addresses on both sides here (outbound hops and destinations). Trying each destination returned by DNS...
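That per-destination probing boils down to running mtr against each resolved address individually. A dry-run sketch (the IPs are hardcoded to the two proxy addresses seen in this ticket; drop the echo to actually probe):

```shell
# Dry run: print the mtr command for each destination address instead of
# running it (remove "echo" to really probe). IPs are two of the proxy
# addresses observed in this ticket; in practice the list would come
# from resolving registry.fedoraproject.org.
probed=""
for ip in 209.132.181.15 209.132.190.2; do
    cmd="mtr -4 --report --no-dns --report-cycles=50 $ip"
    echo "$cmd"
    probed="$probed $ip"
done
```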
...okay, this is a bit easier to look at:
<img alt="mtr.txt" src="/fedora-infrastructure/issue/raw/files/767f6e35a27421b3f6d459746914c5ac7b82bcec2545e2d1521bfafd27787845-mtr.txt" />
So, slight loss on proxy13-rdu02.fedoraproje... and proxy10.fedoraproject.org; otherwise I'm assuming there's nothing you guys can do about zayo and... dang, seems I should have used --no-dns...
...another run w/o hostnames:
<img alt="more_mtr.txt" src="/fedora-infrastructure/issue/raw/files/555a169be29d960f15907f4cc89c5139ebc82dea1aed4fc35d3dd70cec35eeee-more_mtr.txt" />
This time slight losses on proxy10.fedoraproject.org (209.132.181.15) and again on proxy13-rdu02.fedoraproject.org (209.132.190.2).
What other data can I provide?
Update: running mtr again this morning, I no longer see the 2-4% drops at the registry end. Also, I was mistaken about the original example log I linked in the description; that's a totally unrelated/different problem.
The suspected networking issue is this one, which appears to be rarer. It last occurred at 2019-08-06T00:38:42+00:00.
So the error you are posting is Error determining manifest MIME type for docker://registry.access.redhat.com/fedora-minimal:latest:
That is redhat.com and not fedoraproject.org.
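For anyone hitting the same confusion: podman resolves unqualified image names against a registry search list in /etc/containers/registries.conf, trying each entry in order, which is how a pull of a short name like fedora-minimal can end up at registry.access.redhat.com. An illustrative sketch written to a temp path (the contents are made up, and the exact key name varies by podman version):

```shell
# Made-up example of a short-name search list; with this ordering, an
# unqualified "fedora-minimal" would be tried against each registry in
# turn until one of them serves it.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[registries.search]
registries = ['registry.fedoraproject.org', 'registry.access.redhat.com', 'docker.io']
EOF
grep -c "registry" "$conf"
```

Fully qualifying the name (registry.fedoraproject.org/fedora-minimal:latest) sidesteps the search list entirely, which also makes error messages unambiguous about which registry was actually contacted.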
Metadata Update from @smooge: - Issue assigned to smooge
Metadata Update from @smooge: - Issue priority set to: Waiting on Reporter (was: Needs Review)
Oh, good catch, that explains the final error. Just prior to that, though, we see the error from Trying to pull registry.fedoraproject.org/fedora-minimal:latest.
However, we have a known problem in podman with saving images, which the team is addressing. At this point I'm not seeing much evidence pointing at networking or the registry servers anymore, but I'm keeping an eye on the situation and will update this issue accordingly.
I think we can close this. I'll open a new issue if/when I get new evidence.
Metadata Update from @cevich: - Issue close_status updated to: Insufficient data - Issue status updated to: Closed (was: Open)