#7435 Koji stage fails with "HTTPError: HTTP Error 503: Backend fetch failed"
Closed: Fixed 7 months ago by mizdebsk. Opened 9 months ago by mprahl.

  • Describe what you need us to do:

My builds in stage fail intermittently with the following traceback:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/koji/daemon.py", line 1244, in runTask
    response = (handler.run(),)
  File "/usr/lib/python2.7/site-packages/koji/tasks.py", line 307, in run
    return koji.util.call_with_argcheck(self.handler, self.params, self.opts)
  File "/usr/lib/python2.7/site-packages/koji/util.py", line 216, in call_with_argcheck
    return func(*args, **kwargs)
  File "/usr/sbin/kojid", line 925, in handler
    h = self.readSRPMHeader(srpm)
  File "/usr/sbin/kojid", line 1010, in readSRPMHeader
    fo = koji.openRemoteFile(relpath, **opts)
  File "/usr/lib/python2.7/site-packages/koji/__init__.py", line 1605, in openRemoteFile
    src = six.moves.urllib.request.urlopen(url)
  File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 435, in open
    response = meth(req, response)
  File "/usr/lib64/python2.7/urllib2.py", line 548, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib64/python2.7/urllib2.py", line 473, in error
    return self._call_chain(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 556, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 503: Backend fetch failed

Here is an example of such a build:
https://koji.stg.fedoraproject.org/koji/taskinfo?taskID=90004746
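
Because the failure is intermittent, it may help to request the same URL several times and tally the HTTP status codes that come back. The snippet below is only a rough sketch of that idea under a few assumptions: it is written for Python 3 (the traceback above is from Python 2.7), it is not what kojid itself does, and `fetch_status` and `tally_statuses` are hypothetical helper names. The URL to probe is left to the caller.

```python
import collections
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET of url, including error codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.getcode()
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; keep the code.
        return err.code


def tally_statuses(url, attempts=20, timeout=10):
    """Fetch url repeatedly and count how often each status code is seen."""
    counts = collections.Counter()
    for _ in range(attempts):
        counts[fetch_status(url, timeout=timeout)] += 1
    return counts
```

A result that mixes 200 and 503 counts would match the "sometimes it works" behaviour described above. Connection-level failures (for example a timeout) are not caught here and would surface as exceptions.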

  • When do you need this? (YYYY/MM/DD)

No specific date, but it helps my testing of MBS in stage before deploying to production.

  • When is this no longer needed or useful? (YYYY/MM/DD)

N/A

  • If we cannot complete your request, what is the impact?

I cannot test MBS in stage.


Metadata Update from @mizdebsk:
- Issue assigned to mizdebsk
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: koji, staging

9 months ago

I've put a hotfix in place. Please let me know if the issue is resolved.

Metadata Update from @mizdebsk:
- Issue priority set to: Waiting on Reporter (was: Waiting on Assignee)

9 months ago

Metadata Update from @mprahl:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

9 months ago

Thanks for the confirmation. Since the underlying issue is not yet fixed, I will reopen the ticket and update it with more details later.

Metadata Update from @mizdebsk:
- Issue priority set to: Waiting on Assignee (was: Waiting on Reporter)
- Issue status updated to: Open (was: Closed)

9 months ago

In November 2018 I set up Varnish on buildvm-s390x-01.stg, primarily to better match the production setup and to test Varnish deployment on Fedora 29, but also to possibly reduce inter-site traffic between BOS and PHX2.

Currently Varnish on buildvm-s390x-01.stg in BOS is configured to use koji01.stg.phx2 as its backend. This does not work: the RHIT firewall seems to be blocking inter-site traffic from Varnish to its backend, which leads to "Error 503: Backend fetch failed".

My proposed solution is to request RHIT to allow TCP connections from 10.16.0.25 (buildvm-s390x-01.stg.s390.fedoraproject.org) to 10.5.128.139 (koji01.stg.phx2.fedoraproject.org) on port 80.
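
To confirm whether the firewall currently lets this connection through, a plain TCP connect test from the Varnish host should be enough. This is a minimal sketch, assuming Python 3 is available on the host; `can_connect` is a hypothetical helper name, and the check only shows that a TCP handshake completes, not that the backend serves valid responses.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts alike.
        return False
```

Running `can_connect("10.5.128.139", 80)` on buildvm-s390x-01.stg would be expected to return False while the traffic is blocked and True once the rule is in place.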

Alternative solutions:

  • make Varnish use proxy01.stg as backend; since proxies are HTTPS-only (all plain HTTP requests are redirected to HTTPS) and Varnish can't use TLS, this would require some additional changes, e.g.:
    • make proxy01.stg stop redirecting kojipkgs to HTTPS, at least for internal connections
    • introduce yet another proxy (e.g. httpd running on buildvm-s390x-01.stg) to which Varnish could talk over plain HTTP and which would connect to proxy01.stg using TLS
  • revert Varnish enablement

Metadata Update from @mizdebsk:
- Issue priority set to: Needs Review (was: Waiting on Assignee)

8 months ago

Just to clarify, because this will cause confusion in IT if listed this way: 10.16.0 is NOT in RDU2. It is in BOS. The route between PHX2 and BOS is different from the one between PHX2 and RDU and requires more fixes.

@smooge Of course you are right, I'll edit my comment.

@smooge @mizdebsk Has a RHIT ticket been filed on this yet?

I thought mizdebsk had one opened so I didn't.

@kevin No, I did not open a RHIT ticket. Last time I did that, they didn't like a "developer requesting firewall changes".

You are right. I should have seen that earlier. I have opened the ticket and contacted Red Hat IT about it.

Firewall ports should be open now.

Metadata Update from @mizdebsk:
- Issue priority set to: Waiting on Assignee (was: Needs Review)

7 months ago

This should be fixed now.

Metadata Update from @mizdebsk:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

7 months ago
