#460 Better handling of a related service being down
Closed: Fixed 3 months ago by lholecek. Opened 4 years ago by pingou.

If you turn on the RemoteRule, Greenwave retrieves the gating.yaml from another source, which could be down or returning 500 for one reason or another.

In this situation, Greenwave returns a 500 error without any explanation.

It would be nice if Greenwave handled this situation a bit better and returned, along with the error code, a message explaining why it cannot proceed as expected.
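
One way to return an explanation is to map the failure to a status code and a message instead of letting it surface as a bare 500. A minimal sketch, assuming the remote fetch goes through requests (as the traceback in this ticket shows); the helper name `error_response_for`, the payload shape, and the choice of 502/504 are illustrative assumptions, not Greenwave's actual behaviour:

```python
import requests


def error_response_for(exc):
    """Return a (payload, status_code) pair describing why a decision failed.

    Hypothetical helper: the name, payload shape and status codes are
    assumptions made for illustration only.
    """
    if isinstance(exc, requests.exceptions.Timeout):
        # The remote service did not answer in time.
        return {"message": f"Timed out retrieving remote rule: {exc}"}, 504
    if isinstance(exc, requests.exceptions.RequestException):
        # Covers ConnectionError, HTTPError, RetryError and friends.
        return {"message": f"Failed to retrieve remote rule: {exc}"}, 502
    # Anything else is still an unexpected server error.
    return {"message": f"Unexpected server error: {exc}"}, 500
```

With something like this, the client would at least see which upstream request failed rather than an unexplained 500.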

Here are the logs I've found server side:

2019-07-16 16:34:03 [pid    15] flask.app ERROR Exception on /api/v1.0/decision [POST]
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/lib/python3.7/site-packages/urllib3/connectionpool.py", line 731, in urlopen
    body_pos=body_pos, **response_kw)
  File "/usr/lib/python3.7/site-packages/urllib3/connectionpool.py", line 731, in urlopen
    body_pos=body_pos, **response_kw)
  File "/usr/lib/python3.7/site-packages/urllib3/connectionpool.py", line 731, in urlopen
    body_pos=body_pos, **response_kw)
  File "/usr/lib/python3.7/site-packages/urllib3/connectionpool.py", line 711, in urlopen
    retries = retries.increment(method, url, response=response, _pool=self)
  File "/usr/lib/python3.7/site-packages/urllib3/util/retry.py", line 399, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='src.stg.fedoraproject.org', port=443): Max retries exceeded with url: //rpms/python-arrow/raw/34f3b85b4be9d7c7fb4f8f7dd07a13d3e40b0da6/f/gating.yaml (Caused by ResponseError('too many 500 error responses'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "<decorator-gen-2>", line 2, in make_decision
  File "/usr/lib/python3.7/site-packages/prometheus_client/context_managers.py", line 23, in wrapped
    return func(*args, **kwargs)
  File "<decorator-gen-1>", line 2, in make_decision
  File "/usr/lib/python3.7/site-packages/prometheus_client/context_managers.py", line 66, in wrapped
    return func(*args, **kwargs)
  File "/src/greenwave/utils.py", line 64, in wrapped
    return func(*args, **kwargs)
  File "/src/greenwave/api_v1.py", line 453, in make_decision
    results_retriever))
  File "/src/greenwave/policies.py", line 641, in check
    results_retriever)
  File "/src/greenwave/policies.py", line 404, in check
    policies = self._get_sub_policies(policy, subject_identifier)
  File "/src/greenwave/policies.py", line 379, in _get_sub_policies
    response = greenwave.resources.retrieve_yaml_remote_rule(rev, pkg_name, pkg_namespace)
  File "/src/greenwave/cache.py", line 17, in wrapper
    return decorator(fn)(*args)
  File "/usr/lib/python3.7/site-packages/dogpile/cache/region.py", line 1270, in decorate
    should_cache_fn)
  File "/usr/lib/python3.7/site-packages/dogpile/cache/region.py", line 864, in get_or_create
    async_creator) as value:
  File "/usr/lib/python3.7/site-packages/dogpile/lock.py", line 186, in __enter__
    return self._enter()
  File "/usr/lib/python3.7/site-packages/dogpile/lock.py", line 93, in _enter
    generated = self._enter_create(value, createdtime)
  File "/usr/lib/python3.7/site-packages/dogpile/lock.py", line 179, in _enter_create
    return self.creator()
  File "/usr/lib/python3.7/site-packages/dogpile/cache/region.py", line 831, in gen_value
    created_value = creator()
  File "/usr/lib/python3.7/site-packages/dogpile/cache/region.py", line 1266, in creator
    return fn(*arg, **kw)
  File "/src/greenwave/resources.py", line 175, in retrieve_yaml_remote_rule
    return _retrieve_yaml_remote_rule_web(rev, pkg_name, pkg_namespace)
  File "/src/greenwave/resources.py", line 193, in _retrieve_yaml_remote_rule_web
    timeout=60)
  File "/usr/lib/python3.7/site-packages/requests/sessions.py", line 524, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.7/site-packages/requests/sessions.py", line 637, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.7/site-packages/requests/adapters.py", line 507, in send
    raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPSConnectionPool(host='src.stg.fedoraproject.org', port=443): Max retries exceeded with url: //rpms/python-arrow/raw/34f3b85b4be9d7c7fb4f8f7dd07a13d3e40b0da6/f/gating.yaml (Caused by ResponseError('too many 500 error responses'))
2019-07-16 16:34:03 [pid    15] flask.app ERROR Unexpected server error: HTTPSConnectionPool(host='src.stg.fedoraproject.org', port=443): Max retries exceeded with url: //rpms/python-arrow/raw/34f3b85b4be9d7c7fb4f8f7dd07a13d3e40b0da6/f/gating.yaml (Caused by ResponseError('too many 500 error responses'))
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/lib/python3.7/site-packages/urllib3/connectionpool.py", line 731, in urlopen
    body_pos=body_pos, **response_kw)
  File "/usr/lib/python3.7/site-packages/urllib3/connectionpool.py", line 731, in urlopen
    body_pos=body_pos, **response_kw)
  File "/usr/lib/python3.7/site-packages/urllib3/connectionpool.py", line 731, in urlopen
    body_pos=body_pos, **response_kw)
  File "/usr/lib/python3.7/site-packages/urllib3/connectionpool.py", line 711, in urlopen
    retries = retries.increment(method, url, response=response, _pool=self)
  File "/usr/lib/python3.7/site-packages/urllib3/util/retry.py", line 399, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='src.stg.fedoraproject.org', port=443): Max retries exceeded with url: //rpms/python-arrow/raw/34f3b85b4be9d7c7fb4f8f7dd07a13d3e40b0da6/f/gating.yaml (Caused by ResponseError('too many 500 error responses'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "<decorator-gen-2>", line 2, in make_decision
  File "/usr/lib/python3.7/site-packages/prometheus_client/context_managers.py", line 23, in wrapped
    return func(*args, **kwargs)
  File "<decorator-gen-1>", line 2, in make_decision
  File "/usr/lib/python3.7/site-packages/prometheus_client/context_managers.py", line 66, in wrapped
    return func(*args, **kwargs)
  File "/src/greenwave/utils.py", line 64, in wrapped
    return func(*args, **kwargs)
  File "/src/greenwave/api_v1.py", line 453, in make_decision
    results_retriever))
  File "/src/greenwave/policies.py", line 641, in check
    results_retriever)
  File "/src/greenwave/policies.py", line 404, in check
    policies = self._get_sub_policies(policy, subject_identifier)
  File "/src/greenwave/policies.py", line 379, in _get_sub_policies
    response = greenwave.resources.retrieve_yaml_remote_rule(rev, pkg_name, pkg_namespace)
  File "/src/greenwave/cache.py", line 17, in wrapper
    return decorator(fn)(*args)
  File "/usr/lib/python3.7/site-packages/dogpile/cache/region.py", line 1270, in decorate
    should_cache_fn)
  File "/usr/lib/python3.7/site-packages/dogpile/cache/region.py", line 864, in get_or_create
    async_creator) as value:
  File "/usr/lib/python3.7/site-packages/dogpile/lock.py", line 186, in __enter__
    return self._enter()
  File "/usr/lib/python3.7/site-packages/dogpile/lock.py", line 93, in _enter
    generated = self._enter_create(value, createdtime)
  File "/usr/lib/python3.7/site-packages/dogpile/lock.py", line 179, in _enter_create
    return self.creator()
  File "/usr/lib/python3.7/site-packages/dogpile/cache/region.py", line 831, in gen_value
    created_value = creator()
  File "/usr/lib/python3.7/site-packages/dogpile/cache/region.py", line 1266, in creator
    return fn(*arg, **kw)
  File "/src/greenwave/resources.py", line 175, in retrieve_yaml_remote_rule
    return _retrieve_yaml_remote_rule_web(rev, pkg_name, pkg_namespace)
  File "/src/greenwave/resources.py", line 193, in _retrieve_yaml_remote_rule_web
    timeout=60)
  File "/usr/lib/python3.7/site-packages/requests/sessions.py", line 524, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.7/site-packages/requests/sessions.py", line 637, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.7/site-packages/requests/adapters.py", line 507, in send
    raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPSConnectionPool(host='src.stg.fedoraproject.org', port=443): Max retries exceeded with url: //rpms/python-arrow/raw/34f3b85b4be9d7c7fb4f8f7dd07a13d3e40b0da6/f/gating.yaml (Caused by ResponseError('too many 500 error responses'))

This was broken by the newly refactored code that retries the requests Greenwave makes internally. We missed that this can raise a MaxRetryError exception (we already handle HTTPError).
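
For context: as the traceback shows, requests turns urllib3's MaxRetryError into requests.exceptions.RetryError, which is not a subclass of HTTPError, so an `except HTTPError` clause does not catch it. A rough sketch of catching the shared base class instead; `RemoteRuleError`, `make_session` and `retrieve_remote_yaml` are hypothetical names, and the retry settings are assumed rather than taken from Greenwave's code:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


class RemoteRuleError(Exception):
    """Hypothetical error for a gating.yaml that could not be fetched."""


def make_session(retries=3):
    # Assumed retry settings; retrying on 5xx responses is what makes
    # requests raise RetryError once the attempts are exhausted.
    retry = Retry(total=retries, backoff_factor=1,
                  status_forcelist=[500, 502, 503, 504])
    session = requests.Session()
    session.mount("https://", HTTPAdapter(max_retries=retry))
    session.mount("http://", HTTPAdapter(max_retries=retry))
    return session


def retrieve_remote_yaml(session, url, timeout=60):
    try:
        response = session.get(url, timeout=timeout)
        response.raise_for_status()
    except requests.exceptions.RequestException as e:
        # RequestException is the common base of HTTPError, RetryError,
        # ConnectionError and Timeout.
        raise RemoteRuleError(f"Cannot retrieve {url}: {e}") from e
    return response.text
```

The caller can then handle a single exception type and report its message, regardless of whether the remote service returned an error status, exhausted the retries, or was unreachable.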

We need tests for this.
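
Such a test could simulate the failing service with a mock instead of a real network call. A sketch using unittest.mock; `fetch_remote_rule` is only a stand-in for the code under test, not Greenwave's real function, and the URL is a placeholder:

```python
from unittest import mock

import requests


def fetch_remote_rule(session, url):
    """Stand-in for the function under test (hypothetical)."""
    try:
        response = session.get(url, timeout=60)
        response.raise_for_status()
        return response.text
    except requests.exceptions.RequestException as e:
        raise RuntimeError(f"Remote rule unavailable: {e}") from e


def test_retry_error_is_reported():
    # Simulate the remote service answering 500 until retries run out.
    session = mock.Mock()
    session.get.side_effect = requests.exceptions.RetryError(
        "too many 500 error responses")
    try:
        fetch_remote_rule(session, "https://example.invalid/gating.yaml")
    except RuntimeError as e:
        assert "too many 500 error responses" in str(e)
    else:
        raise AssertionError("expected RuntimeError")
    session.get.assert_called_once()


def test_http_error_is_reported():
    # Simulate a single error response without the retry machinery.
    session = mock.Mock()
    session.get.return_value.raise_for_status.side_effect = (
        requests.exceptions.HTTPError("500 Server Error"))
    try:
        fetch_remote_rule(session, "https://example.invalid/gating.yaml")
    except RuntimeError as e:
        assert "500 Server Error" in str(e)
    else:
        raise AssertionError("expected RuntimeError")
```

Both functions follow the pytest naming convention, so pytest would collect them as-is.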

Metadata Update from @lholecek:
- Issue tagged with: bug

4 years ago

I tried a quick fix for this, but I ended up refactoring consumers instead. :)

https://pagure.io/greenwave/pull-request/461

The PR above probably should be merged before fixing this issue (to avoid some duplicate code).

Metadata Update from @yashn:
- Issue assigned to yashn

4 years ago

Metadata Update from @lholecek:
- Assignee reset

3 months ago

Metadata Update from @lholecek:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 months ago


Metadata
Related Pull Requests
  • #471 Merged 4 years ago