#2973 regression in wait_for_startup caused by ReadTimeout exception
Closed: fixed 6 years ago Opened 6 years ago by ftweedal.

The regression appears to have been introduced in commit
0a05eab2483e6248a3e14a97c214e531828cd9be, "Refactored wait_for_startup() (part 2)".
I think we need to add requests.ReadTimeout alongside ConnectionError as a
retryable exception.

2018-03-20 16:38:04 pkispawn    : INFO     ....... executing 'systemctl start pki-tomcatd@pki-tomcat.service'
2018-03-20 16:38:11 pkispawn    : INFO     ........... FIPS mode is enabled on this operating system.
2018-03-20 16:38:11 pkispawn    : INFO     ........... checking http://vm-210.abc.idm.lab.eng.brq.redhat.com:8080/ca
2018-03-20 16:38:12 pkispawn    : INFO     ........... waiting for server to start (1s)
2018-03-20 16:38:13 pkispawn    : INFO     ........... waiting for server to start (2s)
2018-03-20 16:38:29 pkispawn    : DEBUG    ....... Error Type: ReadTimeout
2018-03-20 16:38:29 pkispawn    : DEBUG    ....... Error Message: HTTPConnectionPool(host='vm-210.abc.idm.lab.eng.brq.redhat.com', port=8080): Read timed out. (read timeout=15)
2018-03-20 16:38:29 pkispawn    : DEBUG    .......   File "/usr/lib/python3.6/site-packages/pki/server/pkispawn.py", line 533, in main
    scriptlet.spawn(deployer)
  File "/usr/lib/python3.6/site-packages/pki/server/deployment/scriptlets/configuration.py", line 1268, in spawn
    secure_connection=False,
  File "/usr/lib/python3.6/site-packages/pki/server/deployment/pkihelper.py", line 1081, in wait_for_startup
    timeout=request_timeout,
  File "/usr/lib/python3.6/site-packages/pki/server/deployment/pkihelper.py", line 1024, in get_instance_status
    response = client.get_status(timeout=timeout)
  File "/usr/lib/python3.6/site-packages/pki/system.py", line 298, in get_status
    timeout=timeout,
  File "/usr/lib/python3.6/site-packages/pki/client.py", line 46, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/pki/client.py", line 160, in get
    timeout=timeout,
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 521, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.6/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.6/site-packages/requests/adapters.py", line 521, in send
    raise ReadTimeout(e, request=request)

Metadata Update from @mharmsen:
- Custom field component adjusted to None
- Custom field feature adjusted to None
- Custom field origin adjusted to None
- Custom field proposedmilestone adjusted to None
- Custom field proposedpriority adjusted to None
- Custom field reviewer adjusted to None
- Custom field type adjusted to None
- Custom field version adjusted to None
- Issue set to the milestone: 0.0 NEEDS_TRIAGE

6 years ago

Metadata Update from @edewata:
- Issue priority set to: blocker
- Issue set to the milestone: 10.6 (was: 0.0 NEEDS_TRIAGE)

6 years ago

It makes sense to catch ReadTimeout, too. A ReadTimeout occurs when the TCP handshake succeeds but the subsequent socket read times out because the server does not respond fast enough. I think it happens when the server has opened the port for listening but is either not accepting connections yet or not responding fast enough.

I suggest you define RETRYABLE_EXCEPTIONS = (requests.exceptions.Timeout, requests.exceptions.ConnectionError) and use the tuple instead of ConnectionError. Python allows you to catch a tuple of exceptions. Timeout is the base class of ReadTimeout and also catches ConnectTimeout.

ConnectionError is caught in base/server/python/pki/server/deployment/pkihelper.py and multiple times in base/server/python/pki/server/pkispawn.py.
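
For illustration, the suggestion above could look roughly like the sketch below. This is not the code from the actual fix: wait_for_status, its parameters, and the get_status callable are hypothetical names chosen for the example, and the real wait_for_startup() in pkihelper.py may be structured differently. Only the RETRYABLE_EXCEPTIONS tuple comes from the comment.

```python
import time

import requests

# Tuple proposed in the comment above. Timeout is the base class of both
# ReadTimeout and ConnectTimeout, so both are retried.
RETRYABLE_EXCEPTIONS = (
    requests.exceptions.Timeout,
    requests.exceptions.ConnectionError,
)


def wait_for_status(get_status, startup_timeout=60, interval=1,
                    sleep=time.sleep):
    """Hypothetical helper: poll get_status() until it succeeds.

    get_status is any zero-argument callable that performs the HTTP
    request and returns the server status. Retryable errors are
    swallowed until startup_timeout seconds have elapsed, after which
    the last error is re-raised. Any other exception propagates
    immediately.
    """
    deadline = time.monotonic() + startup_timeout
    while True:
        try:
            return get_status()
        except RETRYABLE_EXCEPTIONS:
            # Server is not ready yet: give up only once the deadline passes.
            if time.monotonic() >= deadline:
                raise
            sleep(interval)
```

Catching the tuple in one place keeps the retry policy consistent across pkihelper.py and pkispawn.py, assuming both import the same constant.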

Fixed in master: 353de17f540a12107509f4b06f207d5d000ff4dc

Metadata Update from @ftweedal:
- Assignee reset

6 years ago

Metadata Update from @ftweedal:
- Issue assigned to cheimes
- Issue close_status updated to: fixed

6 years ago

Metadata Update from @edewata:
- Issue set to the milestone: 10.6.0 (was: 10.6)

6 years ago

Dogtag PKI is moving from Pagure issues to GitHub issues. This means that existing or new
issues will be reported and tracked through Dogtag PKI's GitHub Issue tracker.

This issue has been cloned to GitHub and is available here:
https://github.com/dogtagpki/pki/issues/3091

If you want to receive further updates on the issue, please navigate to the
GitHub issue and click the Subscribe button.

Thank you for understanding, and we apologize for any inconvenience.
