#375 waiverdb misbehaving in staging fedora infra
Closed: Fixed 4 years ago by lholecek. Opened 4 years ago by pingou.

There are a few updates in bodhi.stg which are stuck because of failed tests (this is expected, nothing wrong), but when we waive the tests, bodhi complains as it gets a 500 error from waiverdb.

Here are the logs on the bodhi side:

[Thu Feb 20 15:23:58.929868 2020] [wsgi:error] [pid 16:tid 140667468596992] [client 10.130.2.1:59638] 2020-02-20 15:23:58,929 DEBUG [bodhi.server][Dummy-1] Attempting to waive test results for this update FEDORA-2020-a16f2aabc3
[Thu Feb 20 15:23:59.142326 2020] [wsgi:error] [pid 16:tid 140667468596992] [client 10.130.2.1:59638] 2020-02-20 15:23:59,142 DEBUG [bodhi.server][Dummy-1] Waiving test results: {'subject': {'item': 'dummy-test-package-gloster-0-71.fc32', 'type': 'koji_build'}, 'testcase': 'fedora-ci.koji-build.tier0.functional', 'product_version': 'fedora-32', 'waived': True, 'username': 'packagerbot', 'comment': "'This is fine, we are testing the workflow'"}
10.128.2.1 - - [20/Feb/2020:15:23:59 +0000] "GET /updates/FEDORA-2017-677fc2e299 HTTP/1.1" 200 47899 "-" "Mozilla/5.0 (compatible; SemrushBot/6~bl; +http://www.semrush.com/bot.html )"
[Thu Feb 20 15:23:59.305478 2020] [wsgi:error] [pid 16:tid 140667468596992] [client 10.130.2.1:59638] 2020-02-20 15:23:59,305 DEBUG [bodhi.server][Dummy-1] {"message": "Internal Server Error"}
[Thu Feb 20 15:23:59.305512 2020] [wsgi:error] [pid 16:tid 140667468596992] [client 10.130.2.1:59638] 
[Thu Feb 20 15:23:59.305622 2020] [wsgi:error] [pid 16:tid 140667468596992] [client 10.130.2.1:59638] 2020-02-20 15:23:59,305 ERROR [bodhi.server][Dummy-1] Bodhi failed to send POST request to WaiverDB at the following URL "https://waiverdb-web-waiverdb.app.os.stg.fedoraproject.org/api/v1.0/waivers/ ". The status code was "500".

The {"message": "Internal Server Error"} is the content that is returned by waiverdb to bodhi.

I quickly looked at the logs of the waiverdb pods and could not see any stack trace or similar.

Could you have a look at this?

Thanks :)


Any chance you can share the request payload?

I think the payload is in the log; here it is, reformatted:

{
  "subject": {
    "item": "dummy-test-package-gloster-0-71.fc32",
    "type": "koji_build"
  },
  "testcase": "fedora-ci.koji-build.tier0.functional",
  "product_version": "fedora-32",
  "waived": true,
  "username": "packagerbot",
  "comment": "'This is fine, we are testing the workflow'"
}

Have you had a chance to look into this? Any idea what is going on?

Is there a command I can reproduce it with?

In my local dev environment I can create the waiver, but on stage I end up with HTTP 400:

{"message": "The browser (or proxy) sent a request that this server could not understand."}

(After getting an OIDC token, I sent a POST request to https://waiverdb-web-waiverdb.app.os.stg.fedoraproject.org/api/v1.0/waivers/ with the data and the auth header Authorization: Bearer XYZ.)

Ah, never mind, I must have been using the wrong curl command. Reproduced with httpie:

echo '{ "subject": { "item": "dummy-test-package-gloster-0-71.fc32", "type": "koji_build" }, "testcase": "fedora-ci.koji-build.tier0.functional", "product_version": "fedora-32", "waived": true, "comment": "This is fine, we are testing the workflow" }' | http -F 'https://waiverdb-web-waiverdb.app.os.stg.fedoraproject.org/api/v1.0/waivers/' "Authorization:Bearer $TOKEN"
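
For anyone who would rather reproduce this from Python than httpie, here is a minimal sketch that builds roughly the same request with the standard library. The helper name build_waiver_request is made up for illustration; the URL, payload and Bearer header are taken from the command above. Nothing is sent until the request is passed to urllib.request.urlopen.

```python
import json
import urllib.request

# Staging endpoint quoted in this ticket.
WAIVERS_URL = (
    "https://waiverdb-web-waiverdb.app.os.stg.fedoraproject.org"
    "/api/v1.0/waivers/"
)


def build_waiver_request(token, url=WAIVERS_URL):
    """Build (but do not send) the POST request from the httpie command.

    build_waiver_request is a hypothetical helper, not part of waiverdb
    or bodhi. `token` is the OIDC bearer token obtained beforehand.
    """
    payload = {
        "subject": {
            "item": "dummy-test-package-gloster-0-71.fc32",
            "type": "koji_build",
        },
        "testcase": "fedora-ci.koji-build.tier0.functional",
        "product_version": "fedora-32",
        "waived": True,
        "comment": "This is fine, we are testing the workflow",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen() sends it. A 500 response is raised as urllib.error.HTTPError, and calling .read() on that exception returns the JSON error body.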

Would it be possible to get access to waiverdb logs on os.stg.fedoraproject.org? (I already have access to prod logs.)

Hmm, since the waiver is actually created before the server returns 500, my guess is that it has something to do with publishing to the message bus.

Try setting MESSAGE_BUS_PUBLISH = False in the configuration for the waiverdb stage instance. The message publisher is set to fedmsg by default and it's not yet implemented for fedora-messaging.
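
To make the guess concrete, here is a toy sketch of that failure mode. It is not waiverdb's actual code: create_waiver, publish, BrokerUnavailable and the in-memory store are all invented for illustration, and only the MESSAGE_BUS_PUBLISH name comes from the suggestion above. The point is that if the record is stored first and publishing fails afterwards, the client sees a 500 even though the waiver exists.

```python
# Flag name taken from the thread; everything else here is hypothetical.
MESSAGE_BUS_PUBLISH = True


class BrokerUnavailable(Exception):
    """Stands in for any failure while publishing to the message bus."""


def publish(waiver):
    """Pretend publisher that always fails, e.g. due to bad broker credentials."""
    raise BrokerUnavailable("cannot publish to the message bus")


def create_waiver(store, waiver, publish_enabled=MESSAGE_BUS_PUBLISH,
                  publisher=publish):
    """Store the waiver first, then optionally publish a message about it."""
    store.append(waiver)  # the "commit": the waiver now exists
    if publish_enabled:
        try:
            publisher(waiver)
        except BrokerUnavailable:
            # The failure happens after the commit, so the record stays.
            return 500, {"message": "Internal Server Error"}
    return 201, waiver


store = []
status, body = create_waiver(store, {"testcase": "example"})
# status is 500 here, yet the waiver is already in `store`.
```

With publish_enabled=False the same call returns 201, which is why turning the flag off is a quick way to confirm or rule out the message bus as the cause.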

> Would it be possible to get access to waiverdb logs on os.stg.fedoraproject.org? (I already have access to prod logs.)

I'm surprised you had access to prod since you were not in the owner list, but
anyway, you are now and thus should have access to the logs in both stg and prod.

> I'm surprised you had access to prod since you were not in the owner list

Hmm, maybe it was a different project.

> The message publisher is set to fedmsg by default and it's not yet implemented for fedora-messaging.

My mistake, "fedmsg" in the configuration now uses fedora-messaging.

The issue is fixed now. It was caused by a recent RabbitMQ cluster update in stage. Running the playbooks for waiverdb stage resolved the issue (a RabbitMQ user had to be recreated).

Metadata Update from @lholecek:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

4 years ago
