#33 CI test results not showing up on Bodhi updates
Closed 6 years ago by mvadkert. Opened 6 years ago by tstellar.

A recent Bodhi update for the clang package does not show the results of CI tests:
https://bodhi.fedoraproject.org/updates/FEDORA-2019-fcfaa6d2e1

In the "Automated Tests" tab, the org.centos.prod.ci.pipeline.allpackages-build.* results are no longer listed as they were in previous updates. Here is an older update showing the test results:
https://bodhi.fedoraproject.org/updates/FEDORA-2019-5065cb8af8


@bgoncalv were there some messaging changes in Fedora? Or is it just proposed?

I'm not aware of any format change. For this build we sent the topics like below:

org.centos.prod.ci.pipeline.allpackages-build.package.test.functional.complete is https://apps.fedoraproject.org/datagrepper/id?id=2019-db282702-8263-4e6b-ae9c-5b20c6347b81&size=extra-large

org.centos.prod.ci.pipeline.allpackages-build.complete is https://apps.fedoraproject.org/datagrepper/id?id=2019-1c98bd22-1a49-498b-9628-ebf974fa411f&size=extra-large

If nothing changed, the only thing that could be blocking results from getting into resultsdb is the upstream resultsdb updater.

@pingou hi, could anybody check this please? It seems the resultsdb fedmsg consumer did not pick these messages up...

OK, putting my findings here:

  • Checked the resultsdb box: rdbsync is running, but http://resultsdb.ci.centos.org/resultsdb_api/api/v2.0/ returns a 500 error
  • Checked the error_log on that box: it seems to be failing to import the Python module resultsdb
  • Running the ansible ci.yml playbook now, let's see if that fixes things

OK, it seems we are lacking some monitoring. We should maybe duplicate our Prometheus/Alertmanager setup to the Fedora CI OpenShift so we can catch these things early.
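
A minimal sketch of the kind of probe that would flag this sort of failure, assuming nothing beyond the API URL quoted in the findings. The names `is_healthy` and `check_endpoint` are hypothetical, and a real setup would hook something like this into Nagios or Prometheus rather than run it by hand:

```python
import urllib.error
import urllib.request


def is_healthy(status):
    """Treat 2xx/3xx as healthy; 4xx/5xx (such as the 500 seen here) as failing."""
    return 200 <= status < 400


def check_endpoint(url, timeout=10, opener=urllib.request.urlopen):
    """Return (healthy, status); status is None if no HTTP response came back."""
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is still available.
        status = err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar.
        return False, None
    return is_healthy(status), status
```

Calling `check_endpoint("http://resultsdb.ci.centos.org/resultsdb_api/api/v2.0/")` returns a tuple that a monitoring wrapper can turn into an alert whenever the first element is false. The `opener` parameter exists only so the function can be exercised without network access.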

Well...

# rpm -ql resultsdb-2.1.2-1.fc28.noarch
/etc/resultsdb
/etc/resultsdb/settings.py
/usr/bin/resultsdb
/usr/lib/python3.6/site-packages/resultsdb
....

Python3, I guess that would explain this :(

Nagios should definitely have caught this.

I've downgraded to resultsdb-2.1.0-1.fc28.noarch (lovely change from 2.1.0 to 2.1.2...) and http://resultsdb.ci.centos.org/resultsdb_api/api/v2.0/ is back

Let's see if that helps

I can see that the syncs are passing again.

However, I fear we may have lost the messages during the outage :(
(It happened yesterday: there was a mass-update and reboot of the entire infra, which likely just upgraded resultsdb then. yum history shows the upgrade on Feb 26th at 07:09PM UTC.)

Here is the uptime info:

[root@ci-cc-rdu01 ~][PROD]# uptime 
 09:02:41 up 1 day, 12:04,  1 user,  load average: 0.03, 0.16, 0.21
[root@ci-cc-rdu01 ~][PROD]# date
Thu Feb 28 09:03:05 UTC 2019
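
For reference, the boot time implied by this output can be worked out in a few lines of Python. This is only a sketch of the arithmetic: the timestamps are copied from the `uptime` output and the yum history time quoted in this comment, and the variable names are illustrative.

```python
from datetime import datetime, timedelta

# Clock time and uptime as reported by `uptime` on Feb 28 (UTC).
now = datetime(2019, 2, 28, 9, 2, 41)
uptime = timedelta(days=1, hours=12, minutes=4)

# Upgrade time reported by yum history: Feb 26th at 07:09PM UTC.
upgrade = datetime(2019, 2, 26, 19, 9)

boot = now - uptime   # approximate: uptime only has minute resolution
gap = boot - upgrade  # time between the package upgrade and the reboot

print("booted at :", boot)
print("upgrade at:", upgrade)
print("gap       :", gap)
```

That puts the reboot at roughly 20:58 UTC on Feb 26, a little under two hours after the resultsdb upgrade recorded by yum.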

resultsdb is not expected to be used as a library (there is resultsdb_api to be imported). But it was clearly my mistake to change it from Python 2 to Python 3 only in a stable Fedora release.

So, @pingou, I can change the resultsdb package in fc28 to be dual py2/py3, with /usr/bin/resultsdb being Python 3. Is that going to help you? Or do you prefer something else?

EDIT: Or, you might be running resultsdb as wsgi, which means you would need a different .so for httpd to run it as Python 3.

@frantisekz
It's the wsgi that is being used, this is the exact error:

  File "/usr/share/resultsdb/resultsdb.wsgi", line 15, in <module>
    from resultsdb import app as application
ImportError: No module named resultsdb

So it's not imported as a library but still failed :)
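
A quick way to see this kind of mismatch is to ask a given interpreter whether it can find the module at all. The sketch below uses only the standard library and is written for Python 3 (`importlib.util.find_spec` is not available on Python 2); the helper name `locate` is made up for illustration:

```python
import importlib.util


def locate(module_name):
    """Return the file a top-level module would be loaded from,
    or None if this interpreter cannot find it on sys.path."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    # origin can itself be None for namespace packages.
    return spec.origin


print(locate("json"))       # a stdlib module: some path under the interpreter's lib dir
print(locate("resultsdb"))  # None unless resultsdb is installed for this interpreter
```

Given the `rpm -ql` output earlier in the thread, the Python 3.6 interpreter on this box would be expected to resolve `resultsdb` under `/usr/lib/python3.6/site-packages/resultsdb`, while the Python 2 interpreter embedded by mod_wsgi has no such directory on its path, which matches the ImportError above.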

For now the service is back up, I'll follow whichever solution you recommend.

What we did, when switching Taskotron from Py2 to Py3:

  • Install python3-mod_wsgi
  • Change /etc/httpd/conf.modules.d/10-wsgi.conf to load modules/mod_wsgi_python3.so instead of mod_wsgi.so

However, it seems apache can't use both Py2 and Py3 wsgi at the same time, so all the apps that run on the server through wsgi need to be either Py2 or Py3.

Is this going to work for you?

Hm, there seems to be at least one other issue with the messages sent by the CI pipeline:

Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]: [2019-02-28 10:10:31][fedmsg.consumers   ERROR] {'body': {u'certificate': u'LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUVPakNDQTZPZ0F3SUJBZ0lDQW5Fd0RRWUpL\nb1pJaHZjTkFRRUZCUUF3Z2FBeEN6QUpCZ05WQkFZVEFsVlQKTVFzd0NRWURWUVFJRXdKT1F6RVFN\nQTRHQTFVRUJ4TUhVbUZzWldsbmFERVhNQlVHQTFVRUNoTU9SbVZrYjNKaApJRkJ5YjJwbFkzUXhE\nekFOQmdOVkJBc1RCbVpsWkcxelp6RVBNQTBHQTFVRUF4TUdabVZrYlhObk1ROHdEUVlEClZRUXBF\nd1ptWldSdGMyY3hKakFrQmdrcWhraUc5dzBCQ1FFV0YyRmtiV2x1UUdabFpHOXlZWEJ5YjJwbFkz\nUXUKYjNKbk1CNFhEVEUzTURVeE1ERTBNamMwT0ZvWERUSTNNRFV3T0RFME1qYzBPRm93Z2NneEN6\nQUpCZ05WQkFZVApBbFZUTVFzd0NRWURWUVFJRXdKT1F6RVFNQTRHQTFVRUJ4TUhVbUZzWldsbmFE\nRVhNQlVHQTFVRUNoTU9SbVZrCmIzSmhJRkJ5YjJwbFkzUXhEekFOQmdOVkJBc1RCbVpsWkcxelp6\nRWpNQ0VHQTFVRUF4TWFabVZrYlhObkxYSmwKYkdGNUxtTnBMbU5sYm5SdmN5NXZjbWN4SXpBaEJn\nTlZCQ2tUR21abFpHMXpaeTF5Wld4aGVTNWphUzVqWlc1MApiM011YjNKbk1TWXdKQVlKS29aSWh2\nY05BUWtCRmhkaFpHMXBia0JtWldSdmNtRndjbTlxWldOMExtOXlaekNCCm56QU5CZ2txaGtpRzl3\nMEJBUUVGQUFPQmpRQXdnWWtDZ1lFQTJzYUJuSjNyTlhYQXV3Skt2UkJyQnJTYUdMWHgKYXg4VGhu\nZ0wxV2hCYS8wSFZVdVAxWEhWUEVweUh6YXZZK0dsRzFVclVUMkFMQzFuRk5nVUNpSjhWWWVoZElw\nWApzQzNiOHFnUmltekt0aHUxM2hqQ01kSTYzV3h1S3FBQk5UQTRkZWtBK1c2cE9EdVdIMEI1b0tq\nVjFmWkZRN2xFCjUzZlQybElBZWg4ZndZY0NBd0VBQWFPQ0FWY3dnZ0ZUTUFrR0ExVWRFd1FDTUFB\nd0xRWUpZSVpJQVliNFFnRU4KQkNBV0hrVmhjM2t0VWxOQklFZGxibVZ5WVhSbFpDQkRaWEowYVda\ncFkyRjBaVEFkQmdOVkhRNEVGZ1FVUytnVApwNmg2ZXpJZW5RK0lLUERnWmZWZHQ5a3dnZFVHQTFV\nZEl3U0J6VENCeW9BVWEwQmErUklJaVZubldlVUY5UUlkCkNrNS9GQUNoZ2Fha2dhTXdnYUF4Q3pB\nSkJnTlZCQVlUQWxWVE1Rc3dDUVlEVlFRSUV3Sk9RekVRTUE0R0ExVUUKQnhNSFVtRnNaV2xuYURF\nWE1CVUdBMVVFQ2hNT1JtVmtiM0poSUZCeWIycGxZM1F4RHpBTkJnTlZCQXNUQm1abApaRzF6WnpF\nUE1BMEdBMVVFQXhNR1ptVmtiWE5uTVE4d0RRWURWUVFwRXdabVpXUnRjMmN4SmpBa0Jna3Foa2lH\nCjl3MEJDUUVXRjJGa2JXbHVRR1psWkc5eVlYQnliMnBsWTNRdWIzSm5nZ2tBNDFBZVIwOFhIa1V3\nRXdZRFZSMGwKQkF3d0NnWUlLd1lCQlFVSEF3SXdDd1lEVlIwUEJBUURBZ2VBTUEwR0NTcUdTSWIz\nRFFFQkJRVUFBNEdCQUF5cApCUk43VXFaUU1vcUw3UkFnS09hMzFSVTh3R3lWaEJhd
1NvZm1Qd1dT\nMUdEbVA1OU9FbElaRldrVisrTi92VXBSCmFjalFyTStoUEVEYXRaUVU5cEtiV3FmVy92WVVyaGpE\nYTNYV3dxeW1kT2hjWTFhWUR3aVE5NGlWekNGUkdFM2kKMXNkN2tuc2VjL2x4Z2NldmhYS2ZleTNK\nN241cXludFBYVGpVMjdGMQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==\n', u'timestamp': 0, u'msg_id': u'2019-5507dfed-8d99-4472-9a8f-4344421ad109', u'crypto': u'x509', u'topic': u'org.centos.prod.ci.pipeline.allpackages-build.package.ignored', u'signature': u'cEh/Qb6vrizZIX3xRH00RXR53L/7IA00lJYOap5lNngisr2/a/bbI+No/L6Gi1+rBfGFZdzevJmh\nrR14iAPWKZSrqvgZSOG28MwPNZQFKsIRyRFxA0Aqzf7yocKzjTgDpNg8SBK9KIhPc5fUkS/YJsIv\n5q8Eh629SVqMIlTxXcY=\n', u'msg': {u'build_id': u'235117', u'status': u'SUCCESS', u'comment_id': None, u'nvr': u'', u'repo': u'gdcm', u'namespace': None, u'build_url': u'https://jenkins-continuous-infra.apps.ci.centos.org/blue/organizations/jenkins/fedora-build-pipeline-trigger/detail/fedora-build-pipeline-trigger/235117/pipeline/', u'rev': u'kojitask-33100917', u'username': u'ankursinha', u'original_spec_nvr': u'', u'test_guidance': u"''", u'branch': u'f30', u'scratch': False, u'ref': u'x86_64'}}, 'topic': u'org.centos.prod.ci.pipeline.allpackages-build.package.ignored'}
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]: Traceback (most recent call last):
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]:   File "/usr/lib/python2.7/site-packages/moksha/hub/api/consumer.py", line 207, in _do_work
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]:     self.consume(message)
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]:   File "/usr/lib/python2.7/site-packages/resultsdb_listener/pipeline_consumer.py", line 39, in consume
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]:     msg['msg']['topic'], msg['msg_id'])
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]: KeyError: 'topic'

Did the format change?

> However, it seems apache can't use both Py2 and Py3 wsgi at the same time, so all the apps that run on the server through wsgi need to be either Py2 or Py3.

I believe we also have execdb on this host, but iirc that's about it, so may be fixable

> Did the format change?

oops, the topic was removed from the message, I think that's the only change that happened... :S

https://github.com/CentOS-PaaS-SIG/ci-pipeline/commit/3b7ce4ae92ebc2cec7ba0b59e0605bde85e66691#diff-f0d5bb4c686ce56b1193a83a81eb0189
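
For illustration only, a consumer could tolerate this change by falling back to the envelope topic when the inner `msg` no longer carries one. The sketch below is not the actual `resultsdb_listener` code; the helper name `get_topic` is hypothetical, and the dictionary layout is assumed from the log output above, where `topic` still appears next to `msg_id` but not inside `msg`:

```python
def get_topic(body):
    """Return the topic for a fedmsg body, preferring the inner payload
    (old format) and falling back to the envelope (new format)."""
    inner = body.get("msg") or {}
    topic = inner.get("topic") or body.get("topic")
    if topic is None:
        raise KeyError("topic")
    return topic


# Trimmed-down version of the message from the log above.
body = {
    "msg_id": "2019-5507dfed-8d99-4472-9a8f-4344421ad109",
    "topic": "org.centos.prod.ci.pipeline.allpackages-build.package.ignored",
    "msg": {"repo": "gdcm", "branch": "f30", "status": "SUCCESS"},
}

print(get_topic(body))
```

With a lookup like this, messages in either format resolve to a topic, and a message with no topic anywhere still raises `KeyError` so the problem is not silently hidden.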

> oops, the topic was removed from the message, I think that's the only change that happened... :S
> https://github.com/CentOS-PaaS-SIG/ci-pipeline/commit/3b7ce4ae92ebc2cec7ba0b59e0605bde85e66691#diff-f0d5bb4c686ce56b1193a83a81eb0189

API breakage :)

> However, it seems apache can't use both Py2 and Py3 wsgi at the same time, so all the apps that run on the server through wsgi need to be either Py2 or Py3.
>
> I believe we also have execdb on this host, but iirc that's about it, so may be fixable

Execdb is Python 3 only from F29, I can switch in F28 too, or build it as both Py2/Py3 there.

So, @pingou , this should work: https://bodhi.fedoraproject.org/updates/execdb-0.1.0-2.fc28

It's dual Py2/Py3, so it shouldn't break anything :)

Can you try it and give it +1 if it's working? Thanks!

> Can you try it and give it +1 if it's working?

Can do but not before next week I think (I'd rather not touch this on a Friday and I won't be able to today) :)

We can roll this update out on Monday. What timeframe in UTC will work for people?

> We can roll this update out on Monday. What timeframe in UTC will work for people?

It'll be the two of us, let's take an hour or so around 13:00 UTC

@frantisekz we've tried to upgrade today but it seems resultsdb is python3 while resultsdb_frontend is python2. Did we miss something?

@pingou Yeah, my bad, resultsdb_frontend with Python 3 support is now pending testing: https://bodhi.fedoraproject.org/updates/resultsdb_frontend-2.1.1-1.fc28

Sorry!

Alright, I think we've got it this time:

# rpm -q resultsdb resultsdb_frontend execdb
resultsdb-2.1.2-1.fc28.noarch
resultsdb_frontend-2.1.1-1.fc28.noarch
execdb-0.1.0-2.fc28.noarch

And curl http://resultsdb.ci.centos.org/resultsdb/results doesn't return a 500 :)

I can confirm this is fixed now.

Closing per comment above

Metadata Update from @mvadkert:
- Issue status updated to: Closed (was: Open)

6 years ago
