A recent Bodhi update for the clang package does not show the results of CI tests: https://bodhi.fedoraproject.org/updates/FEDORA-2019-fcfaa6d2e1
In the "Automated Tests" tab, the org.centos.prod.ci.pipeline.allpackages-build.* results are no longer listed as they were in previous updates. Here is an older update showing the test results: https://bodhi.fedoraproject.org/updates/FEDORA-2019-5065cb8af8
@bgoncalv were there some messaging changes in Fedora? Or is it just proposed?
I'm not aware of any format change. For this build we sent the following topics:
org.centos.prod.ci.pipeline.allpackages-build.package.test.functional.complete is https://apps.fedoraproject.org/datagrepper/id?id=2019-db282702-8263-4e6b-ae9c-5b20c6347b81&size=extra-large
org.centos.prod.ci.pipeline.allpackages-build.complete is https://apps.fedoraproject.org/datagrepper/id?id=2019-1c98bd22-1a49-498b-9628-ebf974fa411f&size=extra-large
If nothing changed, the only thing that could be blocking results from getting into resultsdb is the upstream resultsdb updater.
@pingou hi, could anybody check this please? It seems the resultsdb fedmsg consumer did not pick these messages up ....
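For anyone who wants to look up one of these messages again, the datagrepper links above all follow the same pattern, so a small helper can rebuild them from a message id. This is only a sketch: the function name and the default size are my own choices, and the base URL is simply the one used in the links in this ticket.

```python
from urllib.parse import urlencode

# Base URL copied from the datagrepper links quoted in this ticket.
DATAGREPPER_ID_URL = "https://apps.fedoraproject.org/datagrepper/id"


def datagrepper_url(msg_id, size="extra-large"):
    """Build the datagrepper lookup URL for a single fedmsg message id.

    Hypothetical helper for illustration; it only formats the URL and
    does not contact the service.
    """
    query = urlencode([("id", msg_id), ("size", size)])
    return DATAGREPPER_ID_URL + "?" + query


print(datagrepper_url("2019-1c98bd22-1a49-498b-9628-ebf974fa411f"))
```

Opening the resulting URL shows the full message body, which is a quick way to check whether a given field is still present.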
Ok putting here my findings:
OK, it seems we are lacking some monitoring. Maybe we should duplicate our Prometheus/Alertmanager setup to the Fedora CI OpenShift so we can catch these things early.
Well...
# rpm -ql resultsdb-2.1.2-1.fc28.noarch
/etc/resultsdb
/etc/resultsdb/settings.py
/usr/bin/resultsdb
/usr/lib/python3.6/site-packages/resultsdb
....
Python3, I guess that would explain this :(
Nagios should definitely have caught this.
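As a rough idea of what such a check could look like, here is a minimal Nagios-style probe sketch. It assumes the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL); the function names and the choice to treat 4xx as WARNING are my own assumptions, not an existing check in our infra.

```python
import urllib.error
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def probe(url, timeout=10):
    """Return the HTTP status code for url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, ...
        return None


def classify(status):
    """Map an HTTP status (or None) to an (exit_code, message) pair."""
    if status is None:
        return CRITICAL, "CRITICAL: no response"
    if 200 <= status < 400:
        return OK, "OK: HTTP %d" % status
    if 400 <= status < 500:
        return WARNING, "WARNING: HTTP %d" % status
    return CRITICAL, "CRITICAL: HTTP %d" % status
```

A wrapper would call classify(probe(url)) against the resultsdb API root, print the message and exit with the code. A 500 from a broken wsgi import would then show up as CRITICAL instead of going unnoticed.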
I've downgraded to resultsdb-2.1.0-1.fc28.noarch (lovely change from 2.1.0 to 2.1.2...) and http://resultsdb.ci.centos.org/resultsdb_api/api/v2.0/ is back
Let's see if that helps
I can see that the syncs are passing again.
However, I fear we may have lost the messages sent during the outage :( (it happened yesterday: there was a mass update and reboot of the entire infra, which likely upgraded resultsdb then; yum history shows the upgrade on Feb 26th at 07:09 PM UTC).
Here is the uptime info:
[root@ci-cc-rdu01 ~][PROD]# uptime
 09:02:41 up 1 day, 12:04, 1 user, load average: 0.03, 0.16, 0.21
[root@ci-cc-rdu01 ~][PROD]# date
Thu Feb 28 09:03:05 UTC 2019
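For reference, the boot time implied by that uptime output can be worked out and compared with the upgrade time from yum history. This is just the arithmetic on the numbers quoted above; uptime only reports minutes, so the result is approximate.

```python
from datetime import datetime, timedelta

# Values taken from the uptime/date output and the yum history note above.
observed = datetime(2019, 2, 28, 9, 2, 41)        # timestamp printed by uptime (UTC)
uptime = timedelta(days=1, hours=12, minutes=4)   # "up 1 day, 12:04"
upgrade = datetime(2019, 2, 26, 19, 9)            # resultsdb upgrade, 07:09 PM UTC

boot = observed - uptime
gap = boot - upgrade

print(boot)  # 2019-02-26 20:58:41
print(gap)   # 1:49:41
```

So the host came back up a bit under two hours after the package upgrade, which fits the upgrade being part of the mass update that ended with the reboot.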
resultsdb is not expected to be used as a library (there is resultsdb_api to be imported). But it was clearly my mistake to change it from Python 2 to Python 3 only in a Fedora stable release.
So, @pingou, I can change the resultsdb package in fc28 to be dual py2/py3, with /usr/bin/resultsdb being Python 3. Is it going to help you? Or do you prefer something else?
EDIT: Or, you might be running resultsdb as wsgi, which means you would need different .so for httpd to run it as Python 3.
@frantisekz It's the wsgi that is being used, this is the exact error:
File "/usr/share/resultsdb/resultsdb.wsgi", line 15, in <module>
    from resultsdb import app as application
ImportError: No module named resultsdb
So it's not imported as a library but still failed :)
For now the service is back up, I'll follow whichever solution you recommend.
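A quick way to confirm this kind of mismatch is to ask each interpreter whether it can import the package at all. The snippet below is a small diagnostic sketch (the function names are made up for illustration); it sticks to calls that exist in both Python 2.7 and Python 3 so it can be run under whichever interpreter the wsgi module is built for.

```python
import importlib
import sys


def can_import(name):
    """Return True if the running interpreter can import the named module."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def describe(name):
    """One-line report naming the interpreter version and the import result."""
    state = "importable" if can_import(name) else "NOT importable"
    return "Python %d.%d: %s is %s" % (
        sys.version_info[0], sys.version_info[1], name, state)


print(describe("resultsdb"))
```

With the package installed only under /usr/lib/python3.6/site-packages, I would expect the Python 2 run to report NOT importable, matching the ImportError above. Note that this really imports the module, so it is meant for manual debugging only.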
What we did, when switching Taskotron from Py2 to Py3:
However, it seems apache can't use both Py2 and Py3 wsgi at the same time, so all the apps that run on the server through wsgi need to be either Py2 or Py3.
Is this going to work for you?
Hm, there seems to be at least another issue with the messages sent by the CI pipeline:
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]: [2019-02-28 10:10:31][fedmsg.consumers ERROR] {'body': {u'certificate': u'LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUVPakNDQTZPZ0F3SUJBZ0lDQW5Fd0RRWUpL\nb1pJaHZjTkFRRUZCUUF3Z2FBeEN6QUpCZ05WQkFZVEFsVlQKTVFzd0NRWURWUVFJRXdKT1F6RVFN\nQTRHQTFVRUJ4TUhVbUZzWldsbmFERVhNQlVHQTFVRUNoTU9SbVZrYjNKaApJRkJ5YjJwbFkzUXhE\nekFOQmdOVkJBc1RCbVpsWkcxelp6RVBNQTBHQTFVRUF4TUdabVZrYlhObk1ROHdEUVlEClZRUXBF\nd1ptWldSdGMyY3hKakFrQmdrcWhraUc5dzBCQ1FFV0YyRmtiV2x1UUdabFpHOXlZWEJ5YjJwbFkz\nUXUKYjNKbk1CNFhEVEUzTURVeE1ERTBNamMwT0ZvWERUSTNNRFV3T0RFME1qYzBPRm93Z2NneEN6\nQUpCZ05WQkFZVApBbFZUTVFzd0NRWURWUVFJRXdKT1F6RVFNQTRHQTFVRUJ4TUhVbUZzWldsbmFE\nRVhNQlVHQTFVRUNoTU9SbVZrCmIzSmhJRkJ5YjJwbFkzUXhEekFOQmdOVkJBc1RCbVpsWkcxelp6\nRWpNQ0VHQTFVRUF4TWFabVZrYlhObkxYSmwKYkdGNUxtTnBMbU5sYm5SdmN5NXZjbWN4SXpBaEJn\nTlZCQ2tUR21abFpHMXpaeTF5Wld4aGVTNWphUzVqWlc1MApiM011YjNKbk1TWXdKQVlKS29aSWh2\nY05BUWtCRmhkaFpHMXBia0JtWldSdmNtRndjbTlxWldOMExtOXlaekNCCm56QU5CZ2txaGtpRzl3\nMEJBUUVGQUFPQmpRQXdnWWtDZ1lFQTJzYUJuSjNyTlhYQXV3Skt2UkJyQnJTYUdMWHgKYXg4VGhu\nZ0wxV2hCYS8wSFZVdVAxWEhWUEVweUh6YXZZK0dsRzFVclVUMkFMQzFuRk5nVUNpSjhWWWVoZElw\nWApzQzNiOHFnUmltekt0aHUxM2hqQ01kSTYzV3h1S3FBQk5UQTRkZWtBK1c2cE9EdVdIMEI1b0tq\nVjFmWkZRN2xFCjUzZlQybElBZWg4ZndZY0NBd0VBQWFPQ0FWY3dnZ0ZUTUFrR0ExVWRFd1FDTUFB\nd0xRWUpZSVpJQVliNFFnRU4KQkNBV0hrVmhjM2t0VWxOQklFZGxibVZ5WVhSbFpDQkRaWEowYVda\ncFkyRjBaVEFkQmdOVkhRNEVGZ1FVUytnVApwNmg2ZXpJZW5RK0lLUERnWmZWZHQ5a3dnZFVHQTFV\nZEl3U0J6VENCeW9BVWEwQmErUklJaVZubldlVUY5UUlkCkNrNS9GQUNoZ2Fha2dhTXdnYUF4Q3pB\nSkJnTlZCQVlUQWxWVE1Rc3dDUVlEVlFRSUV3Sk9RekVRTUE0R0ExVUUKQnhNSFVtRnNaV2xuYURF\nWE1CVUdBMVVFQ2hNT1JtVmtiM0poSUZCeWIycGxZM1F4RHpBTkJnTlZCQXNUQm1abApaRzF6WnpF\nUE1BMEdBMVVFQXhNR1ptVmtiWE5uTVE4d0RRWURWUVFwRXdabVpXUnRjMmN4SmpBa0Jna3Foa2lH\nCjl3MEJDUUVXRjJGa2JXbHVRR1psWkc5eVlYQnliMnBsWTNRdWIzSm5nZ2tBNDFBZVIwOFhIa1V3\nRXdZRFZSMGwKQkF3d0NnWUlLd1lCQlFVSEF3SXdDd1lEVlIwUEJBUURBZ2VBTUEwR0NTcUdTSWIz\nRFFFQkJRVUFBNEdCQUF5cApCUk43VXFaUU1vcUw3UkFnS09hMzFSVTh3R3lWaEJhd1N
vZm1Qd1dT\nMUdEbVA1OU9FbElaRldrVisrTi92VXBSCmFjalFyTStoUEVEYXRaUVU5cEtiV3FmVy92WVVyaGpE\nYTNYV3dxeW1kT2hjWTFhWUR3aVE5NGlWekNGUkdFM2kKMXNkN2tuc2VjL2x4Z2NldmhYS2ZleTNK\nN241cXludFBYVGpVMjdGMQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==\n', u'timestamp': 0, u'msg_id': u'2019-5507dfed-8d99-4472-9a8f-4344421ad109', u'crypto': u'x509', u'topic': u'org.centos.prod.ci.pipeline.allpackages-build.package.ignored', u'signature': u'cEh/Qb6vrizZIX3xRH00RXR53L/7IA00lJYOap5lNngisr2/a/bbI+No/L6Gi1+rBfGFZdzevJmh\nrR14iAPWKZSrqvgZSOG28MwPNZQFKsIRyRFxA0Aqzf7yocKzjTgDpNg8SBK9KIhPc5fUkS/YJsIv\n5q8Eh629SVqMIlTxXcY=\n', u'msg': {u'build_id': u'235117', u'status': u'SUCCESS', u'comment_id': None, u'nvr': u'', u'repo': u'gdcm', u'namespace': None, u'build_url': u'https://jenkins-continuous-infra.apps.ci.centos.org/blue/organizations/jenkins/fedora-build-pipeline-trigger/detail/fedora-build-pipeline-trigger/235117/pipeline/', u'rev': u'kojitask-33100917', u'username': u'ankursinha', u'original_spec_nvr': u'', u'test_guidance': u"''", u'branch': u'f30', u'scratch': False, u'ref': u'x86_64'}}, 'topic': u'org.centos.prod.ci.pipeline.allpackages-build.package.ignored'}
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]: Traceback (most recent call last):
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]: File "/usr/lib/python2.7/site-packages/moksha/hub/api/consumer.py", line 207, in _do_work
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]: self.consume(message)
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]: File "/usr/lib/python2.7/site-packages/resultsdb_listener/pipeline_consumer.py", line 39, in consume
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]: msg['msg']['topic'], msg['msg_id'])
Feb 28 10:10:31 ci-cc-rdu01.fedoraproject.org fedmsg-hub[14545]: KeyError: 'topic'
Did the format change?
oops, the topic was removed from the message, I think that's the only change that happened... :S
https://github.com/CentOS-PaaS-SIG/ci-pipeline/commit/3b7ce4ae92ebc2cec7ba0b59e0605bde85e66691#diff-f0d5bb4c686ce56b1193a83a81eb0189
API breakage :)
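One way a consumer could tolerate this kind of change is to fall back to the topic on the message envelope, which is still present in the logged message, when the inner payload no longer carries one. The sketch below only illustrates the idea; it is not the actual resultsdb_listener code, and the function names are hypothetical.

```python
def extract_topic(message):
    """Return the topic of a fedmsg-style message dict, or None.

    Prefers the topic inside the 'msg' payload (the older layout) and
    falls back to the envelope-level 'topic' key.
    """
    payload = message.get("msg") or {}
    return payload.get("topic") or message.get("topic")


def log_line(message):
    """Build a log string without raising KeyError on missing fields."""
    topic = extract_topic(message)
    if topic is None:
        return None
    return "%s (%s)" % (topic, message.get("msg_id", "unknown id"))


# Shape of the message from the log above, trimmed to the relevant keys.
example = {
    "topic": "org.centos.prod.ci.pipeline.allpackages-build.package.ignored",
    "msg_id": "2019-5507dfed-8d99-4472-9a8f-4344421ad109",
    "msg": {"repo": "gdcm", "branch": "f30", "status": "SUCCESS"},
}
print(log_line(example))
```

Using .get() with a fallback means a missing field degrades to a skipped or partially logged message rather than an exception in the consumer.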
However, it seems apache can't use both Py2 and Py3 wsgi at the same time, so all the apps that run on the server through wsgi need to be either Py2 or Py3. I believe we also have execdb on this host, but iirc that's about it, so may be fixable
Execdb is Python 3 only from F29, I can switch in F28 too, or build it as both Py2/Py3 there.
So, @pingou, this should work: https://bodhi.fedoraproject.org/updates/execdb-0.1.0-2.fc28
It's dual Py2/Py3, so it shouldn't break anything :)
Can you try it and give it +1 if it's working? Thanks!
Can you try it and give it +1 if it's working?
Can do but not before next week I think (I'd rather not touch this on a Friday and I won't be able to today) :)
We can roll this update out on Monday. What timeframe in UTC will work for people?
It'll be the two of us, let's take an hour or so around 13:00 UTC
@frantisekz we've tried to upgrade today but it seems resultsdb is python3 while resultsdb_frontend is python2. Did we miss something?
@pingou Yeah, my bad, resultsdb_frontend with Python 3 support is now pending testing: https://bodhi.fedoraproject.org/updates/resultsdb_frontend-2.1.1-1.fc28
Sorry!
Alright, I think we've got it this time:
# rpm -q resultsdb resultsdb_frontend execdb
resultsdb-2.1.2-1.fc28.noarch
resultsdb_frontend-2.1.1-1.fc28.noarch
execdb-0.1.0-2.fc28.noarch
And curl http://resultsdb.ci.centos.org/resultsdb/results doesn't return a 500 :)
Should we close?
I can confirm this is fixed now.
Closing per comment above
Metadata Update from @mvadkert: - Issue status updated to: Closed (was: Open)