#8087 Update bugzilla2fedmsg in production
Closed: Fixed 17 days ago by kevin. Opened 2 years ago by adamwill.

production bugzilla2fedmsg is, AFAICT, still 0.3.0 or 0.3.1. This means it's still Python 2, still publishing to fedmsg rather than fedora-messaging, and has a bunch of bugs in how it actually produces messages, which are fixed in 1.0.0 (which we released yesterday). We should update to 1.0.0 in production so we are using Python 3 to publish awesome messages to fedora-messaging.

The Fedora package was retired, so we should revive it, I guess. The new version depends on stompest, which is not yet packaged, so that needs packaging first (I see @abompard did a build of it in an infra tag, but it really should be packaged properly).

@abompard @jcline

I can try and get to this later this week, or someone else can sooner. :)

Metadata Update from @kevin:
- Issue priority set to: Waiting on Assignee (was: Needs Review)

2 years ago

I did a quick run over the bugzilla2fedmsg spec file, but then ran into the missing python-stompest package. I didn't want to duplicate whatever work @abompard's done on that, so I stopped.

Yeah, I'm working on that. I've been trying to build it in the infra repo for a couple of days but I'm hitting a buildroot error.

DEBUG util.py:585:  BUILDSTDERR: Error: 
DEBUG util.py:585:  BUILDSTDERR:  Problem: conflicting requests
DEBUG util.py:585:  BUILDSTDERR:   - nothing provides python-rpm-macros > 3-30 needed by python-devel-2.7.5-86.el7.x86_64
DEBUG util.py:585:  BUILDSTDERR:   - nothing provides python2-rpm-macros > 3-30 needed by python-devel-2.7.5-86.el7.x86_64

Do you know where that comes from?

This should be all fixed now. Please rebuild.

OK, here's an update: the newer version of bugzilla2fedmsg has been ported to Python 3 and Fedora Messaging. However, the fedora-messaging package is not built for Python 3 in EPEL7, where bugzilla2fedmsg currently runs, because of an outdated dependency. As a result I can't update bugzilla2fedmsg on EPEL7.

I think it would be beneficial to move the service to a Fedora host, ideally running in OpenShift. What do you think, folks?
Can you create the project for me in OpenShift? I won't be able to work on it before next Wednesday though.

Metadata Update from @abompard:
- Issue assigned to abompard

2 years ago

+1 for running it on Fedora (or, I guess, EPEL 8? :>)

Is the outdated dependency in EPEL or RHEL? If it's in EPEL I can maybe do something with provenpackager powers...

Moving this to openshift sounds great to me... you should be able to make a playbook and it will create the project, etc.

I can add whatever you want to call the playbook to rbac-playbook when you are ready

Metadata Update from @kevin:
- Issue tagged with: backlog

2 years ago

OK, I have created playbooks/openshift-apps/bugzilla2fedmsg.yml. Kevin, could you allow me to run rbac-playbook on it so I can test it?

rbac-playbook says user abompard is not authorized to run openshift-apps/bugzilla2fedmsg.yml. Should I do something @kevin ?

Sorry about that, there was a typo in the rbac config. ;(

Fixed now, please retry...

Oh, I thought I pinged you on IRC @adamwill, but I may have forgotten. The package had been orphaned for a while, so the next step now is to re-review it.
If you want to help with that, it'd be appreciated. Thanks.

I put some comments on the bug six days ago :)

bugzilla2fedmsg and python-stompest passed their reviews and got approved.

https://bodhi.fedoraproject.org/updates/FEDORA-2019-da4b23fbea is pending for F31, once it goes stable I guess we can go ahead and try to deploy this.

That update is stable now, so nothing should be stopping someone from doing a deployment of the new version now!

Alright, I tried updating staging in our OpenShift instance; the code runs but I'm getting these messages:

Could not connect to messaging-devops-broker01.dist.stage.ext.phx2.redhat.com:61612 [Could not establish connection [[Errno 111] Connection refused]]

Is it the correct hostname / port? Do you know of anything that would be preventing a connection to the STOMP broker from our OpenShift?
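
For what it's worth, a plain TCP probe can separate "connection refused" (routing works, but nothing is listening on that port) from a firewall silently dropping packets (which shows up as a timeout). A quick stdlib-only sketch; the host and port in the comment are just the ones from the error above:

```python
import socket

def check_broker(host: str, port: int, timeout: float = 5.0) -> str:
    """Probe a TCP port and classify the usual failure modes:

    'ok'      -> the port is reachable,
    'refused' -> the host answered but the port is closed (service down),
    'timeout' -> packets are likely being dropped (firewall).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        return "refused"
    except (socket.timeout, TimeoutError):
        return "timeout"
    except OSError as exc:  # DNS failures, unreachable networks, etc.
        return "error: %s" % exc

# e.g. check_broker("messaging-devops-broker01.dist.stage.ext.phx2.redhat.com", 61612)
```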

Might be some firewall rule allowing only the specific existing IPs? But that seems odd.

Hum... the existing bugzilla2fedmsg instance also cannot connect. I bet something is wrong on the internal staging side? We can file a ticket...

Sure, I can file a ticket, but I don't know where. A pointer please? Thanks.

Filed the ticket and copied you on it so you can see where it should be filed and such moving forward.

Just an update. Internal ticket has been bouncing around without too much luck of late. Hopefully it's now around the right area to get sorted...

The service owner said it looks fine to him... then a networking person confirmed it's not reachable from any of the places we would reach it.

And it's stalled there again. ;( I'll try and see if I can get someone to take some action...

So, after 3.5 months and being bounced around several times... it now seems to be working from the old vm.

However, the openshift one does not seem to be working. ;(

Asking now if there's a specific IP they are allowing or what.

For some applications in OpenShift we have to define the bugzilla IP in the deployment config, for example https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/roles/openshift-apps/the-new-hotness/files/deploymentconfig.yml#n26

Might be worth a try?

no, it's not us going to the wrong IP on their end; it's that their end is firewalled and only allows specific IPs to talk to it.

At this point I am thinking we should just wait and fix it in iad2, because we will have to get it working after we move anyhow...

but if it's more important than that, please shout out now and I can try and get them to add the openshift outgoing IPs...

The main thing blocked on this for us (QA) is enhancing the blockerbugs app to work off messages instead of just refreshing everything every half hour. We don't really want to write that against the old, busted version of the messages; we just want to work with the fixed messages. (We may actually need some of the fixes to do the job properly, I don't recall.)

We will also get much better fedmsg2meta processing once the updated version is deployed.
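
The polling-to-messages switch described above boils down to applying per-bug deltas as they arrive instead of rebuilding all state on a timer. A rough stdlib-only sketch; the cache class and the message body shape here are hypothetical, not blockerbugs' actual code:

```python
class BlockerBugCache:
    """Hypothetical cache of bug state, updatable two ways."""

    def __init__(self):
        self.bugs = {}

    def full_refresh(self, all_bugs):
        # Old approach: re-fetch and replace everything every half hour.
        self.bugs = {bug["id"]: bug for bug in all_bugs}

    def on_bug_message(self, body):
        # New approach: apply only the change that one bus message
        # describes. Assumes the (fixed) message body carries the bug's
        # id and the fields that changed.
        bug = body["bug"]
        self.bugs.setdefault(bug["id"], {}).update(bug)
```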

yeah, ok.

I updated the internal ticket asking them to add the external IPs for the nodes in the staging cluster for now, and asking how to get things working in iad2. If we are lucky they can fix the stg ones soon...

Metadata Update from @cverna:
- Assignee reset

a year ago

Staging version doesn't currently work; it fails with this error:

Dec 10 22:13:25 bugzilla2fedmsg01.stg.iad2.fedoraproject.org moksha-hub[26087]: KeyError: 'topic_prefix_re'

going to need dev to look at it so we can make it an ops problem again
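
For context, the fedmsg config is a plain Python dict assembled from the files in /etc/fedmsg.d/, and the hub apparently indexes it directly, which is why a missing key surfaces as the traceback above. A minimal reproduction; the neighbouring key and the prefix value are made up:

```python
# Each /etc/fedmsg.d/*.py file defines part of a `config` dict that
# moksha-hub merges and then indexes directly.
config = {
    "zmq_enabled": False,  # hypothetical neighbouring key
    # "topic_prefix_re" absent -> KeyError, as in the log above
}

def get_prefix(cfg):
    return cfg["topic_prefix_re"]  # raises KeyError if the key is missing

try:
    get_prefix(config)
except KeyError as exc:
    print("KeyError:", exc)  # prints: KeyError: 'topic_prefix_re'

config["topic_prefix_re"] = r"^org\.fedoraproject\."  # hypothetical value
assert get_prefix(config).startswith("^org")
```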

Metadata Update from @smooge:
- Issue tagged with: dev, low-gain, medium-trouble

6 months ago

I've "fixed" the traceback by adding:

    # Prefix for the topic of each message sent.

to /etc/fedmsg.d/base.py

but I see now:

[  moksha.hub] ERROR 2020-12-15 16:25:25,600 Connection failed. Reason: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.TimeoutError'>: User timeout caused connection failure.
[  moksha.hub] INFO 2020-12-15 16:25:25,601 (failover) reconnecting in 0.400000 seconds.
[  moksha.hub] INFO 2020-12-15 16:25:26,001 connecting encrypted to '.....' <moksha.hub.stomp.stomp.StompHubExtension object at 0x7ff354419b50>
[  moksha.hub] WARNING 2020-12-15 16:25:26,001 Connecting without SNI support due to absence of service_identity module.

which makes me think it fails to connect to the internal broker because it is missing something in its config?

So the last line is likely not related, leaving the first line about a timeout. Could it be that the stomp_uri needs to be changed? (I see it still points to something in phx2)

Looking at https://apps.fedoraproject.org/datagrepper/raw?rows_per_page=1&delta=127800&category=bugzilla it seems that we receive bugzilla messages on the bus now.

Does that mean this issue is fixed?
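
The datagrepper check above can be scripted. A small stdlib-only sketch that builds the same query and inspects the JSON answer; the response handling assumes datagrepper's `total` field, so treat that as an assumption:

```python
import json
from urllib.parse import urlencode

DATAGREPPER = "https://apps.fedoraproject.org/datagrepper/raw"

def bugzilla_query(delta_seconds=127800, rows=1):
    """Build the same datagrepper query as the link above."""
    params = {
        "rows_per_page": rows,
        "delta": delta_seconds,   # look back this many seconds
        "category": "bugzilla",
    }
    return DATAGREPPER + "?" + urlencode(params)

def has_messages(raw_json: str) -> bool:
    """True if the response reports at least one matching message
    (assumes the 'total' field in datagrepper's JSON responses)."""
    return json.loads(raw_json).get("total", 0) > 0
```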

Metadata Update from @pingou:
- Issue priority set to: Next Meeting (was: Waiting on Assignee)

3 months ago

no, the issue was that staging wasn't working, so we couldn't test the openshift / newer / python3 version. The prod version still runs in a vm and is python2. ;(

Ok, so I've re-deployed this in our staging openshift and the current issue is:

[stompest.sync.client WARNING] Could not connect to messaging-......redhat.com:61612 [Could not establish connection [[Errno 110] Connection timed out]]

I could ping that hostname fine.

Could it be a network issue?

Yes, it could be a networking/firewalling issue... prod works, but stg hasn't been.
So, I guess we need an internal ticket or a few minutes of matt's time to check it...

Staging now connects to the STOMP server! Nice! Thanks to everybody involved.

However it seems to be getting no messages, while prod does get messages. So there's probably still a missing piece. Maybe we should check with Red Hat that their prod messages are relayed on the stage STOMP instance?

I have sent in a ticket for bugzilla folks to look at the staging STOMP endpoint. Will let you know what I find out.

bugzilla admins say:

"Hi, the current stage isn't configured to use the UMB. We have new hosting coming and it will be set-up there when we deploy the new stage."

So, I guess lets just roll on to prod now?

I can totally deploy to prod, but we'll have to choose between two evils: having duplicate messages or shutting down the current bugzilla2fedmsg and hoping the new one will work. Any preferences? I'm thinking duplicate messages should not be too much of a problem, but I'm not sure.

As long as the overlap is short I think either is fine.
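
If the duplicate-message window ever mattered, consumers could dedupe on a content-derived key, since the old and new daemons publish distinct messages (with distinct message ids) about the same bugzilla event. A bounded-memory sketch; the choice of key is an assumption:

```python
from collections import OrderedDict

class RecentKeyFilter:
    """Drops events whose key was seen recently; memory is bounded."""

    def __init__(self, max_keys=10_000):
        self._seen = OrderedDict()
        self._max_keys = max_keys

    def is_new(self, key) -> bool:
        if key in self._seen:
            return False
        self._seen[key] = True
        if len(self._seen) > self._max_keys:
            self._seen.popitem(last=False)  # evict the oldest key
        return True

# Hypothetical key: the bug id plus bugzilla's last-change timestamp,
# which both the old and new daemons should agree on for one event.
def event_key(body):
    return (body["bug"]["id"], body["bug"]["last_change_time"])
```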

The updated bugzilla2fedmsg is now running in prod and apparently working. I've turned off the listener in the VM, but let's keep it around for a while in case something's wrong with the new version.

If you're using those messages, please check that what you're now getting is what you expect.

For the most part this has been working, so I'd say let's remove/retire the VMs next week?

I've removed the VMs and their config now.

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

17 days ago
