#875 hub plugins appear to break hub on code updates
Closed: Fixed 5 years ago by mikem. Opened 6 years ago by mikem.

In the past, we were able to apply hub changes with little or no downtime.

  • upgrade koji packages
  • restart httpd

Of late, we have been seeing errors pop up as soon as the packages update. They look like this:

2018-04-04 15:00:49,166 [ERROR] m=None u=None p=97080 r=?:? koji.plugins: Traceback (most recent call last):
   File "/usr/share/koji-hub/kojixmlrpc.py", line 496, in load_plugins
     tracker.load(name)
   File "/usr/lib/python2.6/site-packages/koji/plugin.py", line 75, in load
     raise koji.PluginError('module name conflict: %s' % mod_name)
 PluginError: module name conflict: _koji_plugin__runroot_hub

With similar errors for other configured hub plugins.

The problem goes away after an httpd restart. We only load plugins on the first call for each http process, but I guess maybe something is persisting in the module namespace somehow? Or our firstcall check is flawed?


Also, these errors tend to be preceded by some messy logger error noise, which is probably a separate bug:

Traceback (most recent call last):
  File "/usr/lib64/python2.6/logging/__init__.py", line 784, in emit
    msg = self.format(record)
  File "/usr/lib64/python2.6/logging/__init__.py", line 662, in format
    return fmt.format(record)
  File "/usr/share/koji-hub/kojixmlrpc.py", line 586, in format
    return logging.Formatter.format(self, record)
AttributeError: 'NoneType' object has no attribute 'Formatter'

I've just hit the same traceback ("PluginError: module name conflict: _koji_plugin__runroot_hub") on a fresh hub install with koji-hub-1.15.1-1.fc28.noarch. No packages were updated, hub and plugin config files were in place before httpd was started.

This also happens to me when I simultaneously start >10 kojid deamons and they connect to the hub at once. This puts the hub in ServerOffline state with "configuration error" message. Restarting httpd fixes it.

There may be more than one factor that can cause this.

Currently, I am able to reproduce this reliably on Fedora 28 with the following steps:

  • restart httpd
  • verify hub is working with a few test calls (e.g. koji --noauth call echo ok)
  • touch /usr/share/koji-hub/kojixmlrpc.py
  • run a few more test calls. Reliably hits within three

In my case, the issue seems to be a combination of:

  • mod_wsgi automatic script reloading
  • httpd using threads

I can't seem to replicate this on rhel6, even though mod_wsgi is also auto-reloading there. It must either be the lack of threads (rhel6 httpd is prefork), or a different reload behavior. Otoh, trying this on rhel6 does produce a bunch of errors in the logging module (AttributeError: 'NoneType' object has no attribute 'Formatter').

@mikem If we switched Koji to use gunicorn or uwsgi instead of mod_wsgi, would the problem go away?

I'll file a PR shortly with changes that fix this for me.

You may also be able to simply set WSGIScriptReloading Off for koji-hub in the httpd config.

Metadata Update from @tkopecek:
- Issue set to the milestone: 1.17

5 years ago

Login to comment on this ticket.

Metadata