Issue #1103: CentOS CI's OCP keeps reseting Jenkins admin e-mail address - centos-infra

Closed: Fixed with Explanation a year ago by mrc0mmand. Opened a year ago by mrc0mmand.

Hey,

similarly to #1091, after yesterday's OCP upgrade the e-mail notifications from our Jenkins instance stopped working once again due to the same issue - the admin e-mail address got reset to the default (unset) value. Given that this setting resides in a separate file together with Jenkins location [0] I wonder if there's some hidden config map or so that keeps resetting this file to some pre-configured default - like it's done with the cico-workspace pod template. I went through everything I have access to in the OCP console, but couldn't find anything - maybe it's something I don't have access to?

[0]

$ cat /var/lib/jenkins/jenkins.model.JenkinsLocationConfiguration.xml
<?xml version='1.1' encoding='UTF-8'?>
<jenkins.model.JenkinsLocationConfiguration>
  <adminAddress>builder@jenkins-systemd.apps.ocp.cloud.ci.centos.org</adminAddress>
  <jenkinsUrl>https://jenkins-systemd.apps.ocp.cloud.ci.centos.org/</jenkinsUrl>
</jenkins.model.JenkinsLocationConfiguration>

Metadata Update from @arrfab:
- Issue assigned to dkirwan
- Issue tagged with: centos-ci-infra, medium-gain, medium-trouble

a year ago

dkirwan commented a year ago

Hi @mrc0mmand, I've been looking at this and can't figure it out either.

I saw this issue: https://issues.jenkins.io/browse/JENKINS-55088?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel

They mention a "configuration as code" plugin, which would prevent any changes being made via the GUI. Is that where you originally set the system admin email address? Are you using this configuration as code plugin to configure your Jenkins instance? Is the email address being set there?

mrc0mmand commented a year ago

Nope, I've never used the "configuration as code", always set it manually in the GUI, and the respective config file seems to agree:

$ cat io.jenkins.plugins.casc.CasCGlobalConfig.xml
<?xml version='1.1' encoding='UTF-8'?>
<io.jenkins.plugins.casc.CasCGlobalConfig plugin="configuration-as-code@1512.vb_79d418d5fc8">
  <configurationPath></configurationPath>
</io.jenkins.plugins.casc.CasCGlobalConfig>

However, during the past couple of hours four different deployments [0] were triggered (no idea why, no maintenance was announced). I changed the email address back after each deployment, and it seems to have survived the last one. Not sure if that's sheer luck or if it just works now™.

[0]

jenkins-3-deploy       0/1     Completed   0          160m
jenkins-4-deploy       0/1     Completed   0          154m
jenkins-5-deploy       0/1     Completed   0          135m
jenkins-6-deploy       0/1     Completed   0          127m
jenkins-6-r7mmk        1/1     Running     0          127m

dkirwan commented a year ago

The deployments, that's me: I was redeploying and examining the configuration inside the PV. I finally restored the email via the GUI before contacting you.

I've only one idea so far, can you try removing that configuration as code plugin if you are not making use of it? And retesting? From what I was reading that plugin might prevent this configuration being persisted correctly.

mrc0mmand commented a year ago

> The deployments, that's me: I was redeploying and examining the configuration inside the PV. I finally restored the email via the GUI before contacting you.

Ah, I see, I was worried something is going really wrong with the instance.

> I've only one idea so far, can you try removing that configuration as code plugin if you are not making use of it? And retesting? From what I was reading that plugin might prevent this configuration being persisted correctly.

I disabled and uninstalled the plugin completely. After a Jenkins restart the email address persists, but after doing a rollout in the OCP console it resets back to the default.

mrc0mmand commented a year ago

Could you spawn a clean new project in the OCP cluster (either dev or prod) to see if it happens there as well? I wonder if it's the default behavior, or there's something wrong with this particular instance.

dkirwan commented a year ago

Interesting, OK, so this plugin is in the image by default, I guess ;/ I don't expect a fresh instance would resolve it; the plugin will just get reinstalled with the new deployment.

I think we need to do some research regarding this plugin. It's better to work with it than to fight it. We need to figure out how to make this admin email change via the plugin so that it handles the persistence of the config.

[1] https://plugins.jenkins.io/configuration-as-code/

I see that [1] mentions a jenkins.yaml in $JENKINS_HOME/jenkins.yaml. I'd say we need to make a change there regarding the admin email if we want it to actually persist via this plugin.

dkirwan commented a year ago

https://github.com/jenkinsci/configuration-as-code-plugin/blob/a6983ff60e0cf198ce02d7992bcba927197174db/test-harness/src/test/resources/io/jenkins/plugins/casc/GetConfiguratorsTest.yml#L28-L30

Example setting the location adminAddress
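For reference, a minimal sketch of what such a jenkins.yaml could look like, written from a shell helper. The `unclassified` / `location` keys follow the plugin's documented schema; `write_casc_location` is a hypothetical helper name, and whether the OpenShift image applies this file after regenerating its own XML is an assumption that needs checking on the actual instance.

```shell
#!/bin/sh
# Sketch only: write a minimal JCasC file that pins the Jenkins location settings.
# write_casc_location is a hypothetical helper, not part of Jenkins or the plugin.
write_casc_location() {
    jenkins_home=$1
    admin_address=$2
    jenkins_url=$3
    # The heredoc body is the YAML that the configuration-as-code plugin reads.
    cat > "${jenkins_home}/jenkins.yaml" <<EOF
unclassified:
  location:
    adminAddress: "${admin_address}"
    url: "${jenkins_url}"
EOF
}

# Example call (inside the Jenkins pod, values taken from this issue):
# write_casc_location /var/lib/jenkins \
#     'builder@jenkins-systemd.apps.ocp.cloud.ci.centos.org' \
#     'https://jenkins-systemd.apps.ocp.cloud.ci.centos.org/'
```

Since /var/lib/jenkins sits on the PV, the file itself should survive a rollout; the open question is only whether the plugin applies it after the image's startup scripts run.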

mrc0mmand commented a year ago

> Interesting, OK, so this plugin is in the image by default, I guess ;/ I don't expect a fresh instance would resolve it; the plugin will just get reinstalled with the new deployment.
>
> I think we need to do some research regarding this plugin. It's better to work with it than to fight it. We need to figure out how to make this admin email change via the plugin so that it handles the persistence of the config.
>
> [1] https://plugins.jenkins.io/configuration-as-code/
>
> I see that [1] mentions a jenkins.yaml in $JENKINS_HOME/jenkins.yaml. I'd say we need to make a change there regarding the admin email if we want it to actually persist via this plugin.

I'm a bit confused - why would the plugin be at fault here, when the issue is present even if it's completely uninstalled?

(Apologies for the late response, a lot of stuff has piled on)

dkirwan commented a year ago

Ah I might have misunderstood your message:

> I disabled and uninstalled the plugin completely. After Jenkins restart the email address persists, but after doing a rollout in the OCP console it resets back to default.

Maybe try reinstalling that plugin, and then try configuring it to set the admin address using a jenkins.yaml in /var/lib/jenkins. See the example here: https://github.com/jenkinsci/configuration-as-code-plugin/blob/a6983ff60e0cf198ce02d7992bcba927197174db/test-harness/src/test/resources/io/jenkins/plugins/casc/GetConfiguratorsTest.yml#L28-L30

Alternatively, it seems it's possible to configure it on Jenkins init using a Groovy script like:

import jenkins.model.JenkinsLocationConfiguration
// Fetch the global location settings, then set the URL and the admin e-mail.
def jlc = JenkinsLocationConfiguration.get()
jlc.setUrl('https://jenkins-systemd.apps.ocp.cloud.ci.centos.org/')
jlc.setAdminAddress('builder@jenkins-systemd.apps.ocp.cloud.ci.centos.org')
jlc.save()  // write the values to disk explicitly

For API docs: https://javadoc.jenkins.io/jenkins/model/JenkinsLocationConfiguration.html
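A rough sketch of how such a script could be dropped into place from a shell. Jenkins itself documents `$JENKINS_HOME/init.groovy.d/` as a directory for Groovy hook scripts that run at init; whether the OpenShift image leaves that directory alone is an assumption to verify. `install_location_hook` and the file name `set-location.groovy` are made up for illustration.

```shell
#!/bin/sh
# Sketch only: install a Groovy init hook that re-applies the location settings
# on every Jenkins start. The helper and file names are hypothetical.
install_location_hook() {
    jenkins_home=$1
    admin_address=$2
    jenkins_url=$3
    mkdir -p "${jenkins_home}/init.groovy.d"
    # Unquoted heredoc: the shell substitutes the two values into the script.
    cat > "${jenkins_home}/init.groovy.d/set-location.groovy" <<EOF
import jenkins.model.JenkinsLocationConfiguration

def jlc = JenkinsLocationConfiguration.get()
jlc.setUrl('${jenkins_url}')
jlc.setAdminAddress('${admin_address}')
jlc.save()
EOF
}

# Example call (inside the Jenkins pod, values taken from this issue):
# install_location_hook /var/lib/jenkins \
#     'builder@jenkins-systemd.apps.ocp.cloud.ci.centos.org' \
#     'https://jenkins-systemd.apps.ocp.cloud.ci.centos.org/'
```

The values are pasted into single-quoted Groovy strings, so this only works for addresses and URLs that contain no single quotes.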

mrc0mmand commented a year ago

I guess that would work, but it feels more like a workaround than a proper solution. I'm curious why this happens now, in the new cluster, as it worked fine in the old one.

I looked around a bit and found this: https://github.com/openshift/jenkins/commit/cd3b05f37ad97afb8463f08ed3c916217c60ef2a which looks relevant, but also confuses me even more.

mrc0mmand commented a year ago

And by looking at the Jenkins pod logs, it might be the root cause:

Migrating slave image configuration to current version tag ...
Using JENKINS_SERVICE_NAME=jenkins
Generating jenkins.model.JenkinsLocationConfiguration.xml using (/var/lib/jenkins/jenkins.model.JenkinsLocationConfiguration.xml.tpl) ...
Jenkins URL set to: https://jenkins-systemd.apps.ocp.cloud.ci.centos.org in file: /var/lib/jenkins/jenkins.model.JenkinsLocationConfiguration.xml
OpenShift OAuth enabled: true

In that case I just wonder - w h y?
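To confirm whether a given rollout actually rewrote the file, the adminAddress value can be read back and compared against the expected one. A small sketch, assuming the file keeps the one-element-per-line layout shown at the top of this issue; `admin_address_of` and `check_admin_address` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: read the admin address back from the location config file.

# Print the <adminAddress> value (prints nothing if the element is missing).
admin_address_of() {
    sed -n 's:.*<adminAddress>\(.*\)</adminAddress>.*:\1:p' "$1"
}

# Succeed (exit status 0) only when the file holds the expected address.
check_admin_address() {
    [ "$(admin_address_of "$1")" = "$2" ]
}

# Example call (inside the Jenkins pod, after a rollout):
# check_admin_address \
#     /var/lib/jenkins/jenkins.model.JenkinsLocationConfiguration.xml \
#     'builder@jenkins-systemd.apps.ocp.cloud.ci.centos.org' \
#     || echo 'admin address was reset'
```

This is a plain text match, not an XML parser, so it would miss a file where the element spans several lines.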

dkirwan commented a year ago

This Jenkins image is managed and updated by the OpenShift team. This does look like a bug, so it might be worth opening a bug report: https://bugzilla.redhat.com/buglist.cgi?component=Jenkins&product=OpenShift%20Container%20Platform

There is a file /var/lib/jenkins/jenkins.model.JenkinsLocationConfiguration.xml.tpl. I tried moving it, renaming it, killing the container to force a restart, and rolling out the latest deploymentconfig. None of that seemed to have any effect, but that was before you removed the configuration as code plugin.

That configuration as code plugin does now seem to be the preferred way to configure Jenkins on OpenShift. It seems to offer a nice mechanism for persisting changes across reboots and container restarts, and you can do it all within the same storage PV, whereas a Groovy init script, for example, needs to be in /usr/share/jenkins/ref/init.groovy.d/, which is outside the persistent storage.

mrc0mmand commented a year ago

Yeah, I guess I'll look into JCasC after all. And let's close this ticket, since the issue is not on CentOS CI's side.

Thanks for the help!

Metadata Update from @mrc0mmand:
- Issue close_status updated to: Fixed with Explanation
- Issue status updated to: Closed (was: Open)

a year ago

Metadata

Assignee: dkirwan
Tags: centos-ci-infra, medium-gain, medium-trouble
Blocking: None
Depending on: None
Priority: Needs Review