#10358 IPA backup fails on ipa03 and ipa02.stg
Opened 2 years ago by abompard. Modified 17 days ago

Going through my admin@fpo email I noticed an error from ipa03's backup script:

Error: Local roles CA do not match globally used roles CA, KRA. A backup done on this host would not be complete enough to restore a fully functional, identical cluster.
The ipa-backup command failed. See /var/log/ipabackup.log for more information

I've traced it to this patch of May 2020, which may have landed recently in an update.

Apparently the backup fails because some roles used globally in the cluster are absent from the local instance. I'm not sure exactly what that means, but when I run:

ldapsearch -b cn=masters,cn=ipa,cn=etc,dc=fedoraproject,dc=org

I see the KRA role for ipa01, ipa02 but not ipa03. Do we want to somehow add this role to ipa03 or should we just not run the backup on ipa03?

(same thing happens in ipa02.stg with the DNS and DNSKeySync roles)
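
To make the mismatch concrete, here is a minimal Python sketch (not the actual ipa-backup check) that takes the DNs returned by the ldapsearch above and reports which hosts lack a role that some other master has. It assumes service entries are named cn=<service>,cn=<host>,cn=masters,...; ROLE_SERVICES, roles_by_host and missing_roles are hypothetical names, and the role list contains only the roles mentioned in this ticket.

```python
from collections import defaultdict

# Assumption: these service names count as "roles". The list comes from the
# roles named in this ticket, not from the ipa-backup source code.
ROLE_SERVICES = {"CA", "KRA", "DNS", "DNSKeySync"}


def roles_by_host(dns):
    """Map each master hostname to the set of role services found under it."""
    roles = defaultdict(set)
    for dn in dns:
        # Naive split: good enough for these DNs, which have no escaped commas.
        parts = [p.strip() for p in dn.split(",")]
        # Assumed layout of a service entry: cn=<service>,cn=<host>,cn=masters,...
        if len(parts) < 3 or parts[2].lower() != "cn=masters":
            continue
        service = parts[0].split("=", 1)[1]
        host = parts[1].split("=", 1)[1]
        if service in ROLE_SERVICES:
            roles[host].add(service)
    return dict(roles)


def missing_roles(roles):
    """Return, per host, the globally used roles that the host lacks."""
    global_roles = set().union(*roles.values())
    return {
        host: global_roles - local
        for host, local in roles.items()
        if global_roles - local
    }
```

Feeding it the dn: values from the search output should list ipa03 as missing KRA if the tree looks the way I described. Hosts that carry none of the listed roles are not reported at all.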

Metadata Update from @mohanboddu:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: medium-gain, medium-trouble, ops

2 years ago

Ah yes, the KRA problem. ;)

We originally had ipa01.phx2.fedoraproject.org and ipa02.phx2.fedoraproject.org, and both were KRA enabled/clustered.

Then we added ipa01.iad2.fedoraproject.org and ipa02.iad2.fedoraproject.org and synced them; all 4 were KRA enabled.

Then we nuked ipa01.phx2 and ipa02.phx2 because we were no longer in that datacenter.

Then we added ipa03.iad2.fedoraproject.org, but I was completely unable to get it to sync KRA. It gave a bunch of weird errors and failed. I read somewhere that this was because the original server that set up that function (ipa01.phx2) was gone, so we can't add any more KRA replicas. Not sure if that's really true, but it sounds somewhat plausible.

We don't use KRA currently, so I went on to other things.

We can:

- Just remove / disable / drop KRA somehow from the existing servers and drive on.
- Try to figure out how to get ipa03.iad and ipa02.stg added to the KRA replication.

KRA might be useful/nice someday, but I don't know.

Here's what happens if you try and enable it:

[root@ipa03 ~][PROD-IAD2]# ipa-kra-install --no-host-dns                                            
Directory Manager password:                                                                         

/usr/lib/python3.6/site-packages/urllib3/connection.py:376: SubjectAltNameWarning: Certificate for ipa03.iad2.fedoraproject.org has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
Lookup failed: Preferred host ipa03.iad2.fedoraproject.org does not provide KRA.                    
Custodia uses 'ipa02.iad2.fedoraproject.org' as master peer.                                        

This program will setup Dogtag KRA for the IPA Server.                                              

Configuring KRA server (pki-tomcatd). Estimated time: 2 minutes                                     
  [1/10]: creating ACIs for admin                                                                   
  [2/10]: creating installation admin user                                                          
  [3/10]: configuring KRA instance                                                                  
Failed to configure KRA instance                                                                    
See the installation logs and the following files/directories for more information:                 
  [error] RuntimeError: KRA configuration failed.                                                   

Your system may be partly configured.                                                               
If you run into issues, you may have to re-install IPA on this server.                              

KRA configuration failed.                                                                           
The ipa-kra-install command failed. See /var/log/ipaserver-kra-install.log for more information   

Looking at the designated log file I see:

Exception: PKI subsystem 'KRA' for instance 'pki-tomcat' already exists!
  File "/usr/lib/python3.6/site-packages/pki/server/pkispawn.py", line 575, in main
  File "/usr/lib/python3.6/site-packages/pki/server/deployment/scriptlets/initialization.py", line 163, in spawn
  File "/usr/lib/python3.6/site-packages/pki/server/deployment/pkihelper.py", line 837, in verify_subsystem_does_not_exist

Is there a way to remove the KRA configuration from Tomcat? Maybe @cheimes knows?

IPA has no uninstaller for KRA. There is no supported way to remove KRA services from a cluster.

[backlog refinement]
We need to fix the KRA problem before the backups can work again. We would appreciate help from the IPA folks on this, so the best way forward is probably to file a ticket on the IPA tracker.
@abompard Do you know where we can do that?

[backlog refinement]
We could try to remove KRA from ipa01, but there isn't any obvious way to do it.
Maybe @abompard could help here.

Note that this now only happens on ipa03. For some reason 02 and 02.stg are replicating KRA fine now.

[backlog refinement]
The situation is still the same. There is a plan to update the IPA machines to RHEL9; maybe this will solve the issue.

I'm going to try to do staging when we are in f38 final freeze (then, if it goes well, we can do prod after the f38 release).

@kevin Did you have time to try this?

Nope, too much unplanned work/other work. :(

It's still definitely on my list tho... hopefully next week?

