#4709 ipa-backup does not wait for 389-ds-base to stop, and crashes
Closed: Fixed None Opened 9 years ago by ksiddiqu.

ipa-backup fails with the latest IPA build (ipa-server-4.1.0-4.el7.x86_64).

[root@dhcp207-124 ~]# rpm -q ipa-server 389-ds-base 
ipa-server-4.1.0-4.el7.x86_64
389-ds-base-1.3.3.1-6.el7.x86_64
[root@dhcp207-124 ~]#

[root@dhcp207-124 ~]# ipa-backup --logs
Preparing backup on dhcp207-124.testrelm.test
Stopping IPA services
Backing up ipaca in TESTRELM-TEST to LDIF
db2ldif failed: [10/Nov/2014:16:41:10 +051800] - /etc/dirsrv/slapd-TESTRELM-TEST/dse.ldif: nsslapd-maxdescriptors: nsslapd-maxdescriptors: invalid value "8192", maximum file descriptors must range from 1 to 4096 (the current process limit).  Server will use a setting of 4096.
[10/Nov/2014:16:41:10 +051800] - Config Warning: - nsslapd-maxdescriptors: invalid value "8192", maximum file descriptors must range from 1 to 4096 (the current process limit).  Server will use a setting of 4096.
[10/Nov/2014:16:41:11 +051800] - Unable to run db2ldif with the -r flag because the database is being used by another slapd process.
[10/Nov/2014:16:41:11 +051800] - Shutting down due to possible conflicts with other slapd processes

[Errno 2] No such file or directory: u'/var/lib/dirsrv/slapd-TESTRELM-TEST/ldif/TESTRELM-TEST-ipaca.ldif'
[root@dhcp207-124 ~]#

Snippet from /var/log/dirsrv/slapd-TESTRELM-TEST/errors:

=====================================================

[10/Nov/2014:16:41:40 +051800] SSL Initialization - SSL version range: min: TLS1.0, max: TLS1.2
[10/Nov/2014:16:41:40 +051800] - 389-Directory/1.3.3.1 B2014.284.06 starting up
[10/Nov/2014:16:41:40 +051800] - WARNING: changelog: entry cache size 1073740B is less than db size 1761280B; We recommend to increase the entry cache size nsslapd-cachememsize.
[10/Nov/2014:16:41:40 +051800] - I'm resizing my cache now...cache was 1677721 and is now 1342176
[10/Nov/2014:16:41:41 +051800] schema-compat-plugin - warning: no entries set up under cn=computers, cn=compat,dc=testrelm,dc=test
[10/Nov/2014:16:41:41 +051800] schema-compat-plugin - warning: no entries set up under cn=ng, cn=compat,dc=testrelm,dc=test
[10/Nov/2014:16:41:41 +051800] schema-compat-plugin - warning: no entries set up under ou=sudoers,dc=testrelm,dc=test
[10/Nov/2014:16:41:41 +051800] NSACLPlugin - The ACL target cn=keys,cn=sec,cn=dns,dc=testrelm,dc=test does not exist
[10/Nov/2014:16:41:41 +051800] NSACLPlugin - The ACL target cn=groups,cn=compat,dc=testrelm,dc=test does not exist
[10/Nov/2014:16:41:41 +051800] NSACLPlugin - The ACL target cn=computers,cn=compat,dc=testrelm,dc=test does not exist
[10/Nov/2014:16:41:41 +051800] NSACLPlugin - The ACL target cn=ng,cn=compat,dc=testrelm,dc=test does not exist
[10/Nov/2014:16:41:41 +051800] NSACLPlugin - The ACL target ou=sudoers,dc=testrelm,dc=test does not exist
[10/Nov/2014:16:41:41 +051800] NSACLPlugin - The ACL target cn=users,cn=compat,dc=testrelm,dc=test does not exist
[10/Nov/2014:16:41:41 +051800] NSACLPlugin - The ACL target cn=ad,cn=etc,dc=testrelm,dc=test does not exist
[10/Nov/2014:16:41:41 +051800] NSACLPlugin - The ACL target cn=casigningcert cert-pki-ca,cn=ca_renewal,cn=ipa,cn=etc,dc=testrelm,dc=test does not exist
[10/Nov/2014:16:41:41 +051800] NSACLPlugin - The ACL target cn=casigningcert cert-pki-ca,cn=ca_renewal,cn=ipa,cn=etc,dc=testrelm,dc=test does not exist
[10/Nov/2014:16:41:41 +051800] NSACLPlugin - The ACL target cn=automember rebuild membership,cn=tasks,cn=config does not exist
[10/Nov/2014:16:41:41 +051800] - Skipping CoS Definition cn=Password Policy,cn=accounts,dc=testrelm,dc=test--no CoS Templates found, which should be added before the CoS Definition.
[10/Nov/2014:16:41:41 +051800] - Skipping CoS Definition cn=Password Policy,cn=accounts,dc=testrelm,dc=test--no CoS Templates found, which should be added before the CoS Definition.
[10/Nov/2014:16:41:41 +051800] - slapd started.  Listening on All Interfaces port 389 for LDAP requests
[10/Nov/2014:16:41:41 +051800] - Listening on All Interfaces port 636 for LDAPS requests
[10/Nov/2014:16:41:41 +051800] - Listening on /var/run/slapd-TESTRELM-TEST.socket for LDAPI requests
[10/Nov/2014:16:41:43 +051800] - slapd shutting down - signaling operation threads - op stack size 1 max work q size 2 max work q stack size 2
[10/Nov/2014:16:41:43 +051800] - slapd shutting down - closing down internal subsystems and plugins
[10/Nov/2014:16:41:43 +051800] - Waiting for 4 database threads to stop
[10/Nov/2014:16:41:44 +051800] - Config Warning: - nsslapd-maxdescriptors: invalid value "8192", maximum file descriptors must range from 1 to 4096 (the current process limit).  Server will use a setting of 4096.
[10/Nov/2014:16:41:44 +051800] - Unable to run db2ldif with the -r flag because the database is being used by another slapd process.
[10/Nov/2014:16:41:44 +051800] - Shutting down due to possible conflicts with other slapd processes
[10/Nov/2014:16:41:44 +051800] - All database threads now stopped
[10/Nov/2014:16:41:44 +051800] - slapd shutting down - freed 2 work q stack objects - freed 2 op stack objects
[10/Nov/2014:16:41:44 +051800] - slapd stopped.

Comments from our devel discussion:

  • not sure if it is possible in systemd
  • As a workaround, IPA could just check whether the process is still running (signal number 0, e.g. kill -0 $pid)
    • use waitpid(2) to wait for a signal of ns-slapd dying before we ask the service to shut down
      • need to do some race checking though
      • be sure to handle EINTR
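
The signal-0 check suggested above could look roughly like the following. This is only a sketch, not what IPA does: the function names, the timeout and the poll interval are made up for illustration, and it assumes Python 3. Note that a PID can be reused after the process exits, which is part of the race checking mentioned above.

```python
import errno
import os
import time


def pid_is_running(pid):
    """Return True if a process with this PID exists (the `kill -0 $pid` check)."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks existence
    except OSError as e:
        if e.errno == errno.ESRCH:
            return False  # no such process
        if e.errno == errno.EPERM:
            return True  # process exists but belongs to another user
        raise
    return True


def wait_for_exit(pid, timeout=60.0, interval=0.5):
    """Poll until the process is gone; return False if it outlives the timeout.

    The timeout and interval defaults are arbitrary illustrative values.
    """
    deadline = time.monotonic() + timeout
    while pid_is_running(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

On Python 3.5 and later, interrupted system calls are retried automatically (PEP 475), so the sleep above needs no explicit EINTR handling; code calling waitpid(2) directly from C would still have to deal with it.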

I looked into it a bit more, and found out that the status of the systemd service is 'deactivating' from the systemctl stop call to the point where the process exits.

I would hope systemd's cgroups magic is more robust than the waitpid/EINTR solution. So IMO, in IPA we should do the same thing that we do when starting a service: wait for the status.
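
A rough sketch of waiting for the status, for illustration only. It polls `systemctl is-active`, which prints the unit's state (for example active, deactivating, inactive or failed). The helper names, the timeout and the choice to treat both inactive and failed as "stopped" are assumptions of this sketch, not existing IPA code.

```python
import subprocess
import time


def systemctl_state(unit):
    """Return the state printed by `systemctl is-active UNIT`."""
    proc = subprocess.run(
        ["systemctl", "is-active", unit],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    # is-active exits non-zero for non-active units, so only stdout is used.
    return proc.stdout.strip()


def wait_until_stopped(unit, get_state=systemctl_state, timeout=60.0, interval=0.5):
    """Poll the unit state until it is no longer running or stopping.

    Assumption: 'inactive' and 'failed' both mean the process is gone.
    `get_state` is injectable so the loop can be exercised without systemd.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_state(unit)
        if state in ("inactive", "failed"):
            return state
        if time.monotonic() >= deadline:
            raise RuntimeError("%s still in state %r after %ss" % (unit, state, timeout))
        time.sleep(interval)
```

For example, wait_until_stopped("dirsrv@TESTRELM-TEST.service") would block while the unit reports 'deactivating' and return once it reports 'inactive'.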

Currently systemd guesses the PID to wait on. To make it explicit, there should be a PIDFile=/var/run/dirsrv/slapd-%i.pid line in the /usr/lib/systemd/system/dirsrv@.service file.

+1, I would also prefer this approach; it is more consistent with what we already do. Please file a bug for PKI to update their systemd service file.

Hm, it turns out systemctl start dirsrv@$INST.service waits until the service is started, systemctl stop dirsrv@$INST.service waits until it's down (i.e. the process exits), systemctl start dirsrv.target waits until all the "wanted" services are started, but systemctl stop dirsrv.target returns immediately.

This confuses me; I'll ask someone more knowledgeable about systemd why this happens.

At this point the workaround seems to be to call systemctl stop on the service, not the target. (Or in addition to the target.)
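
A minimal sketch of that workaround, assuming the instance names are already known to the caller. The function names are hypothetical and this is not the code from the commits listed below; it only illustrates stopping each dirsrv@ instance service (which blocks until the process exits) before stopping the target.

```python
import subprocess


def stop_commands(instances):
    """Build the systemctl calls: each instance service first, then the target."""
    cmds = [["systemctl", "stop", "dirsrv@%s.service" % inst] for inst in instances]
    cmds.append(["systemctl", "stop", "dirsrv.target"])
    return cmds


def stop_dirsrv(instances, run=subprocess.check_call):
    """Run the stop commands in order; `run` is injectable for testing."""
    for cmd in stop_commands(instances):
        run(cmd)
```

With the instance from this report, stop_dirsrv(["TESTRELM-TEST"]) would run systemctl stop dirsrv@TESTRELM-TEST.service and then systemctl stop dirsrv.target.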

master:

  • e60ef1f ipaplatform: Use the dirsrv service, not target

ipa-4-1:

  • 082485c ipaplatform: Use the dirsrv service, not target

Metadata Update from @ksiddiqu:
- Issue assigned to pviktori
- Issue set to the milestone: FreeIPA 4.1.2

7 years ago
