#47429 dirsrv fails to start due to incorrect /var/run/lock lines in tmpfiles.d
Closed: wontfix. Opened 9 years ago by rmeggins.

Ticket was cloned from Red Hat Bugzilla (product Fedora): Bug 983073

Description of problem:
The lock directory is now symlinked into /var/run, which is on tmpfs. The
tmpfiles.d configuration that dirsrv puts in place specifies the old locations
/var/lock/dirsrv and /var/lock/dirsrv/slapd-EXAMPLE-COM (see the similar bug
#857939, which was about /var/run/dirsrv rather than /var/lock being moved to
/var/run/lock). systemd-tmpfiles does not follow the symlink, so the
directories never get created and have to be created by hand.
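To see how the lock path actually resolves on a given system, a small check like the following may help. It is only a sketch: `resolve_report` is a hypothetical helper name, and it assumes a `readlink` that supports `-f` (as GNU coreutils does).

```shell
#!/bin/sh
# resolve_report is a hypothetical helper, not part of 389-ds or systemd.
# For each argument it prints the fully resolved path and whether the
# argument itself is a symlink.
resolve_report() {
    for path in "$@"; do
        # readlink -f follows every symlink in the path (assumes GNU-style -f).
        target=$(readlink -f "$path") || target='(unresolvable)'
        if [ -L "$path" ]; then
            kind='symlink'
        else
            kind='not a symlink'
        fi
        printf '%s -> %s (%s)\n' "$path" "$target" "$kind"
    done
}
```

Calling `resolve_report /var/lock /var/run` on the affected machine shows where each legacy path ends up.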

Version-Release number of selected component (if applicable):
389-ds-base-1.3.1.3-1.fc19.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Clean install of F19
2. Install 389-ds (I used freeIPA to speed up the process and developing
against freeIPA is where I bumped into this).
3. Observe dirsrv starts fine.
4. Reboot system
5. Observe dirsrv fails to start
6. Add lines to /etc/tmpfiles.d/dirsrv-EXAMPLE-COM.conf to create
/var/run/lock/* instead of /var/lock/*
7. Reboot system
8. Observe right directories are created and dirsrv instance runs

Actual results:
dirsrv fails to start after a reboot

Expected results:
dirsrv starts with no manual intervention

Cannot reproduce - steps
1) F19 x86_64 system - installed 389-ds-base-1.3.1.3
2) ran setup-ds.pl to create an instance
3) checked contents of /etc/tmpfiles.d/dirsrv-inst.conf
/var/run/dirsrv
/var/lock/dirsrv
/var/lock/dirsrv/slapd-inst
4) configured instance to start at boot time
http://port389.org/wiki/Howto:systemd#How_do_I_make_it_start_at_boot_time.3F
systemctl enable dirsrv.target # start all instances at boot time
5) reboot

upon reboot, the directory server process is running - /var/run/dirsrv and /var/lock/dirsrv are correctly populated - systemctl status dirsrv@inst.service reports running

I did not use IPA - is this perhaps an IPA specific problem? Can you try with plain 389-ds-base?

I just created a KVM VM - 389dstest.example.com ...

Steps I took:
1) Install F19 (minimal with guest extensions) and after the initial install did yum install 389-ds-base.
2) setup-ds.pl choosing option 1 (express) when prompted
3) verified the contents of /etc/tmpfiles.d/dirsrv-389dstest.conf

  d /var/run/dirsrv 0770 nobody nobody
  d /var/lock/dirsrv 0770 nobody nobody
  d /var/lock/dirsrv/slapd-389dstest 0770 nobody nobody

4) tried the various matrix of service states for manual start or start at boot
5) reboot

observe /var/lock/ does not have a dirsrv or dirsrv/inst directory and 389ds instance cannot be started...

I just tried something else though - manually ran systemd-tmpfiles --create after boot was completed and the /var/lock/* directories are indeed created...

I'm beginning to wonder if this is some strange race condition on boot for systemd-tmpfiles when traversing the symlinks (perhaps the double symlink for /var/lock as opposed to the single for /var/run) to the final destination to create the directories.
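A quick way to confirm after a reboot which declared directories are missing is to compare the tmpfiles.d file against the filesystem. This is a minimal sketch; `missing_tmpfiles_dirs` is a hypothetical helper name, and it only looks at `d` lines.

```shell
#!/bin/sh
# missing_tmpfiles_dirs is a hypothetical helper: for every "d" line in the
# given tmpfiles.d file, print the path if that directory does not exist.
missing_tmpfiles_dirs() {
    conf=$1
    # Fields are: type, path, mode, user, group. Only the first two matter here.
    # The "|| [ -n ... ]" part handles a last line without a trailing newline.
    while read -r type path _rest || [ -n "$type" ]; do
        case $type in
            d*) [ -d "$path" ] || printf '%s\n' "$path" ;;
        esac
    done < "$conf"
}
```

For the instance above this would be run as `missing_tmpfiles_dirs /etc/tmpfiles.d/dirsrv-389dstest.conf`; empty output means every declared directory exists.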

Hmm rmeggins can you please close as NOTABUG?

After more testing and simplifying of parts I'm sure this is a systemd-tmpfiles issue of some nature ... and I can only intermittently reproduce this with a minimal F19 system and manually putting stuff in /etc/tmpfiles.d without installing anything ... and only on btrfs - ext4 seems fine.

I'll look to see if I can reliably get a set of steps in place and file a bug with systemd if I can.

Replying to [comment:2 jhogarth]:

4) tried the various matrix of service states for manual start or start at boot

I'm not sure what this means - did you do systemctl enable dirsrv.target?

jhogarth: did you ever file this? because I seem to be running into exactly this on my newly installed F19 freeipa server, which is a VM. Every time I boot there is no /var/lock/dirsrv, I have these errors in journalctl:

Sep 26 20:31:26 id.happyassassin.net systemd-tmpfiles[212]: Failed to create directory /var/lock/dirsrv: No such file or directory
Sep 26 20:31:26 id.happyassassin.net systemd-tmpfiles[212]: Failed to create directory /var/lock/dirsrv/slapd-HAPPYASSASSIN-NET: No such file or directory

and dirsrv fails to start. If I run 'systemd-tmpfiles --create' manually and do 'systemctl restart ipa.service', it starts up successfully.
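For anyone checking their own journal, the failing paths can be pulled out of messages like the ones above with a small filter. This is a sketch under assumptions: `failed_tmpfiles_paths` is a hypothetical name, the pattern is based only on the message wording seen in this ticket (including the quoted "directory or subvolume" form that appears further down), and paths containing a colon or a double quote are not handled.

```shell
#!/bin/sh
# failed_tmpfiles_paths is a hypothetical helper: it reads journal text on
# stdin and prints each path systemd-tmpfiles said it could not create.
# The pattern matches only the two message formats quoted in this ticket.
failed_tmpfiles_paths() {
    sed -nE 's/^.*systemd-tmpfiles\[[0-9]+\]: Failed to create directory (or subvolume )?"?([^":]*)"?: No such file or directory$/\2/p'
}
```

Usage would be something like `journalctl -b | failed_tmpfiles_paths`.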

ab from #freeipa IRC mentioned he also sees this issue sometimes. It's definitely a real problem somewhere. I have disabled ipa.service and done this icky hack to deal with it on my deployment for now:

[adamw@id ~]$ cat /etc/rc.d/rc.local

#!/bin/sh

systemd-tmpfiles --create /etc/tmpfiles.d/dirsrv-HAPPYASSASSIN-NET.conf
systemctl start ipa.service

It looks like the point where rc.local runs is late enough that this works pretty reliably. At least, it worked twice in a row for me. I call that good. :P

To aid anyone else who falls into the same bear trap, there's a problem with simply calling 'systemd-tmpfiles --create': it will create a file /var/run/nologin which will prevent any users other than root from logging into the system (locally or via ssh) with a rather cryptic message:

"System is booting up."

Any time you see that error, the reason is that /var/run/nologin exists. For this particular scenario, calling "systemd-tmpfiles --create /etc/tmpfiles.d/dirsrv-HAPPYASSASSIN-NET.conf" as listed above will create the dir we want without hitting the directive that creates /var/run/nologin. If anyone winds up here by Googling "System is booting up.": you want to get rid of /var/run/nologin. You're welcome. ;)
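A quick check for the flag file can be scripted. This is only a sketch: `nologin_status` is a hypothetical helper name, and the default path is simply the one named above.

```shell
#!/bin/sh
# nologin_status is a hypothetical helper. It reports whether a nologin flag
# file exists and returns non-zero when it does. The default path is the one
# mentioned in this comment; pass another path as the first argument if needed.
nologin_status() {
    flag=${1:-/var/run/nologin}
    if [ -e "$flag" ]; then
        printf 'present: %s (non-root logins are refused)\n' "$flag"
        return 1
    fi
    printf 'absent: %s\n' "$flag"
}
```

Running `nologin_status` right after a manual `systemd-tmpfiles --create` shows whether the file was left behind.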

This might be related to a recently fixed issue related to the /var/run symlinks not being created before 389 DS tries to access them on boot. See ticket 47513 for details.

...which came from https://bugzilla.redhat.com/show_bug.cgi?id=996716 , which I was just pointed at in IRC. This is to ensure anyone showing up to the party late has all the necessary links!

This happens to me on two separate F19 FreeIPA installations (one set up to be a replica of the other); I'm currently editing /etc/tmpfiles.d/dirsrv-*.conf and replacing /var/lock/* with /var/run/*

Correction: I had to add this line:

{{{
d /var/run/dirsrv 0770 dirsrv dirsrv
}}}

to the configuration file; the two /var/lock entries work fine as they are. Is this the same issue or a different one then? Apologies if I'm mixing things up

Replying to [comment:11 salimma]:

Have you tried the latest Fedora 19 389-ds-base-1.3.1.11 package? It should have fixed this issue. It won't correct existing tmpfiles.d entries but it should create correct tmpfiles.d entries for new instances.

Metadata Update from @adamwill:
- Issue set to the milestone: N/A

5 years ago

Hello, I'm getting the same issue. This is on CentOS 8 but I suppose it should all be the same. Hoping the following can shed some light, but not really getting any further than this:

# rpm -qa | grep 389-ds-base
389-ds-base-1.4.2.4-8.module_el8.2.0+366+71e3276f.x86_64
389-ds-base-libs-1.4.2.4-8.module_el8.2.0+366+71e3276f.x86_64

# cat /etc/tmpfiles.d/dirsrv-EXAMPLE-COM.conf 
d /var/run/dirsrv 0770 dirsrv dirsrv
d /var/lock/dirsrv/ 0770 dirsrv dirsrv
d /var/lock/dirsrv/slapd-EXAMPLE-COM 0770 dirsrv dirsrv

# ls -la /var/lock/ | grep dirsrv

# journalctl -fu systemd-tmpfiles-setup.service
#...
Jun 23 09:52:51 ipa1 systemd-tmpfiles[56]: Failed to create directory or subvolume "/var/lock/dirsrv": No such file or directory
Jun 23 09:52:51 ipa1 systemd-tmpfiles[56]: Failed to create directory or subvolume "/var/lock/dirsrv/slapd-EXAMPLE-COM": No such file or directory
#...

Oddly, as @adamwill mentioned, running systemd-tmpfiles --create manually works fine:

# systemd-tmpfiles --create
[/etc/tmpfiles.d/dirsrv-EXAMPLE-COM.conf:1] Line references path below legacy directory /var/run/, updating /var/run/dirsrv → /run/dirsrv; please update the tmpfiles.d/ drop-in file accordingly.

# echo $?
0

# ls -la /var/lock/ | grep dirsrv
drwxrwx---  3 dirsrv dirsrv  60 Jun 23 10:19 dirsrv

@renmare, when does this happen? On reboot or at some other time? Is it consistently reproducible?
It might be a race condition at startup when systemd mounts /run, which would also be why it complains about legacy directories.

Metadata Update from @vashirov:
- Custom field reviewstatus adjusted to None (was: no review needed)
- Issue close_status updated to: None (was: Invalid)
- Issue set to the milestone: None (was: N/A)

2 years ago

Could certainly be. It happens on reboot, so it fits, though other items being mounted work fine:

# ls -la /var/lock/
total 0
drwxr-xr-x  5 root   root   100 Jun 23 20:18 .
drwxr-xr-x 26 root   root   780 Jun 23 20:18 ..
drwxrwx---  5 root   pkcs11 100 Jun 23 20:18 opencryptoki
drwxr-xr-x  2 root   root    40 Jun 23 20:18 subsys

# ls -la /var/run
lrwxrwxrwx 1 root root 6 Oct 16  2019 /var/run -> ../run

# ls -la /run
total 32
drwxr-xr-x 26 root     root      780 Jun 23 20:18 .
drwxr-xr-x 19 root     root     4096 Jun 23 20:18 ..
drwxr-xr-x  3 root     root      100 Jun 23 20:18 NetworkManager
-rw-------  1 root     root        0 Jun 23 20:18 agetty.reload
drwxr-xr-x  4 root     root       80 Jun 23 20:18 certmonger
-rw-------  1 root     root        3 Jun 23 20:18 certmonger.pid
drwxr-xr-x  2 root     root       40 Jun 23 20:18 console
----------  1 root     root        0 Jun 23 20:18 cron.reboot
drwx------  2 root     root       40 Jun 23 20:18 cryptsetup
drwxr-xr-x  2 custodia custodia   40 Jun 23 20:18 custodia
drwxr-xr-x  2 root     root       60 Jun 23 20:18 dbus
drwxrwx---  2 dirsrv   dirsrv     40 Jun 23 20:18 dirsrv
drwxr-xr-x  2 root     root       40 Jun 23 20:18 faillock
#.. loads of other items

# cat /usr/lib/tmpfiles.d/certmonger.conf
# certmonger uses libraries which may want to put temporary files in $TMPDIR,
# but SELinux policy won't let anything running as certmonger_t do that
d /var/run/certmonger 0755 root root

I've used @adamwill's workaround and it works fine, so again, it fits the race condition theory. How can I fix the issue though?

So I've just figured out a better solution that I wouldn't consider hacky. Adding the below line to /etc/tmpfiles.d/dirsrv-EXAMPLE-COM.conf fixes the issue:

d /run/lock/dirsrv/ 0770 dirsrv dirsrv -

One liner:

sed -i '2s#^#d /run/lock/dirsrv/ 0770 dirsrv dirsrv -\n#' /etc/tmpfiles.d/dirsrv-EXAMPLE-COM.conf
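The one-liner adds the line again every time it is run. A repeat-safe variant might look like the sketch below. `ensure_run_lock_line` is a hypothetical helper name, and it uses awk instead of `\n` in the sed replacement so it does not depend on GNU sed.

```shell
#!/bin/sh
# ensure_run_lock_line is a hypothetical helper: it adds the /run/lock/dirsrv/
# line to the given tmpfiles.d file only if that exact line is not there yet.
ensure_run_lock_line() {
    conf=$1
    line='d /run/lock/dirsrv/ 0770 dirsrv dirsrv -'
    # Nothing to do if the exact line is already present.
    if grep -qxF "$line" "$conf"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Insert before line 2, like the sed one-liner; append if the file is shorter.
    if awk -v new="$line" '
            NR == 2 { print new; added = 1 }
            { print }
            END { if (!added) print new }
        ' "$conf" > "$tmp"; then
        # Writing through cat keeps the original file's owner and mode.
        cat "$tmp" > "$conf"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

It would be called as `ensure_run_lock_line /etc/tmpfiles.d/dirsrv-EXAMPLE-COM.conf`, and running it a second time leaves the file unchanged.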

Thank you for testing this and confirming!

These conf files are generated during instance creation, so a proper fix would be to change our installer code to use /run/ instead of /var/run. There is an issue open to update these files: https://pagure.io/389-ds-base/issue/51039, we'll fix it in there.
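Until the installer is changed, existing files could be converted by hand. The filter below is a rough sketch, not the installer fix itself: `modernize_tmpfiles_paths` is a hypothetical name, and the /var/lock to /run/lock mapping is an assumption based on the /run/lock/dirsrv/ line that worked above.

```shell
#!/bin/sh
# modernize_tmpfiles_paths is a hypothetical filter: it reads tmpfiles.d lines
# on stdin and rewrites only the path field (second column).
#   /var/run/...  becomes /run/...
#   /var/lock/... becomes /run/lock/...  (assumed mapping, see lead-in)
# Comment lines are left alone because the type field must start with a letter.
modernize_tmpfiles_paths() {
    sed -E \
        -e 's#^([[:alpha:]][^[:space:]]*[[:space:]]+)/var/run(/|[[:space:]]|$)#\1/run\2#' \
        -e 's#^([[:alpha:]][^[:space:]]*[[:space:]]+)/var/lock(/|[[:space:]]|$)#\1/run/lock\2#'
}
```

For example, `modernize_tmpfiles_paths < /etc/tmpfiles.d/dirsrv-EXAMPLE-COM.conf` prints the converted lines for review without touching the file.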

389-ds-base is moving from Pagure to GitHub. This means that new issues and pull requests
will be accepted only in 389-ds-base's GitHub repository.

This issue has been cloned to GitHub and is available here:
- https://github.com/389ds/389-ds-base/issues/766

If you want to receive further updates on the issue, please navigate to the GitHub issue
and click the subscribe button.

Thank you for understanding. We apologize for any inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix

2 years ago
