I started the CentOS Atomic Host vagrant box I had lying around on my system. It was the 7.1711 version. It starts fine, but after I upgrade, the docker service doesn't come up. I see two issues:
-bash-4.2# systemctl status docker-storage-setup -o cat | tee
● docker-storage-setup.service - Docker Storage Setup
   Loaded: loaded (/usr/lib/systemd/system/docker-storage-setup.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2018-05-16 01:30:26 UTC; 10min ago
  Process: 719 ExecStart=/usr/bin/container-storage-setup (code=exited, status=1/FAILURE)
 Main PID: 719 (code=exited, status=1/FAILURE)
Starting Docker Storage Setup...
ERROR: Storage is already configured with devicemapper driver. Can't configure it with overlay2 driver. To override, remove /etc/sysconfig/docker-storage and retry.
docker-storage-setup.service: main process exited, code=exited, status=1/FAILURE
Failed to start Docker Storage Setup.
Unit docker-storage-setup.service entered failed state.
docker-storage-setup.service failed.
-bash-4.2# systemctl status docker -o cat | tee
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/docker.service.d
           └─flannel.conf
   Active: failed (Result: exit-code) since Wed 2018-05-16 01:30:27 UTC; 9min ago
     Docs: http://docs.docker.com
  Process: 795 ExecStart=/usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --seccomp-profile=/etc/docker/seccomp.json $OPTIONS $DOCKER_STORAGE_OPTIONS $DOCKER_NETWORK_OPTIONS $ADD_REGISTRY $BLOCK_REGISTRY $INSECURE_REGISTRY $REGISTRIES (code=exited, status=1/FAILURE)
 Main PID: 795 (code=exited, status=1/FAILURE)
Starting Docker Application Container Engine...
time="2018-05-16T01:30:26.496375725Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
time="2018-05-16T01:30:26.499243981Z" level=info msg="libcontainerd: new containerd process, pid: 925"
Error starting daemon: error initializing graphdriver: devicemapper: Non existing device atomicos-docker--pool
docker.service: main process exited, code=exited, status=1/FAILURE
Failed to start Docker Application Container Engine.
Unit docker.service entered failed state.
docker.service failed.
We probably need to address these issues.
Here is the status information:
-bash-4.2# rpm-ostree status
State: idle
Deployments:
● centos-atomic-host:centos-atomic-host/7/x86_64/standard
       Version: 7.1803 (2018-04-03 12:35:38)
        Commit: cbb9dbf9c8697e9254f481fff8f399d6808cecbed0fa6cc24e659d2f50e05a3e
  GPGSignature: Valid signature by 64E3E7558572B59A319452AAF17E745691BA8335
  centos-atomic-host:centos-atomic-host/7/x86_64/standard
       Version: 7.1711 (2017-11-28 11:43:40)
        Commit: 86d991cbb122af96a96cf2c55ccf1bb778c2342dd9a444dfed4fe96f70bb0ef9
  GPGSignature: Valid signature by 64E3E7558572B59A319452AAF17E745691BA8335
@jasonbrooks or @miabbott - can you take a look?
Metadata Update from @dustymabe: - Issue assigned to jasonbrooks - Issue tagged with: CentOS7, host
related: https://github.com/projectatomic/container-storage-setup/issues/267
In general, I'm not sure about filing atomic-wg issues that relate to CentOS because we can't generally fix them without going through RHEL, right?
Anyways https://bodhi.fedoraproject.org/updates/FEDORA-2018-03bdc0733a is the Fedora version of the first one. This fix needs to go through the whole downstream thing.
For the pool issue... I think that deserves a bug against docker in RHEL? Though of course that would ideally mean reproducing it against RHELAH first.
That said it's definitely possible with some of these things that it's CentOS-AH specific as some of the metadata gets important.
Metadata Update from @dustymabe: - Issue tagged with: bug
You are right, but I think it's mostly a philosophical question. Filing issues here helps bring awareness to them, I believe. For me, things get lost too easily in bugzilla.
+1, thanks
probably
Right. To me this is as good a place as any to "route" issues and find the most effective way to get them resolved, whether they are specific to Fedora, specific to CentOS, or specific to RHELAH and need their own BZ. Maybe I'm creating too much overhead. Maybe not.
I understand why the docker-storage-setup error message appears and how the upgrade will fix it. What I don't understand is why docker is complaining that "atomicos-docker--pool" does not exist.
If storage is already set up, then this thin pool should have come up automatically after reboot. If it's not there, then we are looking at a different issue.
What's the output of the "lvs", "vgs", "pvs" and "lsblk" commands on the system?
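For anyone else hitting this, the four reports asked for above can be gathered in one go. This is only a rough sketch: `run_if_present` and `collect_storage_report` are made-up helper names, and the LVM tools normally need to be run as root.

```shell
# Run a command if it is installed; otherwise note that it was skipped.
# run_if_present is a hypothetical helper name, not part of any tool.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "### $*"
        "$@"
    else
        echo "### $1: not installed, skipped"
    fi
}

# Collect the LVM and block-device reports requested in the thread.
# A tool that exits non-zero (for example, when not run as root) is noted.
collect_storage_report() {
    for tool in lvs vgs pvs lsblk; do
        run_if_present "$tool" || echo "### $tool exited non-zero"
    done
}
```

Calling `collect_storage_report` as root on the affected host should print each report under a `###` header, ready to paste into the ticket.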
Just tried to reproduce, and this time I only see the first error; the docker service itself seems to come up OK. Maybe a race condition? I can try to play around and see if I can repro.
And just rebooting that same host allows me to reproduce it:
-bash-4.2# reboot
Connection to 192.168.121.193 closed by remote host.
Connection to 192.168.121.193 closed.
$
$ vagrant ssh
Last login: Mon May 21 15:22:08 2018 from 192.168.121.1
[vagrant@vanilla-c7atomic ~]$
[vagrant@vanilla-c7atomic ~]$ sudo su -
Last login: Mon May 21 15:22:10 UTC 2018 on pts/0
-bash-4.2# systemctl status docker -o cat | tee
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/docker.service.d
           └─flannel.conf
   Active: failed (Result: exit-code) since Mon 2018-05-21 15:25:53 UTC; 9min ago
     Docs: http://docs.docker.com
  Process: 806 ExecStart=/usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --seccomp-profile=/etc/docker/seccomp.json $OPTIONS $DOCKER_STORAGE_OPTIONS $DOCKER_NETWORK_OPTIONS $ADD_REGISTRY $BLOCK_REGISTRY $INSECURE_REGISTRY $REGISTRIES (code=exited, status=1/FAILURE)
 Main PID: 806 (code=exited, status=1/FAILURE)
Starting Docker Application Container Engine...
time="2018-05-21T15:25:52.960602241Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
time="2018-05-21T15:25:52.965747392Z" level=info msg="libcontainerd: new containerd process, pid: 834"
Error starting daemon: error initializing graphdriver: devicemapper: Non existing device atomicos-docker--pool
docker.service: main process exited, code=exited, status=1/FAILURE
Failed to start Docker Application Container Engine.
Unit docker.service entered failed state.
docker.service failed.
-bash-4.2#
-bash-4.2# lvs
  LV          VG       Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool atomicos twi-a-t---  2.68g             0.71   0.33
  root        atomicos -wi-ao---- <2.93g
-bash-4.2# vgs
  VG       #PV #LV #SN Attr   VSize VFree
  atomicos   1   2   0 wz--n- 9.70g <4.07g
-bash-4.2# pvs
  PV         VG       Fmt  Attr PSize PFree
  /dev/vda2  atomicos lvm2 a--  9.70g <4.07g
-bash-4.2# lsblk
NAME                            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda                             252:0    0   11G  0 disk
├─vda1                          252:1    0  300M  0 part /boot
└─vda2                          252:2    0  9.7G  0 part
  ├─atomicos-root               253:0    0    3G  0 lvm  /sysroot
  ├─atomicos-docker--pool_tmeta 253:1    0   12M  0 lvm
  │ └─atomicos-docker--pool     253:3    0  2.7G  0 lvm
  └─atomicos-docker--pool_tdata 253:2    0  2.7G  0 lvm
    └─atomicos-docker--pool     253:3    0  2.7G  0 lvm
vdb                             252:16   0   20G  0 disk
-bash-4.2#
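A note on the names, since lvs lists the pool as "docker-pool" in VG "atomicos" while docker complains about "atomicos-docker--pool": device-mapper names for LVM volumes double any hyphen inside the VG and LV names and then join the two with a single hyphen. A small sketch of that mapping, with `dm_name` and `pool_present` as made-up helper names rather than LVM commands:

```shell
# Build the device-mapper name LVM uses for a logical volume:
# hyphens inside the VG and LV names are doubled, then the two
# parts are joined with a single hyphen.
# dm_name is a hypothetical helper, not an LVM command.
dm_name() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '%s-%s\n' "$vg" "$lv"
}

# Report (via exit status) whether the mapped device node exists right now.
pool_present() {
    [ -e "/dev/mapper/$(dm_name "$1" "$2")" ]
}
```

So `dm_name atomicos docker-pool` gives the name from the docker error, and `pool_present atomicos docker-pool` checks whether that device node exists at the moment it is run.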
upstream discussion about this race condition: https://github.com/projectatomic/container-storage-setup/issues/267#issuecomment-390697519
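If this really is a race, one way to check on an affected host is to wait for the pool device to show up and only then restart docker. The sketch below assumes the pool appears shortly after boot, which matches the lvs output above but is not confirmed as the root cause; `wait_for_path` is a made-up helper and the 30 second default is arbitrary.

```shell
# Wait until a path exists, polling once per second.
# wait_for_path is a hypothetical helper; the timeout is in seconds
# and defaults to 30 when not given.
wait_for_path() {
    path=$1
    remaining=${2:-30}
    while [ ! -e "$path" ]; do
        # Give up once the timeout has been used up.
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Example use on the affected host (as root), not executed here:
#   wait_for_path /dev/mapper/atomicos-docker--pool 30 && systemctl restart docker
```

If docker starts cleanly after the wait, that would point toward an ordering problem at boot rather than a missing pool. This is a diagnostic aid, not a fix.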