#4808 EC2 image creation process
Closed: Fixed None Opened 10 years ago by jforbes.

Kickstarts are hosted in the cloud-kickstarts repository on fedorahosted. We need to finish the process of making those images work with EC2. Specifically, those images need:
Post image modification for EC2
Bundle/upload/publish for EC2 S3 use
Modification of S3 image and snapshot from inside EC2 for EBS use
publish for EBS

Anything I can do to help get this tested?

Can you folks please give me a status update on the EC2 image creation process? I'm hearing through the grapevine that the image creation process is now working, but haven't seen an update to the ticket. Is the process working? Are images being published? Are things to the point mere mortals like myself can start testing?

Curious how this is coming along so I can relay info to folks at the cloud meeting today...

It needs testing. I have some fires I'm fighting, but will try and get it done today.

Has there been more progress on the AMIs? QA would like to have the F15 EC2 test day done before the F16 test days start picking up.

We have one outstanding issue right now: the disks are only being created at ~1G when we need 10G.

When would QA like to have the images by?

Replying to [comment:7 ausil]:

When would QA like to have the images by?

We would like to get the images by 8/1 if possible.

I spoke too soon. Since 8/1 is closer than I realized, the cloud-SIG decided to just go with the unofficial BG generated AMIs for the F15 EC2 test day and focus on getting koji AMIs for F16 alpha.

Dennis, can I get a status update here? Were you able to make any progress on the disk size issue?

FYI - per the cloud meeting last week, as Tim mentioned above, we're going to use BG-generated AMIs as the official images for F15. We hope to use koji generated images for F16 alpha and beyond. We have a few weeks left there, so any estimates on time can help us determine if that's going to be viable or not.

I'm still looking for a status update here -- I haven't seen a status update in a week. Were you able to get past the disk size issue? If so, can you tell me details about where the process is stuck?

Spent the week working on F-16 alpha stuff. Nobody has yet looked at why the disk is only 1GB and not 10GB.

I was able to get past this. The issue is related to https://bugzilla.redhat.com/show_bug.cgi?id=729340.

As part of the upload process, an instance is started and an EBS volume is attached to it. The image is then dd'd over ssh directly to the block device exposed in the instance. Because of the bug, the device file was not named what the script expected, so the script was in effect copying the image into a regular file under /dev and running out of space.
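One way to guard against this failure mode is to confirm that the target path is a real block device before writing to it. The following is a minimal sketch only, not the actual upload script; the function names and the candidate device names in the usage note are hypothetical.

```python
import os
import stat


def is_block_device(path):
    """Return True only if path exists and is a block device node."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path: writing here would create a regular file instead.
        return False
    return stat.S_ISBLK(mode)


def first_block_device(candidates):
    """Return the first candidate that is a real block device, else None."""
    for path in candidates:
        if is_block_device(path):
            return path
    return None
```

A script could call first_block_device(["/dev/sdf", "/dev/xvdf"]) on the instance (both names are assumptions for illustration) and refuse to run dd when the result is None, rather than silently filling /dev.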

At this point I believe I have a working upload script. The sample image I used did not appear to boot due to a systemd issue though:

[17516826.508852] systemd-getty-generator[223]: Failed to create symlink from /lib/systemd/system/serial-getty@.service to /run/systemd/generator/getty.target.wants/serial-getty@hvc0.service: File exists

so I think I may be using a bad image. If there is a known-working image in Koji I can certainly try with that.

What I'm seeing with the latest F16-alpha image in Koji:
[4206579.167775] EXT4-fs (xvde1): mounted filesystem with ordered data mode. Opts: (null)
[4206579.199180] dracut: Mounted root filesystem /dev/xvde1
[4206579.745896] dracut: Switching root
[4206583.157101] type=1403 audit(1313686653.791:2): policy loaded auid=4294967295 ses=4294967295
[4206583.183965] systemd[1]: Successfully loaded SELinux policy in 1s 730ms 481us.
[4206583.608260] systemd[1]: Successfully loaded SELinux database in 423ms 583us, size on heap is 462K.
[4206585.201490] systemd[1]: Relabelled /dev and /run in 1s 563ms 88us.
[4206585.729224] systemd[1]: systemd 33 running in system mode. (+PAM +LIBWRAP +AUDIT +SELINUX +SYSVINIT +LIBCRYPTSETUP; fedora)
[4206585.736281] systemd[1]: Set hostname to <localhost.localdomain>.
[4206585.829774] systemd-cryptse used greatest stack depth: 3856 bytes left
[4206585.839216] systemd-getty-generator[223]: Failed to create symlink from /lib/systemd/system/serial-getty@.service to /run/systemd/generator/getty.target.wants/serial-getty@hvc0.service: File exists
[4206585.840518] systemd[1]: /lib/systemd/system-generators/systemd-getty-generator exited with exit status 1.
[4206586.318281] systemd[1]: Path /sys/kernel/security is already a mount point, refusing start for sys-kernel-security.automount
[4206586.660457] udevd[228]: starting version 173
[4206587.447430] EXT4-fs (xvde1): re-mounted. Opts: (null)
[4206588.874009] Initialising Xen virtual ethernet driver.
[4206589.854839] systemd-random- used greatest stack depth: 3824 bytes left
[4206589.963688] systemd-tmpfiles[338]: Successfully loaded SELinux database in 19ms 714us, size on heap is 464K.
[4206590.091081] type=1400 audit(1313686660.725:3): avc: denied { write } for pid=338 comm="systemd-tmpfile" name="cache" dev=xvde1 ino=390916 scontext=system_u:system_r:systemd_tmpfiles_t:s0 tcontext=system_u:object_r:var_t:s0 tclass=dir
[4206590.091173] type=1400 audit(1313686660.725:4): avc: denied { add_name } for pid=338 comm="systemd-tmpfile" name="man" scontext=system_u:system_r:systemd_tmpfiles_t:s0 tcontext=system_u:object_r:var_t:s0 tclass=dir
[4206590.091294] type=1400 audit(1313686660.725:5): avc: denied { create } for pid=338 comm="systemd-tmpfile" name="man" scontext=system_u:system_r:systemd_tmpfiles_t:s0 tcontext=system_u:object_r:var_t:s0 tclass=dir
[4206590.094834] type=1400 audit(1313686660.729:6): avc: denied { relabelfrom } for pid=338 comm="systemd-tmpfile" name="man" dev=xvde1 ino=391675 scontext=system_u:system_r:systemd_tmpfiles_t:s0 tcontext=system_u:object_r:var_t:s0 tclass=dir
[4206591.105615] type=1400 audit(1313686661.739:7): avc: denied { getattr } for pid=375 comm="modprobe" path="socket:[10320]" dev=sockfs ino=10320 scontext=system_u:system_r:insmod_t:s0 tcontext=system_u:system_r:init_t:s0 tclass=unix_stream_socket
[4206591.303845] ip6_tables: (C) 2000-2006 Netfilter Core Team
[4206591.361326] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[4206591.662677]
[4206591.662679] =============================================
[4206591.662699] [ INFO: possible recursive locking detected ]
[4206591.662706] 3.0.0-1.fc16.x86_64 #1
[4206591.662712] ---------------------------------------------
[4206591.662719] systemd-logind/356 is trying to acquire lock:
[4206591.662726] (&ep->mtx){+.+.+.}, at: [<ffffffff8116ee12>] ep_scan_ready_list+0x3a/0x19f
[4206591.662747]
[4206591.662748] but task is already holding lock:
[4206591.662756] (&ep->mtx){+.+.+.}, at: [<ffffffff8116f38c>] sys_epoll_ctl+0x120/0x51d
[4206591.662770]
[4206591.662770] other info that might help us debug this:
[4206591.662778] Possible unsafe locking scenario:
[4206591.662779]
[4206591.662787] CPU0
[4206591.662791] ----
[4206591.662795] lock(&ep->mtx);
[4206591.662802] lock(&ep->mtx);
[4206591.662809]
[4206591.662809] *** DEADLOCK ***
[4206591.662810]
[4206591.662818] May be due to missing lock nesting notation
[4206591.662820]
[4206591.662828] 2 locks held by systemd-logind/356:
[4206591.662835] #0: (epmutex){+.+.+.}, at: [<ffffffff8116f344>] sys_epoll_ctl+0xd8/0x51d
[4206591.662850] #1: (&ep->mtx){+.+.+.}, at: [<ffffffff8116f38c>] sys_epoll_ctl+0x120/0x51d
[4206591.662865]
[4206591.662866] stack backtrace:
[4206591.662874] Pid: 356, comm: systemd-logind Not tainted 3.0.0-1.fc16.x86_64 #1
[4206591.662883] Call Trace:
[4206591.662894] [<ffffffff8108a375>] __lock_acquire+0x917/0xcf7
[4206591.662905] [<ffffffff8100af3a>] ? dump_trace+0x2fe/0x30d
[4206591.662915] [<ffffffff81006399>] ? xen_force_evtchn_callback+0xd/0xf
[4206591.662925] [<ffffffff81006942>] ? check_events+0x12/0x20
[4206591.662933] [<ffffffff8116ee12>] ? ep_scan_ready_list+0x3a/0x19f
[4206591.662942] [<ffffffff8108abe2>] lock_acquire+0xbf/0x103
[4206591.662951] [<ffffffff8116ee12>] ? ep_scan_ready_list+0x3a/0x19f
[4206591.662960] [<ffffffff81087ea6>] ? save_trace+0x3d/0xa7
[4206591.662969] [<ffffffff8116ee12>] ? ep_scan_ready_list+0x3a/0x19f
[4206591.662978] [<ffffffff8116e8bf>] ? ep_remove+0xb4/0xb4
[4206591.662988] [<ffffffff814dc8f2>] __mutex_lock_common+0x4c/0x361
[4206591.662997] [<ffffffff8116ee12>] ? ep_scan_ready_list+0x3a/0x19f
[4206591.663007] [<ffffffff81006399>] ? xen_force_evtchn_callback+0xd/0xf
[4206591.663016] [<ffffffff81006942>] ? check_events+0x12/0x20
[4206591.663025] [<ffffffff8108986b>] ? mark_lock+0x2d/0x220
[4206591.663033] [<ffffffff81006942>] ? check_events+0x12/0x20
[4206591.663042] [<ffffffff8116e8bf>] ? ep_remove+0xb4/0xb4
[4206591.663050] [<ffffffff814dcd16>] mutex_lock_nested+0x40/0x45
[4206591.663059] [<ffffffff8116ee12>] ep_scan_ready_list+0x3a/0x19f
[4206591.663068] [<ffffffff8116ef77>] ? ep_scan_ready_list+0x19f/0x19f
[4206591.663077] [<ffffffff8116ef8e>] ep_poll_readyevents_proc+0x17/0x19
[4206591.663086] [<ffffffff8116eb9d>] ep_call_nested.constprop.3+0x90/0xcc
[4206591.663095] [<ffffffff8116ecc2>] ep_eventpoll_poll+0x4e/0x5d
[4206591.663103] [<ffffffff8116f49e>] sys_epoll_ctl+0x232/0x51d
[4206591.663112] [<ffffffff8116ea58>] ? ep_set_mstimeout+0x49/0x49
[4206591.663122] [<ffffffff814e4fc2>] system_call_fastpath+0x16/0x1b

If I set the runlevel to 3 it boots to a login prompt, though the DEADLOCK message can still be seen. It's ami-d38e4eba in the AWS-Fedora account. (and it is still private)

OK, so image building and uploading now work fine. We still have to get cloud-init into Fedora to be able to produce working images. I believe that we will have working F-16 Beta images.

Going to close now; we pushed F16 Beta EC2 images.
