#9071 Cannot log in to test machines
Closed: Fixed 3 years ago by kevin. Opened 3 years ago by thm.

Trying to log in via SSH, e.g. to rawhide-test.fedorainfracloud.org, does not work for me. It yields:

thm@rawhide-test.fedorainfracloud.org: Permission denied (publickey).

The same SSH key works fine e.g. on fedorapeople.org.


Metadata Update from @mohanboddu:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: groomed, medium-gain, medium-trouble

3 years ago

ok. I think I have this fixed.

It was a combination of two things: a bug in fas-clients that we had fixed for our f32 hosts but not for rawhide (I built the fixed package for rawhide and updated this machine), and /home/fedora being both the home directory of the cloud 'fedora' user and the directory where all the Fedora users' home directories live. :)

Please test, and re-open if you still see any problems!

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago

Here's the result of a quick test with ansible -a "uname -r" for all machines listed here: https://fedoraproject.org/wiki/Test_Machine_Resources_For_Package_Maintainers

Name                                 Result
rawhide-test.fedorainfracloud.org    works
f32-test.fedorainfracloud.org        works
f31-test.fedorainfracloud.org        Connection closed by 34.220.107.234 port 22
f30-test.fedorainfracloud.org        works
el7-test.fedorainfracloud.org        Permission denied (publickey).
el6-test.fedorainfracloud.org        Permission denied (publickey).
ppc64le-test.fedorainfracloud.org    works
armv7-test01.fedorainfracloud.org    Connection timed out
armv7-test02.fedorainfracloud.org    Connection timed out
aarch64-test01.fedorainfracloud.org  Connection timed out
aarch64-test02.fedorainfracloud.org  Connection timed out
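
The results above mix different kinds of failure: key rejection, a dropped connection, and hosts that do not answer at all. As a rough sketch (the function names and category labels are hypothetical, not part of any Fedora tooling), the messages could be bucketed like this to separate authentication problems from hosts that are simply down:

```python
def classify_ssh_result(message):
    """Map a result string from the table to a coarse category.

    The categories ("ok", "auth", "timeout", "closed", "unknown") are
    illustrative labels chosen for this sketch.
    """
    text = message.strip().lower()
    if text == "works":
        return "ok"
    if "permission denied" in text:
        return "auth"      # sshd answered but rejected the key
    if "timed out" in text:
        return "timeout"   # host did not answer at all
    if "connection closed" in text:
        return "closed"    # sshd answered, then dropped the session
    return "unknown"


def group_hosts(results):
    """Group (host, message) pairs by category."""
    groups = {}
    for host, message in results:
        groups.setdefault(classify_ssh_result(message), []).append(host)
    return groups
```

Hosts in the "auth" bucket are the ones where account or key syncing is the likely suspect; "timeout" hosts are unreachable for other reasons.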

Metadata Update from @thm:
- Issue status updated to: Open (was: Closed)

3 years ago

I fixed el6/el7 a few days ago, and I just fixed f31. f30 is EOL, so I dropped it.

The arm ones are down due to the datacenter move.

Can you check them again and close this if they are all OK (aside from the arm ones)?

@mobrien is working on f32-test, and the arm ones are down due to DC move, so I am going to close this one... please re-open or file a new ticket if you can't get to any (except those that are down)

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago

Trying to log in to el8-test.fedorainfracloud.org and aarch64-test01.fedorainfracloud.org yields "Permission denied (publickey)."

Metadata Update from @thm:
- Issue status updated to: Open (was: Closed)

3 years ago

Yeah, seems to be some issue with fasClient sadly.

Perhaps @mobrien or @pingou could investigate more.

@pingou could you look since you looked the last time this happened?

It also seems to be happening on some of our newer instances, like the proxies (also f33):

Subject: Cron <root@proxy03> /usr/local/bin/lock-wrapper fasClient "/bin/sleep $(($RANDOM % 3600)); /usr/bin/fasClient -i |& grep -vi deprecation | /usr/local/bin/nag-once fassync 1d 2>&1"

lstat(/home/fedora/bodanel/.ssh) failed: No such file or directory
Traceback (most recent call last):
  File "/usr/bin/fasClient", line 937, in <module>
    fas.create_ssh_keys(users)
  File "/usr/bin/fasClient", line 725, in create_ssh_keys
    selinux.restorecon(ssh_dir.decode('utf-8'), recursive=True)
  File "/usr/lib64/python3.9/site-packages/selinux/__init__.py", line 94, in restorecon
    selinux_restorecon(os.path.expanduser(path), restorecon_flags)
  File "/usr/lib64/python3.9/site-packages/selinux/__init__.py", line 457, in selinux_restorecon
    return _selinux.selinux_restorecon(pathname, restorecon_flags)
FileNotFoundError: [Errno 2] No such file or directory

So, I suspect SELinux or authconfig or something. In the end we are going to have to figure out how we want to set up auth for the maintainer-test instances when we move everything else to sssd and ipa.

So it fails while restoring the SELinux context on the folder /home/fedora/bodanel/.ssh/, which does not exist on disk.

I have a "patch" that just bypasses that call when the folder is not found (around line 721):

            if have_selinux:
                username = self.users[uid]['username']
                ssh_dir = to_bytes(os.path.join(home_dir_base, username, '.ssh'))
                # Only restore the SELinux context when the .ssh folder exists;
                # restorecon raises FileNotFoundError on a missing path.
                if os.path.exists(ssh_dir):
                    log.debug('Restoring SELinux context on %s', ssh_dir)
                    selinux.restorecon(ssh_dir.decode('utf-8'), recursive=True)
                else:
                    log.debug('Skipping restoring SELinux on %s, folder not found', ssh_dir)
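
For anyone who wants to try the guard outside of fasClient, here is a minimal standalone sketch of the same idea. It is not the fasClient code: the function name restore_ssh_context and the injected restorecon parameter are assumptions made so the snippet runs without the selinux bindings installed. In fasClient the callable would be selinux.restorecon.

```python
import logging
import os

log = logging.getLogger("fasclient-sketch")


def restore_ssh_context(home_dir_base, username, restorecon):
    """Call restorecon on <home_dir_base>/<username>/.ssh only if it exists.

    `restorecon` is a hypothetical injected callable standing in for
    selinux.restorecon. Returns True if it was called, False if skipped.
    """
    ssh_dir = os.path.join(home_dir_base, username, ".ssh")
    if not os.path.isdir(ssh_dir):
        # Missing folder: skip instead of letting restorecon raise
        # FileNotFoundError and abort the whole run.
        log.debug("Skipping SELinux restore on %s, folder not found", ssh_dir)
        return False
    log.debug("Restoring SELinux context on %s", ssh_dir)
    restorecon(ssh_dir, recursive=True)
    return True
```

Note that this only avoids the crash; it does not explain why the .ssh folder is missing in the first place.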

Now, I have no idea why the .ssh folder doesn't exist.

Should I patch the rpm?

I've looked at el8-test.fedorainfracloud.org, the error is:

Traceback (most recent call last):
  File "/usr/bin/fasClient", line 934, in <module>
    fas.create_ssh_keys(users)
  File "/usr/bin/fasClient", line 708, in create_ssh_keys
    pw = pwd.getpwuid(int(uid))
KeyError: 'getpwuid(): uid not found: 185131'

However, the uid in the error changed on every run. I tried disabling SELinux, but the error persisted.
It looks like the users are not being created: su - cverna returns "user cverna does not exist".
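
To see how many accounts are affected rather than stopping at the first KeyError, something like the following sketch could help. The helper name find_missing_uids and its getpwuid parameter are hypothetical; only pwd.getpwuid itself comes from the traceback above.

```python
import pwd


def find_missing_uids(uids, getpwuid=pwd.getpwuid):
    """Return the uids from `uids` that have no passwd entry.

    pwd.getpwuid raises KeyError for an unknown uid, which is the error
    in the traceback above. The `getpwuid` parameter is only there so
    the lookup can be swapped out when testing.
    """
    missing = []
    for uid in uids:
        try:
            getpwuid(int(uid))
        except KeyError:
            missing.append(int(uid))
    return missing
```

If most or all of the uids fasClient expects come back as missing, that would be consistent with the users not being created at all, rather than with a problem specific to one account.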

Looking at aarch64-test01.fedorainfracloud.org the traceback is:

lstat(/home/fedora/kpvdr/.ssh) failed: No such file or directory
Traceback (most recent call last):
  File "/usr/bin/fasClient", line 937, in <module>
    fas.create_ssh_keys(users)
  File "/usr/bin/fasClient", line 725, in create_ssh_keys
    selinux.restorecon(ssh_dir.decode('utf-8'), recursive=True)
  File "/usr/lib64/python3.9/site-packages/selinux/__init__.py", line 94, in restorecon
    selinux_restorecon(os.path.expanduser(path), restorecon_flags)
  File "/usr/lib64/python3.9/site-packages/selinux/__init__.py", line 457, in selinux_restorecon
    return _selinux.selinux_restorecon(pathname, restorecon_flags)
FileNotFoundError: [Errno 2] No such file or directory

I've applied the same patch as on proxy13 (cf. https://pagure.io/fedora-infrastructure/issue/9071#comment-714826 ) and fasClient completed fine there.

Sorry for the delay here. They should all be fixed now. :)

Please re-open or file a new ticket if you see anything further having issues.

Metadata Update from @kevin:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)

3 years ago
