Wrong access permissions of authorized keys file and parent directory when using absolute AuthorizedKeysFile

Bug #1911680 reported by Kevin Rogers
260
This bug affects 1 person
Affects Status Importance Assigned to Milestone
cloud-init
Fix Released
High
Dan Watkins

Bug Description

[Impact]

For users who provide more than one unique SSH key to cloud-init and have a shared AuthorizedKeysFile configured in sshd_config, cloud-init 20.4 started writing all of these keys to such a file, granting all such keys SSH access as root. Previously, cloud-init did not handle such sshd_config configuration, and so would have written users' keys to the hardcoded `~/.ssh/authorized_keys` unconditionally.

[Test Case]

This test case has been written for use with the cloud-init integration testing framework:

```
ABSOLUTE_AUTH_KEYS_PATH = "/etc/ssh/authorized_keys"
BACKDOOR_KEYS = "/etc/backdoor_keys"
USER_DATA_TMPL = """\
#cloud-config
bootcmd:
- sed -i "s,#AuthorizedKeysFile.*,AuthorizedKeysFile {0} {1}," /etc/ssh/sshd_config
runcmd:
- |
    cat << EOF > {1}
    {{}}
    EOF
""".format(ABSOLUTE_AUTH_KEYS_PATH, BACKDOOR_KEYS)

# This test checks that we do not write keys into an absolute
# AuthorizedKeysFile by default, because doing so presents a security issue.
#
# Because we're intentionally testing that we do not write authorized keys to
# the location configured for sshd, we cannot SSH into the system using keys in
# that path. We use runcmd to populate a second configured path to grant us
# access; runcmd will install the key after all the SSH key determination has
# been completed, therefore not affecting it.

def test_test(session_cloud, setup_image):
    # We shouldn't write out absolute files until we've figured out how to do
    # it right
    user_data = USER_DATA_TMPL.format(
        session_cloud.cloud_instance.key_pair.public_key_content
    )
    with session_cloud.launch(user_data=user_data) as client:
        assert client.execute(
            "test -f {}".format(ABSOLUTE_AUTH_KEYS_PATH)
        ).failed
```

[Regression Potential]

This update touches only cloud-init's SSH handling, by reverting a commit which modified its behaviour. It has been present in Ubuntu images since the 6th of January.

Users who have started relying on the new behaviour (cloud-init writing keys to non-standard AuthorizedKeysFile locations configured in sshd_config) in the past ~2 weeks will see this behaviour stop working.

[Original Report]

Starting on the 6th January 2021 we started observing SSH authentication issues in AWS AMI builds.

We have SSH configured with an absolute (i.e. rather than per-user) authorised keys file, e.g.

   AuthorizedKeysFile /etc/ssh/authorized_keys

We observed that the file and parent folder permissions had been modified, to:

   /etc/ssh - 0700
   /etc/ssh/authorized_key - 0600

These permissions would be fine if the authorised keys file were in a users home directory, but not for a centrally owned absolute file.

We investigated and identified that between the 4th and 6th January 2021, cloud-init on Ubuntu 16.04 was upgraded from 20.3-2-g371b392c-0ubuntu1~16.04.1 to 20.4-0ubuntu1~16.04.1. The newer version included the following fix, that led to the problem, although prior commits set the scene for this commit to cause us trouble.

https://github.com/canonical/cloud-init/commit/b0e73814db4027dba0b7dc0282e295b7f653325c

While trying workarounds (e.g. oneshot service to revert permissions), we then ran into another change that appended exit(142) to the command option:

https://github.com/canonical/cloud-init/commit/e161059a18173e2b61c54dba9eab774401fb5f1f

which then meant, as root is disabled, that SSH would not work using the authorised key pair for any user. This is because cloud-init first writes the key for the user (e.g. ubuntu) and in our case writing the key to /etc/ssh/authorized_keys, then writes the key for the disabled root user to the same location, overwriting the previous write.

There are similarities to https://bugs.launchpad.net/cloud-init/+bug/1839061, but this is a different issue.

description: updated
Revision history for this message
Dan Watkins (oddbloke) wrote :

Hi Kevin,

Thanks for using cloud-init, and for taking the time to file such a detailed bug! Apologies for having broken your workflow.

I can reproduce the issue that you're seeing locally, and I'm going to start digging into what we need to do to address it.

I have one clarifying question: when launching an image with an older cloud-init and sshd_config as described, I don't see /etc/ssh/authorized_keys created or populated. As best I can tell, cloud-init places keys in per-user .ssh/authorized_keys regardless of the AuthorizedKeysFile set in sshd_config. Does this match the behaviour you were seeing previously? (I believe this is the bug that we were setting out to fix in the commits you've linked, so I'm not necessarily surprised by this: I just want to make sure that we're all on the same page as far as expected behaviour goes.)

Thanks!

Dan

Changed in cloud-init:
assignee: nobody → Dan Watkins (oddbloke)
status: New → In Progress
importance: Undecided → High
Revision history for this message
Dan Watkins (oddbloke) wrote :

_IF_ it is the case that cloud-init didn't previously populate absolute AuthorizedKeysFiles, then I think we can split this into two pieces: one is to remediate the regression (i.e. go from writing broken absolute files to not writing absolute files again), the other is to introduce proper support for writing to absolute files.

This latter is perhaps more complicated than it seems on its face. Keys come to cloud-init in two ways: from the cloud's metadata, and from user-data. On some clouds (e.g. EC2), the key in metadata is attached to the system but on others (e.g. GCE), every key is tied to a specific user. In user-data, they can either be specified as a top-level key (implicitly the default user's key), or against specific users. It's not immediately clear to me what the appropriate behaviour is when confronted with an absolute/global AuthorizedKeysFile: some users will expect every key specified to the system to end up in there, whereas others might be (justifiably) surprised to see keys specified for non-root users granting SSH access to the root user. We'll need to think through these semantics carefully, I think.

(Another thought: a brief examination of the sshd AUTHORIZED_KEYS FILE FORMAT docs indicate that we can't limit an authorized key to a specific user in an absolute AuthorizedKeysFile. This affects the above conversation, but also means we can't disable root SSH login in the way that we usually do.)

Revision history for this message
Dan Watkins (oddbloke) wrote :

I've just opened https://github.com/canonical/cloud-init/pull/773. I don't expect this implementation to land in its current form, but it gives us something concrete to form our conversation around.

Dan Watkins (oddbloke)
information type: Public → Private Security
Revision history for this message
Dan Watkins (oddbloke) wrote :

Upon further thought and conversation, we've realised that this represents a security issue. If the following user-data is specified (note that the bootcmd is used to modify SSH's configuration to match the reported one; this happens before SSH starts, and before cloud-init performs any SSH modification and so allows for testing this bug without mastering a new image):

```
#cloud-config
bootcmd:
- sed -i "s,#AuthorizedKeysFile.*,AuthorizedKeysFile /etc/ssh/authorized_keys," /etc/ssh/sshd_configusers:
- name: test_user
  ssh_authorized_keys:
  - ssh-rsa <redacted key material>
```

and the SSH key specified for `test_user` is different to the default SSH key provided for the system, then /etc/ssh/authorized_keys will end up with this content:

```
ssh-rsa <test_user's key material>
no-port-forwarding,no-agent-forwarding,no-X11-forwarding,command="echo 'Please login as the user \"NONE\" rather than the user \"root\".';echo;sleep 10;exit 142" ssh-rsa <default key's material>
```

and SSHing to the system as root using test_user's key will succeed.

(It has to be as root because cloud-init modifies /etc/ssh and /etc/ssh/authorized_keys to only be accessible/readable by root, and sshd uses the connecting user's permissions to read their authorized keys. But sshd will accept the key for _any_ user that can read /etc/ssh/authorized_keys.)

Revision history for this message
Seth Arnold (seth-arnold) wrote :

Let me start with a caveat that I'm fairly removed from operations and thus unfamiliar with common deployment styles.

Is it correct that openssh configured with AuthorizedKeysFile /etc/ssh/authorized_keys will allow any of the listed keys to log in as any user account that can read /etc/ssh/authorized_keys?

What happens if the sshd_config has multiple AuthorizedKeysFile settings, within different Match blocks? Admins could write the Match blocks to have different AllowUsers or AllowGroups, etc.

There seems to be so much flexibility in how these can be combined that trying to codify cloud-init handling of AuthorizedKeysFile settings different from the defaults may be quite difficult to get correct.

Thanks

Revision history for this message
Dan Watkins (oddbloke) wrote : Re: [Bug 1911680] Re: Wrong access permissions of authorized keys file and parent directory when using absolute AuthorizedKeysFile

Thanks Seth, your thoughts are much appreciated!

On Fri, Jan 15, 2021 at 12:03:49AM -0000, Seth Arnold wrote:
> Is it correct that openssh configured with AuthorizedKeysFile
> /etc/ssh/authorized_keys will allow any of the listed keys to log in as
> any user account that can read /etc/ssh/authorized_keys?

This is my understanding, yes.

> What happens if the sshd_config has multiple AuthorizedKeysFile
> settings, within different Match blocks? Admins could write the Match
> blocks to have different AllowUsers or AllowGroups, etc.

This is an excellent question, which I had not considered previously.
We'll need to consider these potential configurations when figuring out
how to reintroduce this support.

(It is perhaps worth noting that cloud-init only performs its SSH key
configuration at first boot: while it's certainly possible that people
could bake images containing complex configs (and we therefore need to,
at the very least, not break them), if they are applied by config
management, or manually, then they won't be present when the relevant
cloud-init code runs.)

> There seems to be so much flexibility in how these can be combined that
> trying to codify cloud-init handling of AuthorizedKeysFile settings
> different from the defaults may be quite difficult to get correct.

My understanding is that the problem the submitter was trying to fix is
the simple case where there is a single non-default AuthorizedKeysFile
setting, which applies globally. Provided we can (a) reliably identify
which AuthorizedKeysFile setting (if any) applies globally, and (b)
reliably identify that the value of the setting is user-specific, then I
think we can handle that particular case.

(a) will require more investigation, but (b) we can do, I think: if the
setting is relative or contains %h, %u or %U, then it is user-specific.

It is clear to me that we cannot provide generic handling of anything
but the most simple cases, without getting into a security minefield. If
folks want to apply more complex configs, then they should treat
cloud-init's authorized key support as bootstrapping instance access,
and apply such configurations separately. (Such application could happen
using `runcmd` or `write_files` cloud-init config, which would still
allow such SSH configurations to be applied without use of a separate
tool.)

Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :

The attached log contains a single failure; the cloud-init integration tests have a race condition around reboots which triggers on EC2, causing the failure. https://github.com/canonical/pycloudlib/pull/92 is in-progress work to address this test framework issue.

Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :

This log exhibits the same failure as the groovy EC2 log.

Dan Watkins (oddbloke)
description: updated
Revision history for this message
Dan Watkins (oddbloke) wrote :

This contains a single failure: the static route configuration tests appears to be broken on my local LXD setup (I think because the network config specified conflicts with my LXD networking; I haven't had time to investigate).

Revision history for this message
Dan Watkins (oddbloke) wrote :

The attached log contains three test runs: we were seeing failure in instance teardown which isn't material to the actual testing, but I re-ran the failing tests (in some cases twice) to be sure.

Revision history for this message
Dan Watkins (oddbloke) wrote :

Same single failure as the other EC2 logs.

Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :

Same failure as other EC2 runs, plus a test_upgrade failure (also due to the rebooting issue).

Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :

The attached log contains three test runs, required by our GCE flakiness. The NTP test continues to fail for reasons I haven't dug into: it isn't relevant to the code being SRU'd, so I don't believe it is a blocker.

Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :

Two test runs in the attached log.

Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :

I have realised that the above regression testing runs do not include the [Test Case] testcase; I've run it separately on each series/platform and will now attach the results.

(As each test run is shorter, I'll combine all series into a single log file.)

Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Dan Watkins (oddbloke) wrote :
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Please give me a sign whenever you'd like this update to go out.

Dan Watkins (oddbloke)
information type: Private Security → Public Security
Revision history for this message
ret2libc (sirmy15) wrote :

Do you intend to assign a CVE for this security issue?

Revision history for this message
Richard Harding (rharding) wrote :

No, after consulting with our security team and the fact that this needed the user to perform out-of-the-way configuration of the system in a very specific way that is quite uncommon (allowing shared ssh keys on the system level vs user level, it was decided a CVE/USN was not necessary for this issue.

Revision history for this message
James Falcon (falcojr) wrote :

This was released in the 20.4.1 hotfix

Changed in cloud-init:
status: In Progress → Fix Released
Revision history for this message
James Falcon (falcojr) wrote :
To post a comment you must log in.
This report contains Public Security information  
Everyone can see this security related information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.