Controlling permissions in the AWS cloud

Cloud security is considered one of the most difficult challenges for organizations to tackle. On-premises systems often secure the perimeter by physically limiting access over the network (think of a private network protected by firewalls); the cloud takes a different form altogether. In the cloud, resources are generally reachable from the internet but are protected by a system that determines who you are and what you can do. While the same could be done on-premises, the cloud has made it easier and more accessible. It does, however, mean that security relies heavily on the Identity and Access Management (IAM) system. This provides both benefits and challenges for organizations and developers alike.

How traditional on-premises security differs from the internet-accessible approach in the cloud.

When access is determined by a permission model, it becomes important to specify these permissions accurately and deliberately, because failure to do so exposes the organization to risks such as unexpected costs and, potentially, a security breach. A best practice is to adhere to the principle of least-privilege access: grant a user or resource only the permissions it needs to perform its legitimate purpose, nothing more. This may sound like a simple task, which it arguably is, but in practice there is a rather big elephant in the room to tackle: privilege escalation.

One might expect that developers cannot grant themselves or an AWS service permissions they do not possess, e.g. to use an expensive service.

The problem at hand is that for developers to apply the principle of least privilege they must have the ability to create these tailored identities and permissions themselves. In practice this means that they can create roles that have more permissions than they were initially given, ergo they can effectively escalate their permissions. Now mind that this does not have to be a malicious or deliberate action. Developers can also create a service and grant it access that they should not have been able to grant. Counterintuitively, the application of least-privilege access may result in a higher risk for the cloud account-holder.

In reality, it is possible to create a role that grants permission to start services that developers cannot start themselves.
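To make the escalation concrete, here is a minimal sketch in plain Python. It does not call AWS; it only mimics how an Allow list with wildcards matches an action. The permission sets and the chosen service are hypothetical and picked purely for illustration.

```python
from fnmatch import fnmatchcase


def allows(allow_patterns, action):
    """Return True if any Allow pattern (IAM-style * wildcard) matches the action."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in allow_patterns)


# Hypothetical permissions of a developer: manage roles and Lambda functions only.
developer_allows = ["iam:CreateRole", "iam:PutRolePolicy", "iam:PassRole", "lambda:*"]

# Hypothetical role the developer creates and passes to a Lambda function.
# Nothing in the developer's own permissions limits what this role may contain.
function_role_allows = ["sagemaker:*", "s3:GetObject"]

action = "sagemaker:CreateTrainingJob"
developer_can = allows(developer_allows, action)
role_can = allows(function_role_allows, action)

print(f"developer may call {action}: {developer_can}")
print(f"role created by the developer may call {action}: {role_can}")
```

The developer cannot call the service directly, yet the function running with the role they created can. That gap is the escalation this article is about.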

The good news is that we can mitigate this risk. The bad news is that there is no one-size-fits-all approach; choices will have to be made. As an organization we can deal with this problem in three distinct ways: we can let go of the principle of least privilege and make developers use pre-defined roles; we can use permission boundaries to limit what developers can do; or we can apply service control policies to the whole account.

Pre-defining roles

The issue at hand is that developers can create roles with permissions at will, even if the permissions exceed what the developers can do themselves. Removing the ability to create roles and set permissions circumvents the problem in its entirety. Under these circumstances it is necessary to pre-define roles that can then be used, otherwise development becomes impossible.

It is important to realize that these roles will not be least-privilege. When pre-defining roles a larger set of permissions has to be included than strictly necessary to keep them usable by different teams for different products. This violates the best practice we have established: bespoke roles with only the required permissions for each user and service.

Pre-defined roles make it impossible for developers to assign more permissions than defined.
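As an illustration, a developer policy for this approach could look roughly like the sketch below, built as a Python dictionary and printed as JSON. The account ID is a placeholder, the "predefined/" role path is a convention we made up for the example, and the list of denied actions is not exhaustive.

```python
import json

ACCOUNT_ID = "123456789012"  # placeholder account ID
# Hypothetical convention: the IAM administrator creates all usable roles under this path.
PREDEFINED_ROLES = f"arn:aws:iam::{ACCOUNT_ID}:role/predefined/*"

developer_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Developers may hand pre-defined roles to services...
            "Sid": "PassPredefinedRolesOnly",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": PREDEFINED_ROLES,
        },
        {
            # ...but may not create roles or change role permissions (not exhaustive).
            "Sid": "NoRoleAuthoring",
            "Effect": "Deny",
            "Action": [
                "iam:CreateRole",
                "iam:AttachRolePolicy",
                "iam:PutRolePolicy",
            ],
            "Resource": "*",
        },
    ],
}

print(json.dumps(developer_policy, indent=2))
```

Simply not granting the role-authoring actions would have the same effect; the explicit Deny only makes the intent visible and guards against another policy granting them later.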

Using permission boundaries

Whenever an action is undertaken by a user or service, several kinds of policies are evaluated by AWS to determine whether access should be granted. Over time more types of policies were added, and in mid-2018 AWS introduced permission boundaries as a means of limiting the permissions that can be granted to users or services. For example, the policies attached to a role cannot grant permissions that the role's permissions boundary does not allow, meaning that the permissions granted can never exceed those in the permissions boundary.
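The effect of a boundary can be pictured as an intersection: an action is effectively allowed only if both the role's own policy and the boundary allow it. The sketch below models just that rule in plain Python. It is a deliberate simplification that ignores explicit denies, conditions, resource-based policies and SCPs, and the permission sets are hypothetical.

```python
from fnmatch import fnmatchcase


def matches(patterns, action):
    """Return True if any IAM-style pattern (with * wildcard) matches the action."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


def is_allowed(action, role_allows, boundary_allows):
    """Effective permission: allowed by the role's policy AND by its boundary."""
    return matches(role_allows, action) and matches(boundary_allows, action)


# Hypothetical boundary set by the IAM administrator.
boundary = ["s3:*", "dynamodb:*", "logs:*"]

# Hypothetical policy a developer attached to a role carrying that boundary.
role_policy = ["s3:GetObject", "sagemaker:*"]

for action in ["s3:GetObject", "sagemaker:CreateTrainingJob", "dynamodb:GetItem"]:
    print(action, is_allowed(action, role_policy, boundary))
```

Note the last case: the boundary on its own grants nothing. It only caps what the role's policy can grant.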

While the addition of permission boundaries was a needed improvement, their usage comes with some ifs and buts. Two main issues can be identified: applying boundaries is a continuous and manual effort by developers, and privilege escalation is still possible unless an inconvenient naming convention is introduced.

After a permissions boundary is created and enforced, a developer is required to assign that boundary to every single role he or she creates. This adds to the mental overhead and does not make security simpler; worse, it makes it more cumbersome.

Yet even with a permissions boundary applied, developers can simply circumvent it unless roles follow an enforced naming convention: without one, their IAM permissions are not restricted to the roles they created themselves. The choice is between adding a tedious naming convention or letting go of some of the guarantees that we were trying to create with the boundary in the first place.

The boundary makes sure that roles cannot exceed the permissions defined in the boundary. This requires some choreography so that developers cannot create roles without permission boundaries.

Optional: Deep dive on using permission boundaries

This section provides a look into why permission boundaries can be described as cumbersome.

To make use of permission boundaries there are several things to take care of. First, a managed policy needs to be created by the IAM administrator to act as the permissions boundary. Next, the trusted developers are granted permissions to create and modify roles(1), such as creating a role (iam:CreateRole), attaching a policy to a role (iam:AttachRolePolicy), and changing an inline policy (iam:PutRolePolicy), but with two limitations:

  1. An "iam:PermissionsBoundary" IAM condition with the matching permissions boundary resource name is present.
  2. The resources adhere to a naming scheme invented to ensure we only modify “our own” roles.

Finally, the iam:PassRole action is granted only on resources (i.e. roles) adhering to the naming scheme to ensure, again, that we only pass roles that we created and that therefore carry a permissions boundary.
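Put together, the developer policy described above might look roughly like the following sketch, again built as a Python dictionary and printed as JSON. The account ID is a placeholder, and both the boundary policy name "developer-boundary" and the "app-" role prefix are hypothetical names chosen for this example.

```python
import json

ACCOUNT_ID = "123456789012"  # placeholder account ID
# Hypothetical managed policy created by the IAM administrator to act as the boundary.
BOUNDARY_ARN = f"arn:aws:iam::{ACCOUNT_ID}:policy/developer-boundary"
# Hypothetical naming scheme: developers only touch roles whose name starts with "app-".
OWN_ROLES = f"arn:aws:iam::{ACCOUNT_ID}:role/app-*"

developer_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Limitation 1: the boundary must be present (Condition).
            # Limitation 2: only "our own" roles may be touched (Resource).
            "Sid": "ManageOwnRolesWithBoundary",
            "Effect": "Allow",
            "Action": ["iam:CreateRole", "iam:AttachRolePolicy", "iam:PutRolePolicy"],
            "Resource": OWN_ROLES,
            "Condition": {"StringEquals": {"iam:PermissionsBoundary": BOUNDARY_ARN}},
        },
        {
            # Only roles following the naming scheme can be passed to a service.
            "Sid": "PassOwnRolesOnly",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": OWN_ROLES,
        },
    ],
}

print(json.dumps(developer_policy, indent=2))
```

The sketch leaves out everything else a developer needs, and it relies on actions such as iam:DeleteRolePermissionsBoundary not being granted anywhere else. That dependency is part of what makes this approach cumbersome to get right.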

  (1) The AWS IAM documentation lists which actions can be allowed only if the permissions boundary is present.


Applying service control policies

We can use an alternative approach that does not come with the mental overhead of permission boundaries, namely service control policies (SCPs). Released at the start of 2017 with AWS Organizations, SCPs are policies that you apply on top of an account. They require the use of AWS Organizations, a service that is used to manage AWS accounts. SCPs serve as a cage of sorts, one that nothing within an account can escape: no service, not even the administrator (the one exception being the organization's management account, which SCPs do not restrict). Hence the name: control policies for services.

SCPs allow us to keep using least-privilege bespoke roles without the possibility of escalating privileges to an unacceptable degree. Developers cannot elevate their permissions, and they cannot grant services more access than is ultimately allowed by an SCP.

In that sense, SCPs are a guardrail and not an absolute measure that removes all risk altogether. An SCP is like a pre-defined role: it defines in a coarse-grained manner what is acceptable and what is not. With it, developers still have to be trusted not to grant too much access to services they develop, but now they are guarded against accidental escalation. We believe that this is an acceptable middle ground, as the worst risks are mitigated without hindering developers in their work.

Service control policies define what is and isn’t allowed for any principal in the AWS account.
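The cage can be sketched in the same simplified way as before: an explicit deny in the SCP always wins, and otherwise an action must be allowed by both the SCP and the principal's own policy. The model below is plain Python with hypothetical permission sets; it leaves out conditions, resources and the other policy types.

```python
from fnmatch import fnmatchcase


def matches(patterns, action):
    """Return True if any IAM-style pattern (with * wildcard) matches the action."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


def is_allowed(action, identity_allows, scp_allows, scp_denies):
    """Simplified evaluation: explicit SCP deny wins, then both layers must allow."""
    if matches(scp_denies, action):
        return False
    return matches(scp_allows, action) and matches(identity_allows, action)


# Hypothetical SCP: allow everything except one expensive service.
scp_allows = ["*"]
scp_denies = ["sagemaker:*"]

# Even an administrator with full access inside the account stays within the SCP.
administrator = ["*"]

admin_sagemaker = is_allowed("sagemaker:CreateTrainingJob", administrator, scp_allows, scp_denies)
admin_s3 = is_allowed("s3:GetObject", administrator, scp_allows, scp_denies)
print(admin_sagemaker, admin_s3)
```

Because the check applies to every principal in the account, a role created by a developer is capped in exactly the same way, without the developer having to attach anything to it.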

Optional: Deep dive on using SCPs

Two flavors of SCPs exist, namely the deny-list and the allow-list approach. A deny-list allows all actions by default (Allow *:*) and explicitly denies specific ones, while an allow-list denies all actions by default (an implicit deny) and explicitly allows specific ones. Security-by-default obliges us to prefer the allow-list approach; the alternative is an eternal race between AWS adding services and IAM administrators denying them.

SCPs are evaluated together with all other policies within an account that apply to a specific request. With an allow-list approach, any action that is not allowed within the SCP will be denied no matter what else is specified in other policies. Given a deny-list SCP, an explicit deny is required within the SCP to deny an action.
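For illustration, the two flavors could look like the following sketch, written as Python dictionaries and printed as JSON. The selected services are hypothetical. The deny-list variant assumes it is attached next to an allow-everything SCP such as the "FullAWSAccess" policy that AWS Organizations attaches by default, while the allow-list variant assumes it replaces that policy.

```python
import json

# Deny-list: everything stays allowed by the allow-everything SCP,
# and this policy carves out explicit exceptions.
deny_list_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyExpensiveService",
            "Effect": "Deny",
            "Action": "sagemaker:*",
            "Resource": "*",
        }
    ],
}

# Allow-list: only the listed services can be used at all;
# anything not listed is implicitly denied, including services AWS adds later.
allow_list_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowApprovedServices",
            "Effect": "Allow",
            "Action": ["s3:*", "lambda:*", "dynamodb:*", "logs:*", "iam:*"],
            "Resource": "*",
        }
    ],
}

for name, scp in [("deny-list", deny_list_scp), ("allow-list", allow_list_scp)]:
    print(name)
    print(json.dumps(scp, indent=2))
```

Neither SCP grants permissions by itself; principals in the account still need their own IAM policies to do anything.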

While SCPs look very similar to managed policies, they carry several limitations that the latter don’t. For example, the maximum size of an SCP is 5120 bytes including whitespace, and conditions and resources can only be specified when using a Deny effect. These are things to keep in mind when designing and developing SCPs.
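These limitations are easy to check before an SCP is uploaded. The helper below is a hypothetical pre-flight check of our own, not an AWS API. It serializes the policy without extra whitespace, compares the size against the limit mentioned above, and flags Allow statements that try to use a condition or a specific resource.

```python
import json

MAX_SCP_SIZE = 5120  # size limit mentioned above, whitespace included


def scp_problems(policy):
    """Return a list of human-readable problems found in an SCP document."""
    problems = []

    # Compact separators avoid spending the size budget on whitespace.
    size = len(json.dumps(policy, separators=(",", ":")).encode("utf-8"))
    if size > MAX_SCP_SIZE:
        problems.append(f"policy is {size} bytes, limit is {MAX_SCP_SIZE}")

    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]

    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        if statement.get("Resource") not in ("*", ["*"]):
            problems.append(f"statement {index}: Allow must use Resource '*'")
        if "Condition" in statement:
            problems.append(f"statement {index}: Allow cannot have a Condition")

    return problems


# Hypothetical SCP that wrongly puts a condition on an Allow statement.
example_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": "*",
            "Condition": {"StringEquals": {"aws:RequestedRegion": "eu-west-1"}},
        }
    ],
}

print(scp_problems(example_scp))
```

A check like this fits well in a deployment pipeline, so that an oversized or invalid SCP is caught before AWS rejects it.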

Summary

Security in the cloud is a different kind of challenge. Access is granted or denied using IAM permissions, making their definition important. To mitigate different kinds of risk in the cloud it is recommended to grant only the minimum set of permissions to users and services. However, preventing privilege escalation is difficult when developers are allowed to create bespoke least-privilege permissions.

We looked at three different approaches we can use to tackle this problem. First, we can use pre-defined roles and deny developers the ability to create least-privilege roles. This approach works, but misses the point of why we are doing this in the first place: we want to mitigate as much risk as possible by allowing tailored permissions per service.

Permission boundaries were introduced by AWS to tackle exactly the problem we are trying to solve. While they can be used to that end, they are rather cumbersome in practice. First, the administrator has to pay close attention to the IAM permissions the developer is granted and ensure these play well with the required naming scheme. Furthermore, the developer is only allowed to create tailored roles if he or she makes sure to attach a permissions boundary.

Then we got to service control policies. They make it possible to grant developers the freedom to create tailored permissions while preventing them from accidentally granting unacceptable permissions to a service. More importantly, this is done transparently and without any further effort required from developers. Doing this, however, requires an AWS Organizations setup with accounts.

In practice, it is important to strike a balance between security and developer convenience. Pre-defined roles can potentially give greater guarantees that could be useful for demanding audits, while SCPs are a practical and powerful tool to mitigate risk without sacrificing speed or usability. Permission boundaries feel to us like more cumbersome SCPs, but they do have the advantage of not requiring an AWS Organizations setup.

Ilia Awakimjan

I am a cloud engineer who is passionate about the cloud with a healthy dose of scrutiny. I love to share knowledge and volunteer to write and review exam questions for the AWS Solutions Architect Professional exam.