Blog Layout Test – Erik

The unexpected risks of using wildcards in IAM policies


At Profit4Cloud, we’re big fans of using AWS’ managed services to simplify the maintenance of the solutions we build.

A classic example of one of those services is SQS. You see it used in a lot of architectures, and with good reason. Last year, AWS released a new feature involving SQS and AWS Lambda: triggering Lambda functions from an SQS queue in another account.

But with this improvement came a change that might not have been intended. Many people set up the permissions on their SQS queues with a wildcard instead of an explicit account id, which suddenly meant that those queues were open to anyone.

The way to prevent issues like this in the future is simple, but important to take to heart: never use wildcards in your IAM policies.

Setting up cross-account SQS consumption

So, cross-account SQS queues. What is that all about?

We see most companies using multiple AWS accounts to isolate different workloads. This works great, because it gives us the highest level of isolation achievable at AWS: by default, resources in different AWS accounts can never access each other.

Imagine a hypothetical application that receives data from multiple sources and compiles a daily report about all the data sent. We have several AWS accounts that produce this data, and we want to process it in one central account.

We create an SQS queue in each producer account, which the producers can fill at their leisure. In the central account, we create a Lambda function that processes all the data, and register that function as a consumer of the SQS queue. All we need to do is allow the execution role of the Lambda function to access our SQS queue! We set all of this up using CloudFormation, push the code and deploy it. It’s all working perfectly!

Some time later, we notice messages are missing. Data we are putting on the queue just… disappears? We verify it’s being added correctly and no errors are being thrown, but the messages never arrive in the central account.

After checking out the subscribers to the queue, we find the culprit: a Lambda function in an account that’s not ours is subscribed to the queue! How did this happen?

How we deployed this

First off, let’s see how we deployed this. This section contains implementation details, and you can freely skip this if you’re just interested in the particular issue, and how you should prevent issues like this in the future.

We set up the two accounts we need in the same AWS Organization using AWS Control Tower. While we developed these stacks using CDK as our Infrastructure as Code solution, we will post the code samples as pure CloudFormation as this is easier to carry over to any other Infrastructure as Code language.

Let’s create a Lambda function that can receive messages from an SQS queue. This application will log every message it receives:

# sqs_handler.py
def lambda_handler(event, context):
    for record in event['Records']:
        payload = record["body"]
        print(str(payload))
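
To try the handler without deploying anything, it can be called locally with a hand-written event. The sketch below is a variant of the handler above that also returns the payloads, so they are easy to inspect. It assumes a minimal event shape with only the Records and body keys the handler reads; a real SQS event carries more fields per record. The sample_event helper is hypothetical and only exists for this illustration.

```python
import json


def lambda_handler(event, context):
    # Same logic as sqs_handler.py, but the payloads are also returned for inspection.
    payloads = []
    for record in event["Records"]:
        payload = record["body"]
        print(str(payload))
        payloads.append(payload)
    return payloads


def sample_event(*bodies):
    # Hypothetical helper: builds only the subset of an SQS event the handler reads.
    return {"Records": [{"body": body} for body in bodies]}


if __name__ == "__main__":
    # The message content is made up for this example.
    lambda_handler(sample_event(json.dumps({"source": "producer-a", "rows": 3})), None)
```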

We deploy it through CloudFormation:

Description: "The stack containing an sqs-consuming lambda, deployed to the lambdas-account."

Parameters:
  QueuesAccountId:
    # A String keeps any leading zeros of the 12-digit account ID intact.
    Type: String
    Description: The ID of the account the queues are deployed in.
    
Resources:
  SqsHandlerLambda:
    Type: "AWS::Lambda::Function"
    Properties:
      Code:
        S3Bucket: "..."
        S3Key: "..."
      Role: !GetAtt ExecutionRole.Arn
      # <file name>.<function name>, matching lambda_handler in sqs_handler.py
      Handler: sqs_handler.lambda_handler
      Runtime: python3.9

  ExecutionRole:
    Type: "AWS::IAM::Role"
    Properties:
      # Must match the role name in the Principal ARN of the queue policy.
      RoleName: "CrossAccountSQSLambdaRole"
      ManagedPolicyArns:
        - !Sub "arn:${AWS::Partition}:iam::aws:policy/service-role/AWSLambdaSQSQueueExecutionRole"
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Action: "sts:AssumeRole"
            Effect: "Allow"
            Principal:
              Service: "lambda.amazonaws.com"
              
  SqsEventMapping:
    Type: "AWS::Lambda::EventSourceMapping"
    Properties:
      FunctionName: !Ref SqsHandlerLambda
      EventSourceArn: !Sub "arn:aws:sqs:eu-west-1:${QueuesAccountId}:DemonstrationSqsQueue"

As you can see in the last few lines, we add an EventSourceMapping to an SQS queue in another account whose ID we can pass through the CloudFormation parameters.
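
The !Sub expression above simply assembles the queue ARN from its parts. As a rough illustration of the same construction outside CloudFormation, here is a minimal Python sketch. The function name sqs_queue_arn is hypothetical, the defaults mirror the hard-coded values in the template, and the 12-digit check is an assumption about what a well-formed account ID looks like.

```python
import re


def sqs_queue_arn(account_id: str, queue_name: str,
                  region: str = "eu-west-1", partition: str = "aws") -> str:
    # Treat the account ID as text so that leading zeros survive.
    if not re.fullmatch(r"[0-9]{12}", account_id):
        raise ValueError(f"not a 12-digit account ID: {account_id!r}")
    return f"arn:{partition}:sqs:{region}:{account_id}:{queue_name}"


if __name__ == "__main__":
    # 012345678901 is a made-up account ID.
    print(sqs_queue_arn("012345678901", "DemonstrationSqsQueue"))
```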

Since this is a demonstration of accessing a queue cross-account, we will deploy that queue through a separate CloudFormation stack, so we can deploy it to another account. We will also include an IAM policy stating that the role named CrossAccountSQSLambdaRole is allowed to read from this queue. Normally, we would refer to the IAM role through the !Ref function, but that won’t work between accounts!

Description: "The stack containing an SQS queue, deployed to the queue-account."

Resources:
  DemonstrationSqsQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: DemonstrationSqsQueue
    
  SqsQueuePolicy:
    Type: "AWS::SQS::QueuePolicy"
    Properties:
      Queues:
        - !Ref DemonstrationSqsQueue
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            # An IAM principal ARN goes under the AWS key of the Principal element.
            Principal:
              AWS: "arn:aws:iam::ACCOUNT_ID_PLACEHOLDER:role/CrossAccountSQSLambdaRole"
            Action: "sqs:*"
            Resource: !GetAtt DemonstrationSqsQueue.Arn

The culprit

You might have noticed the placeholder ACCOUNT_ID_PLACEHOLDER in this piece of code in the previous section:

  SqsQueuePolicy:
    Type: "AWS::SQS::QueuePolicy"
    Properties:
      ...
        Statement:
          - Effect: "Allow"
            Principal:
              AWS: "arn:aws:iam::ACCOUNT_ID_PLACEHOLDER:role/CrossAccountSQSLambdaRole"
            Action: "sqs:*"
            Resource: !GetAtt DemonstrationSqsQueue.Arn

So, what are we looking at?

A QueuePolicy describes who can interact with an SQS queue, and what they can do with it. It describes this using an IAM policy. These policies are used everywhere in AWS to define who can interact with services and resources. In this case, we’re dealing with four of the most common fields: the Effect (either Allow or Deny), a Principal (the user, resource or service we’re either granting or denying access), an Action (what they can or cannot do) and a Resource (what AWS resource they can or cannot perform the action on).
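
To make the four fields concrete, here is a minimal Python sketch of a single policy statement as a plain dictionary, together with a small check that all four fields are present. The account IDs are placeholders, and COMMON_FIELDS and missing_fields are hypothetical names used only for this illustration; this is not an AWS validation routine.

```python
import json

# The four fields discussed above. Not every kind of policy uses all of them.
COMMON_FIELDS = ("Effect", "Principal", "Action", "Resource")


def missing_fields(statement: dict) -> list:
    # Return the common fields that this statement does not define.
    return [field for field in COMMON_FIELDS if field not in statement]


statement = {
    "Effect": "Allow",
    # Placeholder account IDs, for illustration only.
    "Principal": {"AWS": "arn:aws:iam::111122223333:role/CrossAccountSQSLambdaRole"},
    "Action": "sqs:ReceiveMessage",
    "Resource": "arn:aws:sqs:eu-west-1:444455556666:DemonstrationSqsQueue",
}

if __name__ == "__main__":
    print(json.dumps({"Version": "2012-10-17", "Statement": [statement]}, indent=2))
    print(missing_fields(statement))
```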

The placeholder ACCOUNT_ID_PLACEHOLDER is in an ARN, AWS’ unique resource identifier. An ARN consists of several components that allow you to pinpoint exactly what you’re looking for, and one of these parts is the account id where the resource can be found.
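
As a sketch of how those components fit together, the snippet below splits an ARN into its parts, assuming the general layout arn:partition:service:region:account-id:resource. The Arn type and parse_arn function are hypothetical helpers, and the account ID is a placeholder.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    # The resource part may itself contain colons, so split at most 5 times.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return Arn(*parts[1:])


if __name__ == "__main__":
    # IAM ARNs leave the region component empty.
    print(parse_arn("arn:aws:iam::111122223333:role/CrossAccountSQSLambdaRole"))
```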

Normally, we could automatically fill this with our own account ID (!Sub "...${AWS::AccountId}..."), or enter another account ID explicitly. But in practice, many people take the easier route and put a wildcard (*) there, so they won’t have to hardcode other account IDs or even look them up.

Formerly, this could have been fine. It was impossible to access this queue from another account, so you weren’t actually opening yourself up to anything. For many, putting wildcards in ARNs is simply a way to work faster.

But with the addition of cross-account SQS queues, you’ve retroactively opened up the data on your queue for anyone in the world – as long as they know your account id and the name of your role. Both of those can end up in your repository along with your CloudFormation, and those might be (or become) public.
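
To make the risk concrete, here is a toy model of what a wildcard in the account ID position means. It uses glob matching from Python’s fnmatch module as a stand-in; this is an assumption for illustration and not how AWS actually evaluates policies. The account IDs and the matching_principals function are made up.

```python
from fnmatch import fnmatchcase


def matching_principals(pattern: str, callers: list) -> list:
    # Toy stand-in for policy evaluation: plain glob matching on the ARN string.
    return [arn for arn in callers if fnmatchcase(arn, pattern)]


callers = [
    "arn:aws:iam::111122223333:role/CrossAccountSQSLambdaRole",  # our central account
    "arn:aws:iam::999999999999:role/CrossAccountSQSLambdaRole",  # someone else's account
]

wildcard = "arn:aws:iam::*:role/CrossAccountSQSLambdaRole"
explicit = "arn:aws:iam::111122223333:role/CrossAccountSQSLambdaRole"

if __name__ == "__main__":
    print(matching_principals(wildcard, callers))  # both roles match the pattern
    print(matching_principals(explicit, callers))  # only our own role matches
```

In this model, anyone who creates a role with the right name in their own account matches the wildcard pattern, while the explicit ARN only ever matches one role in one account.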

In this case, it’s possible the account ID and role name leaked because we checked them in to a (public) repository. By treating values like account IDs and role names as secrets, you make it harder to exploit mistakes like these, but that is not enough: you would only be practicing security through obscurity.

A better way of working

There are several important lessons to learn from this.

First of all, AWS is an evolving platform. Using wildcards, any wildcards, especially in your IAM policies, can have unintended consequences in the future. As more features and permissions are being added, you retroactively open up your account to more and more actors.

Secondly, with AWS, we’re constantly working with services that are exposed to the internet. Gone are the days when we could assume our firewalls kept almost all malicious actors out, and not care about security within the boundaries of our systems. We have talked about ways to prevent accidentally opening up resources in your AWS account in an earlier blog post as well.

By being explicit in your IAM policies, you can almost guarantee only those you intended can access your data. Sure, it will take a bit longer to set up your permissions, but it’s essential to protect your data. If you need to manually construct an ARN, either use pseudo-parameters such as AWS::AccountId in your CloudFormation templates, or pass them in through parameters.
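
One way to keep yourself honest about this is a small check in your build pipeline. The sketch below scans a policy document, given as a plain Python dictionary, and reports every Principal, Action or Resource value that contains a wildcard. The function names are hypothetical, and it assumes the policy has already been loaded into a dictionary; CloudFormation templates that use intrinsic functions such as !GetAtt would need to be resolved or parsed with a suitable loader first.

```python
def _flatten(value):
    # Yield all strings from a string, a list, or a {"AWS": ...}-style mapping.
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for nested in value.values():
            yield from _flatten(nested)
    else:
        for nested in value:
            yield from _flatten(nested)


def find_wildcards(policy: dict) -> list:
    # Collect (statement index, field, value) for every value containing "*".
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given without a list
        statements = [statements]
    for index, statement in enumerate(statements):
        for field in ("Principal", "Action", "Resource"):
            for value in _flatten(statement.get(field, [])):
                if "*" in value:
                    findings.append((index, field, value))
    return findings


# Example input with a placeholder account ID in the Resource.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::*:role/CrossAccountSQSLambdaRole"},
        "Action": "sqs:*",
        "Resource": "arn:aws:sqs:eu-west-1:444455556666:DemonstrationSqsQueue",
    }],
}

if __name__ == "__main__":
    for finding in find_wildcards(policy):
        print(finding)
```

A check like this is deliberately strict: it flags wildcards in actions as well as in principals, which fits the advice above to avoid wildcards in general.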

When we work with AWS, we work with tools that help us set up complicated environments securely in a matter of minutes, as long as we keep the basic rules in mind. At Profit4Cloud, we write posts like this to make you aware of potential issues, so you can continue doing what you do best – creating cool things for your customers!
