Keeping AWS costs low with AWS Nuke

A common pattern in companies using AWS services is having several distinct AWS accounts, partitioned not only by team, but also by environment, such as development, staging, and production.

This can very easily explode your budget with unutilized resources. A classic example occurs when automated pipelines (think of terraform apply, CI/CD procedures, and so on) fail or time out, and all the resources created in the meantime are left behind.

Another frequent example happens in companies that recently moved to the cloud. They create accounts for the sole purpose of familiarizing developers with AWS and doing quick and dirty experiments. Understandably, after clicking around and creating multiple resources, it becomes hard to track exactly what was instantiated, and so unused zombie resources are left lingering around.


AWS Nuke to the rescue

There are several tools to assist you with cleaning up AWS environments, such as aws-nuke from rebuy-de and cloud-nuke from Gruntwork. According to the documentation, aws-nuke supports destroying many more types of AWS resources than cloud-nuke does, which matched my intended use case. However, cloud-nuke is being developed with a broader scope, with support for Azure and GCP in mind. At the time this post was released, that still remained a declaration of intentions, though.

AWS Nuke is quite easy and intuitive to work with. To install it, download the pre-compiled binary for your platform from the releases page of the repository and put it somewhere on your PATH.

aws-nuke supports AWS CLI credentials as well as profiles, something typical when managing several AWS accounts. Here is how you can run a first scan of the resources to be deleted:

aws-nuke --config config.yaml --profile <profile-name>


When you run it for the first time you might stumble upon the following message:

aws-nuke version v2.7.0 – Fri Nov 23 10:28:30 UTC 2018 – 0b0806d56f85a329de6d1eedbf8559d46988a7f4

Error: The specified account doesn’t have an alias. For safety reasons you need to specify an account alias. Your production account should contain the term ‘prod’.

As explained here, the root cause is most likely that you do not have an AWS account alias configured.

If everything went as it should, you will be prompted with:

Do you really want to nuke the account with the ID 12345678910 and the alias ‘<your-aws-account-specific-profile>’?
Do you want to continue? Enter account alias to continue.


At this point, two things are important to mention. The first is that, depending on how many resources should be removed, this might take a while. The second is that no resource will actually be deleted at this point. You should get a message similar to the following:

Scan complete: 436860 total, 436860 nukeable, 0 filtered. The above resources would be deleted with the supplied configuration. Provide --no-dry-run to actually destroy resources.

Thus, to conduct the irreversible destruction:

aws-nuke --config config.yaml --profile <profile-name> --no-dry-run

Here is an example of a config.yaml template that removes all resources except a given IAM user:
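The original template is not embedded here, but a minimal sketch could look like the following. The region, account IDs, and user name are placeholders; the filter entries follow the aws-nuke config format (note that older releases call the blocklist key account-blacklist):

```yaml
regions:
  - eu-west-1
  - global

account-blocklist:
  - "999999999999"   # never touch this (e.g. production) account

accounts:
  "12345678910":
    filters:
      IAMUser:
        - "my-admin-user"
      IAMUserPolicyAttachment:
        - "my-admin-user -> AdministratorAccess"
      IAMUserAccessKey:
        - "my-admin-user -> AKIA0123456789ABCDEF"
```

Everything not matched by a filter in the target account is considered nukeable.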

Alternatively, you can also specify only some target resources to be removed. Here is an example:
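A sketch of such a restriction, added to the same config.yaml, could look like this (the resource type names below are illustrative; you can list the supported ones with aws-nuke resource-types):

```yaml
resource-types:
  targets:
    - S3Object
    - S3Bucket
    - IAMRole
```

With a targets list in place, only those resource types are scanned and destroyed.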

That’s all for today. For more information about aws-nuke, have a look at the repository.



Integrating IAM user/roles with EKS

To be completely honest, this article spawned out of some troubleshooting frustration, so hopefully it will save others some headaches.

The scenario: after having configured an EKS cluster, I wanted to grant permissions to more IAM users. After creating a new IAM user which belonged to the intended IAM groups, the following errors were thrown in the CLI:

kubectl get svc
error: the server doesn't have a resource type "svc"
kubectl get nodes
error: You must be logged in to the server (Unauthorized)


AWS profile config

First configure your local AWS profile. This is also useful if you want to test for different users and roles.


# aws configure --profile <profile-name>
# for example:
aws configure --profile dev

If this is your first time, this will generate two files,

~/.aws/config and ~/.aws/credentials
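For reference, the generated files look roughly like the following sketch (the key values and region are obviously placeholders):

```ini
# ~/.aws/credentials
[dev]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

# ~/.aws/config
[profile dev]
region = eu-west-1
output = json
```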

If the files already exist, the command simply appends to them, which means you can obviously also edit the files manually if you prefer. The way you can alternate between these profiles in the CLI is:

# export AWS_PROFILE=<profile-name>
# for example:
export AWS_PROFILE=dev

Now, before you move on to the next section, validate that you are referencing the correct user or role in your local AWS configuration:

# aws --profile <profile-name> sts get-caller-identity
# for example:
aws --profile dev sts get-caller-identity

{
    "Account": "REDACTED",
    "UserId": "REDACTED",
    "Arn": "arn:aws:iam:::user/john.doe"
}


Validate AWS permissions

Validate that your user has the correct permissions, namely that it is allowed to describe the cluster:

# aws eks describe-cluster --name=<cluster-name>
# for example:
aws eks describe-cluster --name=eks-dev

Add IAM users/roles to cluster config

If you managed to add worker nodes to your EKS cluster, then this procedure should be familiar already. The AWS documentation describes how to grant access to IAM users and roles via the aws-auth ConfigMap, which you apply with:

kubectl apply -f aws-auth-cm.yaml
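For reference, a sketch of what aws-auth-cm.yaml could look like when mapping an IAM user (the account ID, ARNs, and group bindings below are placeholders; binding to system:masters grants full cluster admin):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::111122223333:role/eks-worker-node-role
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
  mapUsers: |
    - userarn: arn:aws:iam::111122223333:user/john.doe
      username: john.doe
      groups:
        - system:masters
```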

While troubleshooting, I saw some people trying to use the cluster's role in the "-r" part. However, you cannot assume a role used by the cluster, as this is a role reserved for (and trusted by) instances. You need to create your own role, add the root account as a trusted entity, and add permission for the user/group to assume it, for example as follows:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam:::user/john.doe"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}


Kubernetes local config

Then, generate a new kube configuration file. Note that the following command will create a new file in ~/.kube/config:

aws --profile=dev eks update-kubeconfig --name eks-dev

AWS suggests isolating your configuration in a file named "config-<cluster-name>". So, assuming our cluster name is "eks-dev", then:

export KUBECONFIG=~/.kube/config-eks-dev
aws --profile=dev eks update-kubeconfig --name eks-dev


This will then create the config file in ~/.kube/config-eks-dev rather than in ~/.kube/config.

As described in the AWS documentation, your kube configuration should look similar to the following:
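The original snippet is not embedded here, but a sketch of the generated file could look as follows. The endpoint, certificate data, and cluster ARN are placeholders; the interesting part is the exec section, which shells out to aws-iam-authenticator and can carry the AWS profile as an environment variable:

```yaml
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <base64-encoded-ca-cert>
    server: <cluster-endpoint-url>
  name: <cluster-arn>
contexts:
- context:
    cluster: <cluster-arn>
    user: <cluster-arn>
  name: <cluster-arn>
current-context: <cluster-arn>
kind: Config
preferences: {}
users:
- name: <cluster-arn>
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: aws-iam-authenticator
      args:
        - token
        - -i
        - eks-dev
      env:
        - name: AWS_PROFILE
          value: dev
```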

If you want to make sure you are using the correct configuration:

export KUBECONFIG=~/.kube/config-eks-dev
kubectl config current-context

This will print whatever alias you gave in the config file.

Last but not least, update the new config file to reference the AWS profile to be used.

The last step is to confirm you have permissions:

export KUBECONFIG=~/.kube/config-eks-dev
kubectl auth can-i get pods
# Ideally you get "yes" as the answer.
kubectl get svc



To make sure you are not working in an environment with hidden environment variables that you are not aware of and that may conflict, unset them as follows:
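The snippet was not embedded here, but it presumably looked something like the following (the exact list of variables is an assumption; these are the usual suspects for the AWS CLI and kubectl):

```shell
# Clear AWS credential/profile overrides
unset AWS_PROFILE
unset AWS_DEFAULT_PROFILE
unset AWS_ACCESS_KEY_ID
unset AWS_SECRET_ACCESS_KEY
unset AWS_SESSION_TOKEN
# Clear any kubectl config override
unset KUBECONFIG
```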


Also, if you are getting an error like the following:

could not get token: AccessDenied: User arn:aws:iam:::user/john.doe is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam:::user/john.doe

then it means you are specifying the -r flag in your ~/.kube/config file. This flag should only be used with roles.

Hopefully this short article was enough to unblock you, but in case not, here is a collection of further potentially useful articles:

Getting Started with Spark (part 4) – Unit Testing

Alright, quite a while ago (already counting years), I published a tutorial series focused on helping people get started with Spark. Here is an outline of the previous posts:

In the meanwhile, Spark has not decreased in popularity, so I thought I would continue updating the same series. In this post we cover an essential part of any ETL project, namely unit testing.

For that I created a sample repository, which is meant to serve as boilerplate code for any new Python Spark project.

Continue reading “Getting Started with Spark (part 4) – Unit Testing”

Decrypting correctly parameters from AWS SSM

Today's is yet another short one, but ideally it will already save a whole lot of headaches for some people.

Scenario: you have stored the contents of a string using the AWS SSM parameter store (side note: if you are not using it yet, you should definitely have a look), but when retrieving it decrypted via the CLI, you notice that the string has new lines ('\n') substituted by spaces (' ').

In my case, I was storing an encrypted private SSH key to integrate with some Ansible scripts triggered via AWS CodePipeline + CodeBuild. CodeBuild makes it really easy to access secrets stored in the SSM store; however, it was retrieving my key incorrectly, which in turn domino-crashed my Ansible scripts.

Here you can also confirm that more people are facing this issue. After following the suggestion of using the AWS SDK (in my case with Python's boto3), it finally worked. So here is a gist to overwrite an AWS SSM parameter and then retrieve it back:

Hope this helps!

Container orchestration in AWS: comparing ECS, Fargate and EKS

Before rushing into the new cool kid, namely AWS EKS, AWS's hosted offering of Kubernetes, you might want to understand how it works underneath and how it compares to the already existing offerings. In this post we focus on distinguishing between the different AWS container orchestration solutions out there, namely AWS ECS, Fargate, and EKS, as well as comparing their pros and cons.


Before we dive into comparisons, let us summarize what each product is about.

ECS was the first container orchestration offering by AWS. It essentially consists of EC2 instances which have Docker already installed, and which run a Docker agent container that talks with the AWS backend. Via the ECS service you can launch either tasks (unmonitored containers, usually suited for short-lived operations) or services (containers which AWS monitors and guarantees to restart if they go down for any reason). Compared to Kubernetes, it is quite a bit simpler, which has both advantages and disadvantages.

Fargate was the second service offering to come, and it is intended to abstract everything below the container (the EC2 instances where containers run) away from you. In other words, a pure Container-as-a-Service, where you do not care where that container runs. Fargate followed two core technical advancements made in ECS: the possibility to assign an ENI directly and dedicatedly to a container, and the integration of IAM on a container level. We will get into more detail on this later.

The following image, sourced from the AWS blog here, illustrates the difference between the ECS and Fargate services.


EKS is the latest offering, and it is still only available in some regions. With EKS you can abstract away some of the complexities of launching a Kubernetes cluster, since AWS will now manage the master nodes (the control plane). Kubernetes is a much richer container orchestrator, providing features such as network overlays, allowing you to isolate container communication, and storage provisioning. Needless to say, it is also much more complex to manage, requiring a bigger investment in terms of DevOps effort.

As with plain Kubernetes, you can use kubectl to communicate with an EKS cluster; you will, however, need to configure the AWS IAM authenticator locally to do so.

Continue reading “Container orchestration in AWS: comparing ECS, Fargate and EKS”

Versioning in data projects

Reproducibility is a pillar of science, and version control via git has been a blessing to it. For pure software engineering it works perfectly. However, machine learning projects are not only about code, but also about data. The same model trained with two distinct data sets can produce completely different results.

So it comes as no surprise when I stumble upon CSV files in the git repos of data teams, as they struggle to keep track of code and data together. However, this cannot be done for today's enormous datasets. I have seen several hacks to solve this problem, none of them bulletproof. This post is not about those hacks, but rather about an open source solution for the problem: DVC.


Let us exemplify by using a Kaggle challenge: predicting house prices with Advanced Regression Techniques.

Continue reading “Versioning in data projects”

AWS Server-less data pipelines with Terraform to Redshift – Part 2

Alright, it’s time for the second post of our series focusing on AWS options to set up pipelines in a server-less fashion. The topics that we are covering throughout this series are:

In this post we complement the previous one by providing infrastructure-as-code with Terraform for deployment purposes. We are strong believers in a DevOps approach to Data Engineering as well, also known as “DataOps”. Thus we thought it would make perfect sense to share a sample Terraform module along with the Python code.

To recap: so far we have Python code that, when triggered by an AWS event on a new S3 object, will connect to Redshift and issue a SQL COPY statement to load that data into a given table. Next we are going to show how to configure this with Terraform code.

As usual, all the code for this post is publicly available in this GitHub repository. In case you haven’t yet, you will need to install Terraform in order to follow along with this post.

Continue reading “AWS Server-less data pipelines with Terraform to Redshift – Part 2”