Picking up on Diogo’s last post on how to obliterate all resources on your AWS Account, I thought it could also be useful to, instead, list all you have running.
Since I’m long overdue on a Go post, I’m going to share a one-file app that uses the AWS SDK for Go to crawl each region for all taggable resources and pretty-print them to stdout, organised by Service type (e.g. EC2, ECS, ELB, etc.) and Product Type (e.g. Instance, NAT, subnet, cluster, etc.).
The AWS SDK lets you retrieve the ARNs of all taggable resources, so that’s all the info I’ll use for our little app.
Note: if you prefer to jump straight to the full code, scroll to the end, but read the running instructions first.
The main goal is to get structured information from the ARNs retrieved, so the first thing is to create a type that serves as a blueprint for what I’m trying to achieve. Because I want to keep it simple, let’s call this type SingleResource.
While we are taking care of the basics, we can also define the TraceableRegions that we want the app to crawl through.
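A minimal sketch of how these two pieces could look. The field names, the ParseARN helper and the three regions listed are assumptions for illustration only; the one thing taken as given is the general ARN layout, arn:partition:service:region:account-id:resource.

```go
package main

import (
	"fmt"
	"strings"
)

// SingleResource holds the parts of one ARN.
// The field names are assumptions made for this sketch.
type SingleResource struct {
	Region      string
	Service     string
	ProductType string
	ID          string
	ARN         string
}

// TraceableRegions lists the regions to crawl.
// This is an illustrative subset, not a complete list.
var TraceableRegions = []string{"us-east-1", "eu-west-1", "eu-central-1"}

// ParseARN (hypothetical helper) splits an ARN of the form
// arn:partition:service:region:account-id:resource into a SingleResource.
func ParseARN(arn string) (*SingleResource, error) {
	parts := strings.SplitN(arn, ":", 6)
	if len(parts) != 6 || parts[0] != "arn" {
		return nil, fmt.Errorf("not a valid ARN: %q", arn)
	}
	res := &SingleResource{Service: parts[2], Region: parts[3], ARN: arn}
	// The resource part is "type/id", "type:id" or just "id".
	if i := strings.IndexAny(parts[5], "/:"); i >= 0 {
		res.ProductType = parts[5][:i]
		res.ID = parts[5][i+1:]
	} else {
		res.ID = parts[5]
	}
	return res, nil
}

func main() {
	fmt.Println("regions to crawl:", TraceableRegions)
	res, err := ParseARN("arn:aws:ec2:eu-west-1:123456789012:instance/i-0123456789abcdef0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s %s %s %s\n", res.Region, res.Service, res.ProductType, res.ID)
}
```

Global services such as S3 leave the region and account fields empty (arn:aws:s3:::my-bucket), so the sketch keeps those as empty strings rather than treating them as errors.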
Finally, to focus the objective, let’s also create a function that accepts a slice of *SingleResource and prints it out as a table to stdout.
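One way such a function could be sketched with only the standard library is text/tabwriter. The names SortResources, RenderTable and PrintTable, the column choice and the SingleResource fields are assumptions for this example, not the code from the full post.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"text/tabwriter"
)

// SingleResource mirrors the type described above; fields are assumed.
type SingleResource struct {
	Region      string
	Service     string
	ProductType string
	ID          string
}

// SortResources returns a copy ordered by service, then product type,
// leaving the caller's slice untouched.
func SortResources(resources []*SingleResource) []*SingleResource {
	sorted := append([]*SingleResource(nil), resources...)
	sort.SliceStable(sorted, func(i, j int) bool {
		a, b := sorted[i], sorted[j]
		if a.Service != b.Service {
			return a.Service < b.Service
		}
		return a.ProductType < b.ProductType
	})
	return sorted
}

// RenderTable formats the resources as an aligned text table.
func RenderTable(resources []*SingleResource) string {
	var sb strings.Builder
	tw := tabwriter.NewWriter(&sb, 0, 8, 2, ' ', 0)
	fmt.Fprintln(tw, "SERVICE\tTYPE\tREGION\tID")
	for _, r := range SortResources(resources) {
		fmt.Fprintf(tw, "%s\t%s\t%s\t%s\n", r.Service, r.ProductType, r.Region, r.ID)
	}
	tw.Flush() // writes to a strings.Builder never fail
	return sb.String()
}

// PrintTable writes the rendered table to stdout.
func PrintTable(resources []*SingleResource) {
	fmt.Print(RenderTable(resources))
}

func main() {
	PrintTable([]*SingleResource{
		{Region: "eu-west-1", Service: "ecs", ProductType: "cluster", ID: "demo"},
		{Region: "eu-west-1", Service: "ec2", ProductType: "subnet", ID: "subnet-0abc"},
		{Region: "us-east-1", Service: "ec2", ProductType: "instance", ID: "i-0abc"},
	})
}
```

Rendering into a string first and printing afterwards keeps the formatting logic easy to test without capturing stdout.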
Continue reading “List all your AWS resources with Go”
A common pattern in several companies using AWS services is having several distinct AWS accounts, partitioned not only by teams, but also by environments, such as develop, staging, production.
This can very easily blow up your budget with unused resources. A classic example occurs when automated pipelines (think of terraform apply, CI/CD procedures, etc.) fail or time out, and all the resources created in the meantime are left behind.
Another frequent example happens in companies recently moving to the cloud. They create accounts for the sole purpose of familiarizing and educating developers on AWS and doing quick and dirty experiments. Understandably, after clicking around and creating multiple resources, it becomes hard to track exactly what was instantiated, and so unused zombie resources are left lingering around.
Continue reading “Keeping AWS costs low with AWS Nuke”
Today’s post is yet another short one, but hopefully it will save some people a whole lot of headaches.
Scenario: You have stored the contents of a string using AWS SSM parameter store (side note: if you are not using it yet, you should definitely have a look), but when retrieving it decrypted via CLI, you notice that the string has new lines (‘\n’) substituted by spaces (‘ ‘).
In my case, I was storing an encrypted private SSH key to integrate with some Ansible scripts triggered via AWS CodePipeline + CodeBuild. CodeBuild makes it really easy to access secrets stored in the SSM store; however, it was retrieving my key incorrectly, which in turn domino-crashed my Ansible scripts.
Here you can also confirm more people are facing this issue. After following the suggestion of using the AWS SDK (in my case with Python boto3), it finally worked. So here is a gist to overwrite an AWS SSM parameter and then retrieve it back:
Hope this helps!
Alright, it’s time for the second post of our sequence focusing on AWS options to set up pipelines in a server-less fashion. The topics that we are covering throughout this series are:
In this post we complement the previous one by providing infrastructure-as-code with Terraform for deployment purposes. We are strong believers in applying a DevOps approach to Data Engineering as well, also known as “DataOps”. Thus we thought it would make perfect sense to share a sample Terraform module along with Python code.
To recap, so far we have Python code that, when triggered by an AWS event on a new S3 object, connects to Redshift and issues a SQL COPY statement to load that data into a given table. Next we are going to show how to configure this with Terraform code.
As usual, all the code for this post is available publicly in this github repository. In case you haven’t yet, you will need to install terraform in order to follow along with this post.
Continue reading “AWS Server-less data pipelines with Terraform to Redshift – Part 2”
This article is intended to be a quick and dirty snippet for anyone going through the struggle of getting an ECS service, which might have one or more containers running the same app (as part of an Auto Scaling Group), to work with a Network Load Balancer (instead of the more common ELB or ALB).
ECS Service/Task Definition
Another particularity of this implementation is that I also decided to set the ECS task’s network mode to awsvpc. In case you are not acquainted with this new option, this means that:
- Each task gets its own network interface and its own IP address, shared by the containers in that task;
- The Host port and the Container port need to be the same, since there is no middleware managing the port mapping between the two entities.
The cherry on top is that the ECS Service now has the option of automatically registering and deregistering LB targets by their IP address, which fits the use case described perfectly.
Network Load Balancer
This post isn’t about describing the technical details of what a Network Load Balancer is, but about the caveats of using it in this scenario: because NLB is a layer 4 load balancer, at the time of writing you can’t define Security Groups at the NLB level. Instead, you’ll have to make sure your tasks/containers are secure by attaching the security groups to them. Remember that with the awsvpc network mode, each task gets its own NIC.
As for the actual code snippet to support what I’m trying to achieve:
Continue reading “Get AppScaled ECS Tasks served by AWS Network Load Balancer”
IoT isn’t a new term and has actually been one of the buzzwords of the 21st century, right alongside cryptocurrency. The two go hand in hand for better and for worse, from interesting crypto projects focused on IoT usage to actual IoT devices being hacked to mine cryptocurrencies. But this article is actually about IoT: it seems the last few years have given us the perfect alignment of three main ingredients for its popularity: the exponential increase of network-capable devices, the easily available processing power and, finally, the hungry-for-data dynamic we see from most parties nowadays.
For someone working as a Data Engineer (or related), there is hardly a more end-to-end project than one which goes from setting up an actual not-so-intelligent device, to designing the methods to connect these devices to a network, to designing all the streaming/batch processing infrastructure needed to deal with all the data. It really is a huge challenge.
In the following lines, I’ll focus on how AWS IoT helps with the first and second challenges. I’ll show an example of how to emulate an actual IoT device with the help of Docker, and walk through some of the nice and easy features AWS IoT offers, such as IoT Core, management, topics and rules, as well as integration with other AWS services such as Kinesis, Firehose or Lambda.
Continue reading “Getting started with AWS IoT and a dockerized device”
This post is the first of a sequence of posts focusing on AWS options to set up pipelines in a serverless fashion. The topics that we will cover throughout the whole series are:
In this post we lean towards another strategy to set up data pipelines, namely event-triggered. That is, rather than being scheduled to execute with a given frequency, our traditional pipeline code is executed immediately when triggered by a given event. Our example consists of a demo scenario for immediately and automatically loading data that is stored in S3 into Redshift.
Continue reading “AWS Server-less data pipelines with Terraform to Redshift – Part 1”