Infrastructure as Code: Everything You Always Wanted to Know but Were Afraid to Ask
When (not) to consider automated infrastructure
As our colleague Lukáš Klášterský likes to say, 'if you're new to the cloud, don't think big': get to know it thoroughly first. On the other hand, if the cloud is already a standard for you, you run several larger applications there, and everything "runs as it should", then maybe it's time to think about Infrastructure as Code (IaC).
So, what is the main goal of IaC? As the name suggests, the idea is to have the infrastructure defined in a similar way to the source code of the application. The result is an IaC blueprint (or artifact) that describes "what this part of the infrastructure should look like." To put it simply, we can say that each part of the infrastructure has its own unique blueprint that defines: "This is what my database should look like," or: "This is what my Redis Cluster looks like."
Of course, you can argue that every component of every one of your applications is different and individual. In that case, IaC is probably not for you. The greatest synergy across the entire environment comes when you have clearly defined architectural standards (and actually adhere to them), which you then implement using IaC.
However, if you don't have a standard or it is difficult to define one, there may be specific use-cases where IaC will make sense.
I'm getting ahead of myself, but this simplified example shows what an IaC blueprint for a database might look like (in an AWS environment, defined with AWS CloudFormation):
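A minimal sketch of such a blueprint, written as a Python dictionary and serialized to CloudFormation's JSON format. The logical name MyDb, the instance class, the user name, and the parameter name are assumptions chosen for illustration, not values from a real environment.

```python
import json

# Illustrative CloudFormation template for one MySQL database on Amazon RDS.
# Logical names, sizes, and the parameter name are assumptions for this sketch.
database_blueprint = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Standard database blueprint (illustrative)",
    "Parameters": {
        # NoEcho keeps the password out of console and API output.
        "MasterPassword": {"Type": "String", "NoEcho": True},
    },
    "Resources": {
        "MyDb": {
            "Type": "AWS::RDS::DBInstance",
            "Properties": {
                "DBInstanceIdentifier": "mydb",
                "Engine": "mysql",
                "DBInstanceClass": "db.t3.medium",  # 2 vCPU, 4 GiB RAM
                "AllocatedStorage": "20",           # size in GiB
                "StorageEncrypted": True,
                "BackupRetentionPeriod": 14,        # days of automated backups
                "MasterUsername": "admin",
                "MasterUserPassword": {"Ref": "MasterPassword"},
            },
        }
    },
}

print(json.dumps(database_blueprint, indent=2))
```

The printed JSON is the kind of document CloudFormation accepts as a template; the same content can also be written directly in YAML.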
We won't go into detail now about what each part means; I just wanted to show you "what it really looks like" right at the beginning.
Why automate infrastructure: use cases
Well, we already know what IaC is, and we have a general idea of what it could be good for. Let's take a look at the specific uses of IaC.
For those of you who prefer brevity to detailed examples, the summary is roughly as follows:
When to consider Infrastructure as Code
- I have clearly defined infrastructure standards.
- I need to dynamically create individual application environments.
- There are frequent changes in the environment that I want to centrally manage.
When Infrastructure as Code is not for me
- I'm just getting acquainted with the cloud and have only basic experience with running a cloud environment.
- My infrastructure is primarily "static", without many changes.
- The application infrastructure is highly individual and it is not possible to define a general binding standard.
In general, we can say that Infrastructure as Code is an integral part of cloud environment automation, and automation itself is one of the main motivations for using the cloud.
Use case 1: Dynamic environment
This scenario typically applies to development or test environments. Why should I run a development environment permanently (and pay for it) when the developers need it "only when they are deploying and testing something"? Or maybe I have multiple development teams working on the same app. To avoid "collisions" between them, I want to be able to prepare an identical environment for each team (and then delete it again when it's no longer needed).
If you define such an environment using IaC blueprints, you can dynamically deploy it in its entirety and delete it again when you no longer need it. And all of this is completely automated within your CI/CD pipeline.
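The idea can be sketched with an in-memory stand-in for the cloud. The class name, the stack naming scheme, and the team names are hypothetical; a real pipeline would call the provider's deployment API where this sketch only updates a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class EnvironmentManager:
    """Tracks short-lived environments created from one shared blueprint."""
    blueprint_name: str
    active: dict = field(default_factory=dict)

    def deploy(self, team: str) -> str:
        # Every team gets its own stack, built from the same blueprint.
        stack_name = f"{self.blueprint_name}-{team}"
        self.active[team] = stack_name
        return stack_name

    def destroy(self, team: str) -> None:
        # Removing the stack removes everything the blueprint created.
        self.active.pop(team, None)


manager = EnvironmentManager("webshop-dev")
for team in ("team-a", "team-b"):
    print("deployed", manager.deploy(team))

manager.destroy("team-a")
print("still running:", sorted(manager.active.values()))
```

Because each environment is derived from the same blueprint, creating one for a third team is a single additional call rather than a manual build.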
Use case 2: Infrastructure as an integral part of the application
This scenario supports one of the important aspects of cloud-native application development, which is parity between environments. Ideally, I want my test environment to be the same as the production environment (maybe smaller in terms of performance), but architecturally identical.
This approach makes it much easier to find errors and to troubleshoot the environment as a whole. The description of the infrastructure is therefore an integral part of the delivery of the application, and I can immediately deploy such an application to the target environment.
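A small sketch of environment parity, assuming a hypothetical three-component application: the architecture is described once, and only the sizing differs between test and production. All component names and sizes are made up for illustration.

```python
import copy

# One architecture, described once (hypothetical component names).
ARCHITECTURE = {
    "database": {"engine": "mysql", "encrypted": True},
    "cache": {"engine": "redis"},
    "app": {"runtime": "containers"},
}

# Only sizing differs between environments (illustrative values).
SIZING = {
    "test": {"database": "small", "cache": "small", "app": "small"},
    "prod": {"database": "large", "cache": "medium", "app": "large"},
}


def render(environment: str) -> dict:
    """Return the shared architecture with environment-specific sizes applied."""
    rendered = copy.deepcopy(ARCHITECTURE)
    for component, size in SIZING[environment].items():
        rendered[component]["size"] = size
    return rendered


def shape(env: dict) -> dict:
    """Strip sizes so that only the architecture remains."""
    return {
        name: {key: value for key, value in cfg.items() if key != "size"}
        for name, cfg in env.items()
    }


test_env, prod_env = render("test"), render("prod")
print(shape(test_env) == shape(prod_env))  # True: same architecture
print(test_env["database"]["size"], prod_env["database"]["size"])
```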
Use case 3: Automated infrastructure lifecycle
Over time, standards are changed, expanded, or otherwise modified. Thanks to IaC, I can easily roll such changes out to the entire infrastructure and thus manage its whole lifecycle: from creation, through all its changes, to its final decommissioning.
For example, at the beginning, I defined that I wanted to have 14 daily backups available for all databases and that each database could be accessed by administrators from a specific network range. Over time, it was decided that it was necessary to maintain 30 backups from a business point of view, and at the same time, the organization was expanded to include another branch, whose employees also need access to all databases.
I can modify my original blueprint for the database in a simple way, incorporate these requirements into it, and then automatically modify all databases created based on this blueprint. In the same way, for example, I can remove all components created with this blueprint at any time.
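The backup and network example above can be sketched as a blueprint with two versions that is rolled out to every database created from it. The database names and network ranges are placeholders, and the in-memory dictionary stands in for real deployed stacks.

```python
# Version 1 of the database blueprint (the CIDR ranges are made-up placeholders).
blueprint_v1 = {"backup_retention_days": 14, "admin_networks": ["10.0.0.0/24"]}

# Version 2: 30 backups, plus access for the new branch.
blueprint_v2 = {
    "backup_retention_days": 30,
    "admin_networks": ["10.0.0.0/24", "10.1.0.0/24"],
}

# Databases created from the blueprint; each starts as a copy of version 1.
databases = {name: dict(blueprint_v1) for name in ("orders-db", "billing-db", "crm-db")}


def roll_out(new_version: dict, stacks: dict) -> list:
    """Apply a new blueprint version to every stack; return the names that changed."""
    changed = []
    for name, current in stacks.items():
        if current != new_version:
            stacks[name] = dict(new_version)
            changed.append(name)
    return changed


print(roll_out(blueprint_v2, databases))  # all three databases are updated
print(roll_out(blueprint_v2, databases))  # second run changes nothing: []
```

Running the rollout twice shows a useful property: once every database matches the blueprint, applying the same version again is a no-op.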
Blueprints everywhere you look
There are essentially three basic categories of IaC blueprints:
- Infrastructural
- Configuring the environment (landing zone)
- Security
Blueprints for Application Infrastructure
These blueprints are then further divided into:
- Generic
- Application-specific
Generic blueprints include, for example, the database mentioned earlier. It is represented by a single IaC blueprint that all applications share. Other examples are the deployment of container infrastructure or a virtual server.
Application-specific blueprints are part of one specific application and are not used by any other application. These can be, for example, application parameters stored in AWS Systems Manager Parameter Store or Azure App Configuration, or application-specific secrets in AWS Secrets Manager or Azure Key Vault.
Blueprints for configuring the environment (landing zone)
I mentioned these blueprints a while ago. They are designed to configure the entire cloud environment and are always deployed, regardless of what application is running within the environment.
This includes, for example, components such as Budgets (so that some mechanism for cost control and reporting is deployed in each environment), configuration of the backbone network (for example, standard routing tables to my Hub network, as described by Jakub Procházka in his article Untangling cloud networks) or it can be, for example, a serverless function responsible for forwarding security logs to a central SIEM environment.
Security Blueprints
This is a specific category of blueprints that are also deployed in every environment, but are the responsibility of the security department. The purpose of security blueprints is to define security policies that check, or directly enforce, compliance with my security requirements (typically using components such as AWS Config or Azure Policy).
Examples include compliance with the tagging policy, whether all components are really encrypted, whether specific parameters of individual components are set in a specific way, etc.
You could argue that security blueprints are unnecessary, because all parameters are set "as they should be" thanks to application blueprints. This is true, but only at the moment such a blueprint is deployed. Over time, individual cloud resources may be modified (human error, unintended change, etc.). The purpose of security blueprints is therefore the continuous control of the entire environment. We can therefore think of them as "control blueprints".
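A minimal sketch of such a control, assuming a hypothetical tagging policy and a made-up resource inventory. It only illustrates the idea of evaluating rules against the current state; services such as AWS Config or Azure Policy have their own rule formats.

```python
REQUIRED_TAGS = {"owner", "cost-center"}  # hypothetical tagging policy


def check_resource(resource: dict) -> list:
    """Return a list of policy violations for one resource."""
    violations = []
    missing = REQUIRED_TAGS - set(resource.get("tags", {}))
    if missing:
        violations.append(f"missing tags: {sorted(missing)}")
    if not resource.get("encrypted", False):
        violations.append("storage is not encrypted")
    return violations


# Current state as it might be reported by the cloud provider (made-up data).
inventory = [
    {"name": "orders-db", "encrypted": True, "tags": {"owner": "shop", "cost-center": "42"}},
    {"name": "legacy-db", "encrypted": False, "tags": {"owner": "ops"}},
]

for resource in inventory:
    for violation in check_resource(resource):
        print(f"{resource['name']}: {violation}")
```

Run on a schedule, a check like this catches resources that were compliant when deployed but were changed afterwards.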
How it all plays together
So, if you decide to implement all three types of IaC blueprints (and you should), the result will look something like this:
First, the blueprints for the landing zone configuration are deployed to the environment, then the security blueprints, and finally the individual infrastructure blueprints.
This will result in a situation where:
- Each environment (in this case, the AWS environment) is deployed according to a global standard.
- Colleagues from the security department know that all compliance requirements for the environment are implemented.
- The application itself is deployed according to a clearly defined standard.
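The deployment order described above (landing zone, then security, then infrastructure) can be sketched as a simple ordering of blueprints by layer. The blueprint names are hypothetical.

```python
# Layers in the order they are deployed.
LAYER_ORDER = ["landing-zone", "security", "infrastructure"]

# Blueprints queued for one environment; names are hypothetical.
blueprints = [
    {"name": "orders-database", "layer": "infrastructure"},
    {"name": "encryption-policy", "layer": "security"},
    {"name": "budgets", "layer": "landing-zone"},
    {"name": "hub-routing", "layer": "landing-zone"},
    {"name": "tagging-policy", "layer": "security"},
]

# sorted() is stable, so blueprints keep their relative order within a layer.
deployment_plan = sorted(blueprints, key=lambda b: LAYER_ORDER.index(b["layer"]))

for step, blueprint in enumerate(deployment_plan, start=1):
    print(step, blueprint["layer"], blueprint["name"])
```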
Blueprint lifecycle
We should definitely not forget that individual blueprints change over time and evolve as your internal architectural or security requirements change. It is therefore important to have a defined procedure for changing blueprints over time.
We treat them in a similar way to any other program code.
Typically, the resulting blueprints are stored in a central repository (such as AWS S3 or Azure Storage Account) from which they are deployed. This repository should not, under any circumstances, allow direct modification of these blueprints!
A typical scenario for controlled and audited blueprint modification is that individual versions are stored in a central source-code repository (e.g. Git). When a new version of the blueprint is committed, a CI/CD pipeline is launched, the first step of which is validation of the changes made (the review process).
If the review is successful, the new version of the blueprint is deployed to a test environment, where it is tested that "everything is as it should be," and only then is the new version of the blueprint finally made available in the central blueprint repository.
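The gated flow described above can be sketched as three small functions. The validation and test checks are deliberately simplistic stand-ins, and all names and version numbers are hypothetical; the point is only that a blueprint reaches the central repository through the pipeline and in no other way.

```python
def validate(blueprint: dict) -> bool:
    # Stand-in for review and linting: a version and at least one resource.
    return bool(blueprint.get("version")) and bool(blueprint.get("resources"))


def passes_test_deployment(blueprint: dict) -> bool:
    # Stand-in for deploying to a test environment and checking the result.
    return all(r.get("encrypted", False) for r in blueprint["resources"].values())


def publish(blueprint: dict, repository: dict) -> bool:
    """Run the gates in order; only a blueprint that passes both is published."""
    if not validate(blueprint):
        return False
    if not passes_test_deployment(blueprint):
        return False
    repository[blueprint["version"]] = blueprint
    return True


central_repository = {}  # written to by the pipeline only
good = {"version": "1.1.0", "resources": {"db": {"encrypted": True}}}
bad = {"version": "1.2.0", "resources": {"db": {"encrypted": False}}}

print(publish(good, central_repository))  # True
print(publish(bad, central_repository))   # False
print(sorted(central_repository))         # ['1.1.0']
```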
Desired state and why it is so important
Let's take a closer look at what a desired state is and why you should care. The desired state approach describes the "target state of the object", i.e. that my database (for example) should look like "so-and-so".
It is thus a declarative approach: I declare "how it should be". This differs from what you may be used to, the imperative approach, where I define "what should be configured, and how".
Example of a declarative approach:
- My database should be named MyDb.
- The database should be 20 GB in size.
- The performance of the database is 2 CPU and 4 GB RAM.
- The database disk is to be encrypted.
Example of an imperative approach:
- Create a database named MyDb.
- Configure the MyDb database to be 20 GB in size.
- Configure the MyDb database to have 2 CPUs and 4 GB of RAM.
- Encrypt the MyDb database disk.
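The difference between the two lists can be sketched in code: with a declarative description, the steps are derived by comparing the desired state with whatever currently exists, instead of being written out by hand. The field names and the plan function are assumptions made for this sketch.

```python
# Desired state from the declarative list above.
desired = {"name": "MyDb", "size_gb": 20, "cpus": 2, "ram_gb": 4, "encrypted": True}


def plan(current, desired):
    """Derive the steps needed to reach the desired state from the current one."""
    if current is None:
        return [f"create database with {desired}"]
    return [
        f"set {key}: {current.get(key)!r} -> {value!r}"
        for key, value in desired.items()
        if current.get(key) != value
    ]


print(plan(None, desired))           # nothing exists yet: one create step
print(plan(dict(desired), desired))  # already as declared: []

existing = dict(desired, size_gb=10)
print(plan(existing, desired))       # ['set size_gb: 10 -> 20']
```

The imperative list corresponds to a fixed script that is correct only for one starting point; the declarative description yields the right steps from any starting point.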
A huge advantage of declarative notation is that I have the desired state – i.e. information about how it should be. If I have information about the target state, I am able to easily check the current state against the target state at any time.
For example, in the beginning, the database was encrypted, but over time someone (for example, due to human error) decrypted it. If I know the current and target state, I am able to detect this situation, including detailed information about "what is different than it should be".
Example of using desired state
An example from AWS CloudFormation drift detection, a feature that allows us to detect deviations from the target state:
Here I can see at first glance that the ManagementSecurityGroup resource has been modified and currently does not correspond to its target state. Based on this information, I can take corrective steps if necessary.
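The principle behind such a report can be sketched as a comparison of expected and actual properties per resource. This is not CloudFormation's implementation; the property names and values are made up, and the status labels merely imitate the style of its drift report.

```python
def detect_drift(expected: dict, actual: dict) -> dict:
    """Compare expected and actual properties of each resource."""
    report = {}
    for name, expected_props in expected.items():
        actual_props = actual.get(name, {})
        differences = {
            key: {"expected": value, "actual": actual_props.get(key)}
            for key, value in expected_props.items()
            if actual_props.get(key) != value
        }
        report[name] = {
            "status": "MODIFIED" if differences else "IN_SYNC",
            "differences": differences,
        }
    return report


# Made-up data: someone opened SSH on the management security group.
expected = {
    "ManagementSecurityGroup": {"ssh_cidr": "10.0.0.0/24"},
    "MyDb": {"encrypted": True},
}
actual = {
    "ManagementSecurityGroup": {"ssh_cidr": "0.0.0.0/0"},
    "MyDb": {"encrypted": True},
}

for name, result in detect_drift(expected, actual).items():
    print(name, result["status"], result["differences"])
```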
AWS CloudFormation drift detection is one of the instruments in the category of Continuous Cloud Compliance, a slightly broader area dedicated to the continuous evaluation of defined rules. It includes a plethora of tools: native ones such as Azure Policy, AWS Config, or CloudFormation drift detection, as well as external ones such as Check Point CloudGuard.
The topic of continuous cloud compliance deserves a separate article.
So, what tools can I use?
If you've read this far, congratulations! You've probably been intrigued by the concept of Infrastructure as Code and are now wondering how to actually implement it.
The first decision you need to make is whether you want to focus on just one specific cloud environment, or if you want to manage everything from on-premises across different cloud environments with one tool.
If your goal is to be truly hybrid, our recommendation is to use Terraform from HashiCorp. This tool is extremely powerful, but from my point of view, deploying it comprehensively can be a relatively complicated task. In return, it offers a truly hybrid approach to IaC across any environment.
If, on the other hand, you want to focus only on one specific cloud environment, I would recommend using the tools of your chosen cloud provider.
In Amazon Web Services, it's CloudFormation. An astute reader will have noticed that this article uses samples primarily from this tool, which I personally prefer.
What may be interesting is the option of using the associated Cloud Development Kit (CDK) to create CloudFormation blueprints. CDK will probably appeal to those of you who work mainly in development. It allows you to define CloudFormation blueprints in general-purpose programming languages such as JavaScript, TypeScript, Python, Java, C#, and Go.
If your cloud environment is Azure, you will definitely use Azure Resource Manager (ARM) templates.
ARM is native and an integral part of Azure. If you're paying attention, you've probably noticed that Azure translates all your operations (performed in the Azure graphical console) related to the creation of any resource into an ARM template, which it then applies.
If you want to get inspired by what these "internal" ARM templates look like, you can try to create any resource and download this generated ARM template.
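For orientation, a minimal sketch of the overall shape of an ARM template, here for a storage account, built as a Python dictionary and printed as JSON. The parameter name, SKU, and API version are assumptions chosen for illustration; a template exported from Azure will typically contain many more properties.

```python
import json

# Minimal ARM template for a storage account. The parameter name and the
# chosen SKU are assumptions for this sketch.
arm_template = {
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "storageAccountName": {"type": "string"},
    },
    "resources": [
        {
            "type": "Microsoft.Storage/storageAccounts",
            "apiVersion": "2022-09-01",
            # Template expressions in square brackets are evaluated at deployment.
            "name": "[parameters('storageAccountName')]",
            "location": "[resourceGroup().location]",
            "sku": {"name": "Standard_LRS"},
            "kind": "StorageV2",
            "properties": {"supportsHttpsTrafficOnly": True},
        }
    ],
}

print(json.dumps(arm_template, indent=2))
```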
Closing thoughts on Infrastructure as Code
If you're seriously considering automating your cloud environment, Infrastructure as Code is definitely the direction to take, whichever tool you choose. And honestly, if you're thinking about how to use the cloud to its fullest, automation should automatically be part of that thinking.
At first glance, Infrastructure as Code may look complicated, but trust me, once you try to deploy several components into your environment and get familiar with the whole concept of IaC, you probably won't want to create your environment "manually" anymore. We will return to this topic in one of the next articles.