By Brad Johnson, Lead DevOps Engineer
When infrastructure automation comes up, Terraform and Ansible are usually the first tools mentioned. Both do some things really well, but both also have limitations. Terraform is an infrastructure as code tool, whereas Ansible is a configuration management tool that can also do infrastructure as code. I’ve had people ask how the tools compare and which one to use when, so let’s explore both and talk about the benefits of each.
First, why would you use Terraform? The single most important reason is that Terraform, like Ansible, is platform agnostic. This means that if you have a hybrid or multi-cloud application or service, you can use Terraform to manage all of its infrastructure in a single repository. Cloud vendor specific solutions like AWS CloudFormation templates work well, but they can only be used within the platform that provides them. Because Terraform supports multiple providers, you can define on-premises and AWS/GCP/Azure cloud VMs, load balancers, DNS, and network configuration in the same set of files. Using a single common configuration language gives you greater flexibility when transitioning to new environments and reduces vendor lock-in.
Another reason to consider Terraform is that, unlike Ansible, it works by comparing the current state with the desired state. If you deploy a VM via Terraform and later delete that block of configuration, Terraform will delete the VM the next time you apply. Your Terraform code is therefore a declarative description of your infrastructure. With Ansible you would need to write additional code to get similar behavior, because Ansible is not aware of the state from previous runs. Terraform also lets you see what it will do before it changes anything, using the ‘terraform plan’ command.
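To make the current vs desired state idea concrete, here is a minimal Python sketch of how a plan could be derived by comparing a configuration with recorded state. It is a toy model under stated assumptions, not Terraform's actual algorithm: the `plan` function and the resource names (`vm_web`, `vm_db`, `dns_web`) are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple

Resource = Dict[str, str]


def plan(desired: Dict[str, Resource], state: Dict[str, Resource]) -> List[Tuple[str, str]]:
    """Return the (action, name) steps that would move `state` to `desired`."""
    actions: List[Tuple[str, str]] = []
    # In the configuration but not in recorded state: needs to be created.
    for name in sorted(desired.keys() - state.keys()):
        actions.append(("create", name))
    # In both, but with different attributes: needs to be updated.
    for name in sorted(desired.keys() & state.keys()):
        if desired[name] != state[name]:
            actions.append(("update", name))
    # In recorded state but no longer in the configuration: needs to be deleted.
    for name in sorted(state.keys() - desired.keys()):
        actions.append(("delete", name))
    return actions


# Hypothetical state recorded by a previous run.
state = {"vm_web": {"size": "small"}, "vm_db": {"size": "large"}}

# Hypothetical configuration after an edit: vm_db was removed from the code,
# vm_web was resized, and a DNS record was added.
desired = {"vm_web": {"size": "medium"}, "dns_web": {"name": "web.example.com"}}

planned = plan(desired, state)
for action, name in planned:
    print(f"{action}: {name}")
```

The point of the sketch is the last loop in `plan`: the delete step is only possible because something remembers what was provisioned earlier, which is the piece a stateless tool lacks.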
However, if you already have a significant amount of infrastructure deployed, it can be time consuming to import your current environment so that Terraform can manage it. You can use Terraform for new deployments without importing the existing environment, but it won’t be able to manage those existing resources. Terraform also records what it has provisioned in a state file. This means that if multiple people work on the code, they must all run it against the same shared state file. When you deploy to a cloud provider, you can keep the state file in a cloud storage bucket. The ideal solution might be to use a CI or orchestration system to run ‘terraform apply’ for infrastructure changes, and to gate the process via approvals in ITSM. It is critical to ensure that actual changes are applied from a single source of truth, like a master Git branch. Also, while Terraform is extensible with custom providers, you will need to write them in Go, which is not yet as widely used as Python.
Now let’s look at why you would use Ansible. The best thing about Ansible is that it can handle a wide variety of configuration and deployment tasks using standard modules, and it’s easily extensible with Python. You can deploy a VM, use templates for custom configuration files, communicate with REST APIs, interact with Git repos, and easily configure Linux or custom software, all using standard modules that are already available. Building your own custom Ansible modules, which typically isn’t needed given the exhaustive Ansible library, requires minimal programming effort. An example ‘hello world’ module only requires 4 lines of Python code. Drop the code in a ‘library’ directory next to your playbook and you’re ready to use it. Ansible also comes with ‘ansible-vault’, which lets you store sensitive variables in encrypted YAML files in your playbook repo and decrypt them at runtime using a vault password. Because of these features, you can easily implement a wide variety of use cases with Ansible to achieve configuration as code. Some example cases we’ve used Ansible for include deploying Linux OS hardening changes to meet security standard compliance, configuring Apache Tomcat and Oracle WebLogic as part of application server deployment, integrating with ITSM (IT Service Management) and CMDB (Configuration Management Database) platforms, and interacting with silent installers and CLIs using Keyva-built custom modules, like one for Python Pexpect.
Now, given that Ansible does not store the state of resources, you will need to write playbooks to handle their removal. Even if you deployed something and Ansible made sure it was ‘present’, to remove it you would usually need to run the same task against the named resource with the state set to ‘absent’. For simple things like removing a file, this is easy: once the task has run everywhere, you can delete it from the playbook. For more complex use cases, you can get around this limitation by writing playbooks that query existing resources into variable lists, compare them to what is defined in Ansible, and then remove the items that do not match. However, this takes additional time, is more complex, and does not account for manual changes made to the target resources themselves. From a configuration compliance and remediation standpoint, removing anything that is not defined in code may actually be desirable for some organizations.
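The present/absent pattern and the query-and-compare workaround described above can be sketched in plain Python. This is only an illustration of the logic, not Ansible code: the `ensure` and `find_unmanaged` helpers and the user names are hypothetical, and in a real playbook the "existing" set would come from querying the target host.

```python
from typing import Iterable, List, Set


def ensure(existing: Set[str], name: str, state: str) -> bool:
    """Idempotently make `name` present or absent; return True if anything changed."""
    if state == "present":
        if name in existing:
            return False  # already there, nothing to do
        existing.add(name)
        return True
    if state == "absent":
        if name not in existing:
            return False  # already gone, nothing to do
        existing.remove(name)
        return True
    raise ValueError(f"unsupported state: {state}")


def find_unmanaged(existing: Iterable[str], declared: Iterable[str]) -> List[str]:
    """Return items found on the target that are not declared in code."""
    return sorted(set(existing) - set(declared))


# Hypothetical result of querying a target host for its users.
users_on_host = {"alice", "bob", "old_contractor"}
# Hypothetical list of users declared in the playbook variables.
declared_users = ["alice", "bob", "carol"]

# Step 1: make sure everything declared is present.
changes = [(user, ensure(users_on_host, user, "present")) for user in declared_users]

# Step 2: with no stored state, removal requires comparing against the live system.
removed = find_unmanaged(users_on_host, declared_users)
for user in removed:
    ensure(users_on_host, user, "absent")

print(changes)
print(removed)
```

Note that the comparison only covers whatever the query returns, so every resource type you want cleaned up needs its own query and comparison step.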
What’s great about both tools is that they can work with each other. There’s no reason to believe that one tool needs to own the whole process. They can do similar things, but given their differences in scope, neither is a full replacement for the other. Terraform can be set up to run Ansible on a host after provisioning, for example through a provisioner, so that Ansible handles the configuration of that host. Likewise, Ansible can use its Terraform module to plan or apply a Terraform project as a step within a playbook. The Ansible module for Terraform also returns the outputs from Terraform as variables that Ansible can consume and use for further action. When designing and implementing infrastructure-as-code in your environment, it is important to consider which tool is best suited for each part of the task. It is also important to consider combining Terraform with Ansible when deploying infrastructure. If you need help getting started or advice on best practices around implementing infrastructure-as-code, please reach out to us.
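As a rough illustration of consuming Terraform outputs as variables, the sketch below parses the JSON that ‘terraform output -json’ prints, where each output name maps to an object with ‘value’, ‘type’, and ‘sensitive’ fields. The `flatten_outputs` helper and the sample values are hypothetical, and this is not how the Ansible module is implemented; it only shows the shape of the data being handed over.

```python
import json
from typing import Any, Dict


def flatten_outputs(raw_json: str, include_sensitive: bool = False) -> Dict[str, Any]:
    """Turn `terraform output -json` text into a flat {name: value} mapping."""
    outputs = json.loads(raw_json)
    return {
        name: meta["value"]
        for name, meta in outputs.items()
        # Skip outputs marked sensitive unless the caller explicitly asks for them.
        if include_sensitive or not meta.get("sensitive", False)
    }


# Hypothetical sample of what `terraform output -json` might print.
sample = """
{
  "web_ip": {"sensitive": false, "type": "string", "value": "203.0.113.10"},
  "db_password": {"sensitive": true, "type": "string", "value": "example-only"}
}
"""

variables = flatten_outputs(sample)
print(variables)
```

Leaving sensitive outputs out by default is a design choice of this sketch, so that secrets do not end up in logs by accident.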
Brad is an expert in automation using Ansible, Python and pexpect to develop custom solutions and automate the things that “can’t be automated”. Prior to Keyva, Brad worked at Cray R&D for 6 years and led automation efforts across their XC supercomputer development environment. Brad has a passion for learning new technology, technical problem solving and helping others.
Like what you read? Follow Brad on LinkedIn at: https://www.linkedin.com/in/bradejohnson/