Using feature flags and dynamic blocks in Terraform 0.12.x

With the advent of the new HCL 2 language in Terraform, I’ve been using the newly available methods to make the internal Terraform modules that my team has written more functional and less inter-dependent.

One of the issues I’ve been trying to resolve in the Terraform codebase is the dependency of one module on related resources in another module. In many scenarios, I want a security group I’m creating inside a module to include an ingress rule for a security group that was created in a different module. This leads to ordering problems and often results in failed terraform apply commands. Sometimes running the code twice will fix the problem, but to me that feels less than ideal. Modularization of Terraform code is important in my mind: it lets you do less work and get better results, and bringing up new environments is easier when you don’t have to cut, paste, and hand-edit large sections of boilerplate code. Better to push the boilerplate into a module for re-use!

To make the modules more flexible, I decided to take a page out of the front-end development book and use feature flags in my Terraform code. For anyone not familiar, a feature flag is a boolean setting that controls whether a particular piece of code is ‘enabled’ in a system. This is handy when you have a new feature to test, as it gives you a quick switch to turn the feature on, and off again in the event of a problem. Without a central place for feature flags, it would be easy to roll the feature back in one module but forget to roll it back in others, resulting in possible security issues, dependency problems between your modules, or failed terraform apply commands.

For Terraform, the idea of a boolean switch is nothing new. Many professional modules have an ‘enabled’ variable that turns the entire module, or sometimes a particular feature, on or off. For examples, check out the work of the fine folks over at Cloudposse, and their excellent “SweetOps” approach to Terraform. For efficiency and safety, I didn’t want to pass a separate ‘enabled’ variable into each module, so I chose to use a map of boolean values that I could pass as a single variable into all of my modules.

In our Terraform code, we have a main.tf in each environment that contains the backend definition, provider, and a set of base AWS tags that will be applied to each instance. I decided to add a Terraform map of booleans here, like so:

variable "module_features" {
   type = map(bool)
   default = {
       feature_1_enabled = true
       feature_2_enabled = false
   ...
   }
}

I can now pass this variable into each Terraform module that I use in my infrastructure. This allows me to have feature flags for options in our environment that are set in one place, rather than having to pass discrete ‘feature_enabled’ input variables into each module. If there are interdependent security groups, or other resources, there is only one place to turn the feature on for all modules.

module "module_1" {
  source = "./modules/module_1"
  module_features = var.module_features
  ...
}
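For the call above to work, the module itself has to declare an input variable with the same name. A minimal sketch of that declaration, assuming the module keeps the name module_features and does not set a default (the description text is my own wording):

```hcl
# Inside modules/module_1 (for example in a variables.tf file).
# The name must match the argument used in the module block above.
variable "module_features" {
  type        = map(bool)
  description = "Feature flags shared by all modules in an environment."
}
```

Leaving out a default means every caller has to pass the map explicitly, so a module can never silently run with a different set of flags than its neighbours.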

Inside our modules, using the ternary operator, I can now control resource creation:

data "aws_security_group" "feature_1_group" {
   count = var.module_features["feature_1_enabled"] ? 1 : 0
   ...
}
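One thing to watch with the index syntax is that a key missing from the map fails the plan. A hedged variation, assuming you would rather treat a missing flag as "off", uses the built-in lookup function with a default. The feature_1_group_name variable is hypothetical and only exists to make the sketch complete:

```hcl
variable "module_features" {
  type    = map(bool)
  default = {}
}

# Hypothetical input: the name of the security group created by the other module.
variable "feature_1_group_name" {
  type    = string
  default = "feature-1"
}

locals {
  # A missing key is treated as "feature off" instead of raising an error.
  feature_1_enabled = lookup(var.module_features, "feature_1_enabled", false)
}

data "aws_security_group" "feature_1_group" {
  # Zero instances when the flag is off, exactly one when it is on.
  count = local.feature_1_enabled ? 1 : 0
  name  = var.feature_1_group_name
}
```

Because the data source uses count, any reference to it elsewhere has to handle the empty case, which is why the splat form ([*].id) is convenient.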

Now that I have the groundwork laid for feature flags, I can turn my attention to creating security group rules. There are two ways of creating rules for security groups in Terraform: inline and discrete. This post is not the place to debate the relative pros and cons of each approach, and others have done that before. I’m using the inline method.

In order to add conditional rules with an inline approach, I need to use the for expressions and dynamic blocks (with for_each) that HCL 2 added in Terraform 0.12.x. Here’s the functional part of the code, taken piece by piece:

...
elb_ingress_list = {
   feature_1 = var.module_features["feature_1_enabled"] ? {
     enabled         = true
     from_port       = 1234
     to_port         = 2234
     protocol        = "tcp"
     security_groups = compact(tolist(data.aws_security_group.feature_1_group[*].id))
   } : local.disabled_rule
   ...
 }

Each ingress rule you want to specify goes into an object. The object key (feature_1) is not important. The ternary will insert the rule if var.module_features["feature_1_enabled"] is set to true and will insert the disabled_rule if it is set to false. The disabled rule can’t just be null, because the for expression further down reads attributes such as enabled from every entry. It must be an object with the same attributes as a real rule, in this case one that grants no access.
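The definition of local.disabled_rule isn’t shown in the snippet above. A minimal sketch of what such a placeholder could look like, assuming it only needs to mirror the attribute names of a real rule and set enabled to false (the port and protocol values here are arbitrary):

```hcl
locals {
  # Hypothetical placeholder; the values are illustrative only.
  # It has the same attributes as an enabled rule so that both branches
  # of the conditional produce objects of the same shape.
  disabled_rule = {
    enabled         = false
    from_port       = 0
    to_port         = 0
    protocol        = "tcp"
    # No source groups, so the rule would grant no access even if applied.
    # Depending on your Terraform version, an explicit type conversion may
    # be needed so this matches the list of strings in the enabled branch.
    security_groups = []
  }
}
```

Since enabled is false, this object never becomes an actual ingress rule; it only exists to keep the conditional well-formed.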

Next, I’ll build the list of dynamic ingress rules using the for and for_each functionality:

dynamic "ingress" {
  # Build a list of rule objects, keeping only the enabled ones.
  for_each = [
    for i in local.elb_ingress_list :
    {
      enabled         = i.enabled
      from_port       = i.from_port
      to_port         = i.to_port
      protocol        = i.protocol
      security_groups = i.security_groups
    } if i.enabled == true
  ]

  # One ingress block is generated per element of for_each.
  content {
    from_port       = ingress.value.from_port
    to_port         = ingress.value.to_port
    protocol        = ingress.value.protocol
    security_groups = ingress.value.security_groups
  }
}

You can read this from the inside out. Each item in the local.elb_ingress_list map is evaluated by the for expression, and a filter (if i.enabled == true) removes any rules that don’t have enabled = true. The for_each argument then iterates over the rules that remain, and the content {} block defines the body of a separate ingress rule for each one.
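To show where the dynamic block lives, here is a self-contained sketch of a security group that uses it. The resource name, the vpc_id variable, and the literal rule data are all stand-ins of mine rather than code from our modules. The for expression is also shortened to pass each rule object through unchanged, which should be equivalent when no attributes need renaming:

```hcl
# Hypothetical input; how the VPC is chosen is outside the scope of this sketch.
variable "vpc_id" {
  type = string
}

locals {
  # Literal stand-in data so the example does not depend on other modules.
  elb_ingress_list = {
    feature_1 = {
      enabled         = true
      from_port       = 1234
      to_port         = 2234
      protocol        = "tcp"
      security_groups = ["sg-0123456789abcdef0"] # placeholder ID
    }
    feature_2 = {
      enabled         = false
      from_port       = 0
      to_port         = 0
      protocol        = "tcp"
      security_groups = []
    }
  }
}

resource "aws_security_group" "elb" {
  name   = "example-elb"
  vpc_id = var.vpc_id

  dynamic "ingress" {
    # Keep only enabled rules; each surviving object becomes one ingress block.
    for_each = [for rule in local.elb_ingress_list : rule if rule.enabled]

    content {
      from_port       = ingress.value.from_port
      to_port         = ingress.value.to_port
      protocol        = ingress.value.protocol
      security_groups = ingress.value.security_groups
    }
  }
}
```

With this data, only the feature_1 rule should end up on the group, since feature_2 is filtered out before for_each ever sees it.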

This all allows you to build the security group with inline rules but also allows flexible rulesets when modules have interdependencies.

Of course, you can do all this easily for discrete rules like so:

resource "aws_security_group_rule" "feature_1_ingress" {
   count = var.module_features["feature_1_enabled"] ? 1 : 0
   ....
}
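For comparison, a fuller sketch of the discrete version might look like the following. The two security group ID variables are hypothetical inputs, and the ports simply reuse the numbers from the inline example:

```hcl
variable "module_features" {
  type = map(bool)
}

# Hypothetical inputs: the group the rule is attached to, and the group allowed in.
variable "elb_security_group_id" {
  type = string
}

variable "feature_1_source_group_id" {
  type = string
}

resource "aws_security_group_rule" "feature_1_ingress" {
  # The rule exists only while the flag is on.
  count = var.module_features["feature_1_enabled"] ? 1 : 0

  type                     = "ingress"
  from_port                = 1234
  to_port                  = 2234
  protocol                 = "tcp"
  security_group_id        = var.elb_security_group_id
  source_security_group_id = var.feature_1_source_group_id
}
```

Here the flag controls the whole resource through count, so no placeholder rule or filtering is needed.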

In the end, I found the for/for_each method to be somewhat more complex, but it combines the flexibility of discrete rules with the benefits of inline rules.

When writing Terraform code in HCL 2, there are a lot of new methods and functionality to consider. It was important for me to weigh the possible added complexity of the loops and dynamic rules against the ease of use in the modules, as well as the difficulty of onboarding new developers into the codebase. Using a feature flag approach makes the modules much easier to use, with less chance of human error causing a security problem or failure to build, allowing developers to experiment, add features, and modify the codebase with less risk.