If it ain't broke...
In the previous post, we saw how the Terraform lifecycle rule create_before_destroy can help prevent a deadlock when renaming security groups. In this post, we will see how using the same lifecycle rule in the wrong place creates a problem.
To recap, when renaming a security group, you need to replace
resource "aws_security_group" "test_1" {
  name = "test-1-new-name"
}
with
resource "aws_security_group" "test_1" {
  name = "test-1-new-name"
  lifecycle {
    create_before_destroy = true
  }
}
This ensures the following series of steps:
1. Create a security group with the new name.
2. Associate the new security group with the instance.
3. Destroy the old security group.
This made me think: using a lifecycle rule seems like a good practice, so let me use it for the aws_security_group_rule resource as well. That was a presumptuous mistake. Let us see how.
We will replicate the same infrastructure setup scenario:
1. Create an EC2 instance (or any other resource which uses security groups).
2. Associate one or more security groups to the instance.
The above infra can be created through update-security-group-rule/v1/main.tf
$ terraform init
$ terraform apply
Sample terraform output
aws_instance_test_1 = i-02d50e0a62110bbc6
security_group_test_1 = sg-03cc308342b10ebe5
security_group_test_2 = sg-03c1cbe2eb0ace857
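The v1 file itself is not reproduced here. As a rough sketch, a configuration along these lines would yield outputs like the ones above. The resource and output names follow the rest of this post; the empty provider block, the ami_id variable, the instance type, and the security group names are assumptions made for illustration. The aws_security_group_rule resource is omitted.

```hcl
# Sketch only: region and credentials are assumed to come from the environment.
provider "aws" {}

# Hypothetical input; any AMI valid in your region will do.
variable "ami_id" {
  type = string
}

resource "aws_security_group" "test_1" {
  name = "test-1" # assumed name

  # Lets a later rename create the replacement group before
  # the old one is destroyed.
  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_security_group" "test_2" {
  name = "test-2" # assumed name

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_instance" "test_1" {
  ami           = var.ami_id
  instance_type = "t3.micro" # assumed instance type

  # Associate both security groups with the instance.
  vpc_security_group_ids = [
    aws_security_group.test_1.id,
    aws_security_group.test_2.id,
  ]
}

output "aws_instance_test_1" {
  value = aws_instance.test_1.id
}

output "security_group_test_1" {
  value = aws_security_group.test_1.id
}

output "security_group_test_2" {
  value = aws_security_group.test_2.id
}
```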
However, for the next step, instead of renaming the security group, we will add one more entry to cidr_blocks in our aws_security_group_rule, i.e. we will update
resource "aws_security_group_rule" "sg_2_rule_1" {
  from_port         = 8080
  protocol          = "tcp"
  to_port           = 8080
  security_group_id = aws_security_group.test_2.id
  cidr_blocks       = ["0.0.0.0/0"] # this line will be changed
  lifecycle {
    create_before_destroy = true
  }
  type = "ingress"
}
to
resource "aws_security_group_rule" "sg_2_rule_1" {
  from_port         = 8080
  protocol          = "tcp"
  to_port           = 8080
  security_group_id = aws_security_group.test_2.id
  cidr_blocks       = ["0.0.0.0/0", "1.1.1.1/32"]
  lifecycle {
    create_before_destroy = true
  }
  type = "ingress"
}
$ terraform apply
# aws_security_group_rule.sg_2_rule_1 must be replaced
+/- resource "aws_security_group_rule" "sg_2_rule_1" {
~ cidr_blocks = [ # forces replacement
"0.0.0.0/0",
+ "1.1.1.1/32",
]
from_port = 8080
~ id = "sgrule-1489633736" -> (known after apply)
.
.
}
aws_security_group_rule.sg_2_rule_1: Creating...
Error: [WARN] A duplicate Security Group rule was found on (sg-03c1cbe2eb0ace857). This may be a side effect of a now-fixed Terraform issue causing two security groups with identical attributes but different source_security_group_ids to overwrite each other in the state. See https://github.com/hashicorp/terraform/pull/2376 for more information and instructions for recovery.
Error message: the specified rule "peer: 0.0.0.0/0, TCP, from port: 8080, to port: 8080, ALLOW" already exists
What happened?
1. Initially, the security group had the following rule associated with it:
direction | from_port | to_port | source | rule
ingress | 8080 | 8080 | 0.0.0.0/0 | allow
2. We tried creating a new rule which has the following entries:
direction | from_port | to_port | source | rule
ingress | 8080 | 8080 | 0.0.0.0/0 | allow
ingress | 8080 | 8080 | 1.1.1.1/32 | allow
3. Because of the lifecycle rule create_before_destroy, Terraform tries to create the step-2 rule before destroying the existing one. Both rules share the entry
direction | from_port | to_port | source | rule
ingress | 8080 | 8080 | 0.0.0.0/0 | allow
A security group cannot hold two identical rule entries (try creating a duplicate entry in the AWS console), so the create step fails with the error
Error message: the specified rule "peer: 0.0.0.0/0, TCP, from port: 8080, to port: 8080, ALLOW" already exists
This can be fixed by, you guessed it, removing the lifecycle rule from the aws_security_group_rule block, as in update-security-group-rule/v2/main.tf. Terraform then destroys the old rule before creating the new one:
aws_security_group_rule.sg_2_rule_1: Destroying... [id=sgrule-1489633736]
aws_security_group_rule.sg_2_rule_1: Destruction complete after 0s
aws_security_group_rule.sg_2_rule_1: Creating...
aws_security_group_rule.sg_2_rule_1: Creation complete after 1s [id=sgrule-2162410043]
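For reference, the rule after the fix would look roughly like the sketch below. It is the updated resource from earlier in this post with the lifecycle block dropped; it assumes the rest of the v2 file is unchanged and that aws_security_group.test_2 is defined as before.

```hcl
resource "aws_security_group_rule" "sg_2_rule_1" {
  type              = "ingress"
  from_port         = 8080
  to_port           = 8080
  protocol          = "tcp"
  security_group_id = aws_security_group.test_2.id
  cidr_blocks       = ["0.0.0.0/0", "1.1.1.1/32"]

  # No lifecycle block here. Terraform falls back to its default
  # replacement order: destroy the old rule first, then create the
  # new one, so the two never coexist in the security group.
}
```

The trade-off is a short window between the destroy and the create in which the rule does not exist, about a second in the run above.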
Lessons learnt:
The lifecycle rule create_before_destroy is useful in the aws_security_group block, but harmful in the aws_security_group_rule block.
If it ain't broke, don't fix it. Again.
This article is crossposted on Last9’s engineering blog here.