I set up a `helm` release (from a private ECR repository) using the `terraform` `helm` provider today. This process requires many components to be configured correctly, so I decided to write the process down as a guide.
Debugging problems (yes, you will run into some) can be tricky, and I'm pretty happy with my approach to solving the only problem I ran into. I'm sharing it here as well, since I failed to google anything really useful.
Context
- ECR is set up in its own AWS account (`account_id=<ecr_aws_account_id>`)
- other resources are provisioned in environment-specific AWS accounts (`account_id=<{dev|stg|prd}_aws_account_id>`)
- I'm using a parent AWS account's role to assume the correct roles for each resource (see the sketch after this list)
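To make that last point concrete, here is a minimal sketch of an environment-account `aws` provider; the `terraform-deployer` role name is hypothetical and only for illustration:

```hcl
# Sketch: parent-account credentials come from the environment;
# each environment account is reached by assuming a role in it.
provider "aws" {
  region = "<aws_region>"

  assume_role {
    # hypothetical role name; <dev_aws_account_id> is a placeholder
    role_arn = "arn:aws:iam::<dev_aws_account_id>:role/terraform-deployer"
  }
}
```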
Versions involved
I'm pinning the versions I used; I don't expect a huge discrepancy if more recent versions are used.
- `helm`: 3.8.2
- `terraform`: 1.4.4
- `terraform/helm`: 2.10.1
Things to set up
ECR Repository
- create the ECR repository with `terraform` if not already done (e.g., with terraform-aws-ecr; see the sketch after this list)
- you need to specify the ARNs of the identities (IAM user, role, etc.) that get read-only and read-write permission:
  - the identity used in the chart repo to push the chart requires read-write permission
  - the identity used in your k8s node requires read-only permission
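Here is a minimal sketch using the terraform-aws-ecr module; the input names reflect my reading of that module's docs and the role names are placeholders, so double-check against the module version you use:

```hcl
module "ecr" {
  source  = "terraform-aws-modules/ecr/aws"
  version = "~> 1.6"

  # the ECR repository name must match the chart name for helm push
  repository_name = "my_chart"

  # identity that pushes the chart from the chart repo (e.g., CI)
  repository_read_write_access_arns = ["arn:aws:iam::<ecr_aws_account_id>:role/<ci_role>"]

  # identity used by your k8s nodes to pull the chart
  repository_read_access_arns = ["arn:aws:iam::<ecr_aws_account_id>:role/<node_role>"]
}
```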
Chart repo
- ECR authentication: `aws ecr get-login-password --region ${ECR_AWS_REGION} | ${REGISTRY_LOGIN_CMD} --username AWS --password-stdin ${ECR_AWS_ACCOUNT}.dkr.ecr.${ECR_AWS_REGION}.amazonaws.com`, where
  - `ECR_AWS_REGION` is the region of your destination ECR registry
  - `REGISTRY_LOGIN_CMD` is:
    - `docker login` for docker images
    - `helm registry login` for helm charts
  - `ECR_AWS_ACCOUNT` is `<ecr_aws_account_id>`
- `helm package`: package the chart into a semver-versioned `.tgz` file
- `helm push`: push the chart to the destination ECR repository (the full sequence is sketched below)
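Put together, a minimal sketch of the publish sequence, assuming a chart directory named `my_chart` whose `Chart.yaml` version is `1.2.3`:

```sh
# log in to ECR as a helm OCI registry
aws ecr get-login-password --region ${ECR_AWS_REGION} \
  | helm registry login --username AWS --password-stdin \
      ${ECR_AWS_ACCOUNT}.dkr.ecr.${ECR_AWS_REGION}.amazonaws.com

# package the chart; produces my_chart-1.2.3.tgz
helm package ./my_chart

# push the chart; helm appends the chart name, so the ECR
# repository created earlier must be named my_chart
helm push my_chart-1.2.3.tgz oci://${ECR_AWS_ACCOUNT}.dkr.ecr.${ECR_AWS_REGION}.amazonaws.com
```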
K8s terraform repo
- add the `helm` provider if not already done, and add the `registry` block:
data "aws_eks_cluster_auth" "this" {
name = "<cluster_id>"
}
data "aws_ecr_authorization_token" "token" {
registry_id = "<ecr_aws_account_id>"
}
provider "helm" {
kubernetes {
host = "<cluster_endpoint>"
cluster_ca_certificate = base64decode("<_cluster_certificate_authority_data>")
token = data.aws_eks_cluster_auth.this.token
}
registry {
url = "oci://<ecr_aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com"
username = data.aws_ecr_authorization_token.token.user_name
password = data.aws_ecr_authorization_token.token.password
}
}
- define and configure the `helm_release` resource:
resource "helm_release" "my_release" {
name = "my_release"
namespace = "my_release"
create_namespace = true
repository = "oci://<ecr_aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com"
chart = "my_chart"
version = "1.2.3"
values = [
"${file("<path_to_values.yaml>")}"
]
}
- run `terraform apply` and you'll get your release deployed to your k8s cluster
Debugging
This is the part that can be really time-consuming, since the error message you get is usually generic (`failed to download`). @aareet said it very well here:
> This error message comes from the helm package and is a catch-all error that is displayed when the chart fails to download for any reason. The real error is suppressed and debugging has to be enabled to surface it.
To debug this kind of problem, the general direction is:
- decouple components to identify where the problem is
- use different options to narrow down the root cause
terraform `helm_release` vs `helm`
By translating the `helm_release` into a `helm` CLI command, you'll be able to identify where the problem is. For example, the `helm_release` above can be translated to: `helm install my-release oci://<ecr_aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com/my_chart --namespace my-release --create-namespace -f <path_to_values.yaml> --version 1.2.3`
You should use the same version of `helm` as the `helm` provider uses, and assume the same AWS role as the one specified in your `aws` provider (not included in the code snippets above).
If your issue is authentication/permission related (e.g., a “4xx response” error message):
- create an `output` variable for the `authorization_token`:
output "token" {
value = data.aws_ecr_authorization_token.token.authorization_token
sensitive = true
}
- run `terraform apply` (expecting the same error you got) and access the output via `terraform output --raw token`
- store a copy of that value in your local `helm` registry configuration:
# path (for helm@v3): ~/.config/helm/registry/config.json
{
  "auths": {
    "<ecr_aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com": {
      "auth": "<copy_of_token_output>"
    }
  }
}
- run the equivalent `helm install` command and see if you're able to get through without an error
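As a sanity check before pasting the value in: the `authorization_token` attribute is the base64 encoding of `AWS:<password>`, which is the same format the `auth` field expects. A quick sketch, assuming the `token` output defined above:

```sh
# decode the token and print the username part; it should print "AWS"
terraform output --raw token | base64 --decode | cut -d: -f1
```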
Need more context?
These options are not mutually exclusive:
- run `terraform` with `TF_LOG=DEBUG` (environment variable)
- run `terraform` with `HELM_DEBUG=1` (environment variable)
- run the `helm` command with `--debug`
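For example, combining the first and last options (the log file name is just an illustration):

```sh
# capture terraform's debug output on stderr for later inspection
TF_LOG=DEBUG terraform apply 2> tf-debug.log

# surface helm's real error on the equivalent CLI command
helm install my-release oci://<ecr_aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com/my_chart \
  --namespace my-release --create-namespace --version 1.2.3 --debug
```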
Depending on the information from the output, various fixes can be put in place. We’ll rinse and repeat until all the errors are corrected.
The `aws_ecr_authorization_token` data source needs an explicit definition of region and `provider` if the ECR registry lives in a different AWS account/region than the one your other resources are defined in, as sketched below.
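A minimal sketch, assuming a hypothetical `ecr` provider alias and a placeholder role name in the ECR account:

```hcl
# aliased provider pointing at the account/region where ECR lives
provider "aws" {
  alias  = "ecr"
  region = "<ecr_aws_region>"

  assume_role {
    # placeholder role with ECR read access in the ECR account
    role_arn = "arn:aws:iam::<ecr_aws_account_id>:role/<ecr_reader_role>"
  }
}

data "aws_ecr_authorization_token" "token" {
  provider    = aws.ecr
  registry_id = "<ecr_aws_account_id>"
}
```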
Conclusion
We did a high-level walkthrough of how to set up a `helm` release using `terraform` and a private OCI repository, along with some instructions for debugging the errors you may run into.
If you found this helpful, give it a clap; it would mean the world to me. Please share it with whoever needs it, and I'd appreciate it if you want to buy me a coffee.