Deploy a private helm chart with terraform

Xing Du
3 min read · Jul 15, 2023

I set up a helm release (from a private ECR repository) using the terraform helm provider today.

This process requires many components to be configured correctly, so I decided to write it down as a guide.

Debugging problems (yes, you will run into some) can be tricky, and I’m pretty happy with my approach to solving the only problem I ran into. I’m sharing it here as well, since I failed to google anything really useful.

Context

  • ECR is set up in its own AWS account (account_id=<ecr_aws_account_id>)
  • other resources are provisioned in environment-specific AWS accounts (account_id=<{dev|stg|prd}_aws_account_id>)
  • I’m using a parent AWS account’s role to assume the correct role for each resource (a sketch of this setup follows).
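
For illustration, a minimal sketch of what that looks like, assuming one aws provider per environment account (the role name is a hypothetical placeholder; yours will differ):

# terraform runs with the parent account's credentials and assumes a
# role in the target environment account
provider "aws" {
  region = "<aws_region>"

  assume_role {
    # hypothetical role name for illustration
    role_arn = "arn:aws:iam::<dev_aws_account_id>:role/<deployer_role>"
  }
}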

Versions involved

Pinning the versions I used; I don’t expect a huge discrepancy if more recent versions are used.

  • helm: 3.8.2
  • terraform: 1.4.4
  • terraform/helm: 2.10.1

Things to set up

ECR Repository

create the ECR repository with terraform if not already done (e.g., with terraform-aws-ecr)

need to specify the ARNs of identities (IAM user, role, etc.) for read-only and read-write permission (see the sketch after this list):

  • the identity used in the chart repo to push the chart requires read-write permission
  • the identity used by your k8s nodes requires read-only permission
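
A minimal sketch using the terraform-aws-modules/ecr module (the ARNs are placeholders; double-check the variable names against the module’s docs):

module "ecr" {
  source = "terraform-aws-modules/ecr/aws"

  repository_name = "my_chart"

  # identity pushing charts from the chart repo: read-write
  repository_read_write_access_arns = ["arn:aws:iam::<ecr_aws_account_id>:role/<chart_publisher_role>"]

  # identity used by the k8s nodes: read-only
  repository_read_access_arns = ["arn:aws:iam::<{dev|stg|prd}_aws_account_id>:role/<node_role>"]
}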

Chart repo

  • ECR authentication: aws ecr get-login-password --region ${ECR_AWS_REGION} | ${REGISTRY_LOGIN_CMD} --username AWS --password-stdin ${ECR_AWS_ACCOUNT}.dkr.ecr.${ECR_AWS_REGION}.amazonaws.com, where:
    ECR_AWS_REGION is the region of your destination ECR registry
    ECR_AWS_ACCOUNT is <ecr_aws_account_id>
    REGISTRY_LOGIN_CMD is docker login for docker images, or helm registry login for helm charts
  • helm package: package the chart into a semver-versioned .tgz file
  • helm push: push the packaged chart to the destination ECR repository (a full sketch follows this list).
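
Put together, the chart-repo side looks roughly like this (the chart name and version match the helm_release example later in this post):

# log helm in to the private ECR registry
aws ecr get-login-password --region ${ECR_AWS_REGION} | \
  helm registry login --username AWS --password-stdin ${ECR_AWS_ACCOUNT}.dkr.ecr.${ECR_AWS_REGION}.amazonaws.com

# package the chart; produces my_chart-1.2.3.tgz (version comes from Chart.yaml)
helm package ./my_chart

# push the packaged chart; helm appends the chart name to form the repository path
helm push my_chart-1.2.3.tgz oci://${ECR_AWS_ACCOUNT}.dkr.ecr.${ECR_AWS_REGION}.amazonaws.com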

K8s terraform repo

  • add the helm provider if not already done, including a registry block
data "aws_eks_cluster_auth" "this" {
name = "<cluster_id>"
}

data "aws_ecr_authorization_token" "token" {
registry_id = "<ecr_aws_account_id>"
}

provider "helm" {
kubernetes {
host = "<cluster_endpoint>"
cluster_ca_certificate = base64decode("<_cluster_certificate_authority_data>")
token = data.aws_eks_cluster_auth.this.token
}
registry {
url = "oci://<ecr_aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com"
username = data.aws_ecr_authorization_token.token.user_name
password = data.aws_ecr_authorization_token.token.password
}
}
  • define and configure the helm_release resource
resource "helm_release" "my_release" {
name = "my_release"
namespace = "my_release"
create_namespace = true

repository = "oci://<ecr_aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com"
chart = "my_chart"
version = "1.2.3"
values = [
"${file("<path_to_values.yaml>")}"
]
}
  • run terraform apply and you'll get your release deployed to your k8s cluster

Debugging

This is the part that can be really time-consuming, since the error message you get is usually generic ("failed to download"):

@aareet said it very well here:

This error message comes from the helm package and is a catch-all error that is displayed when the chart fails to download for any reason. The real error is suppressed and debugging has to be enabled to surface it.

To debug this kind of problem, the direction is:

  • decouple components to identify where the problem is
  • use different options to narrow down the root cause

terraform helm_release vs helm

By translating the helm_release into an equivalent helm CLI command, you can tell whether the problem lies on the terraform side or in helm itself.

For example, the helm_release above can be translated to: helm install my-release oci://<ecr_aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com/my_chart --namespace my-release --create-namespace -f <path_to_values.yaml> --version 1.2.3

Try to use the same version of helm as the helm provider uses, and assume the same AWS role specified in your aws provider (not included in the code snippet).

If your issue is authentication/permission related (e.g., a “4xx response” error message):

  • create an output variable for the authorization_token
output "token" {
value = data.aws_ecr_authorization_token.token.authorization_token
sensitive = true
}
  • run terraform apply (expecting the same error you got) and access the output via terraform output --raw token
  • store a copy of that value to your local helm registry configuration:
# path (for helm@v3): ~/.config/helm/registry/config.json
{
  "auths": {
    "<ecr_aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com": {
      "auth": "<copy_of_token_output>"
    }
  }
}
  • run the equivalent helm install command and see if you're able to pass through without an error
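
The authorization token is just base64("AWS:<password>"), so as a shortcut you can also feed it to helm registry login directly instead of editing config.json by hand (a sketch; base64 flags differ slightly between Linux and macOS):

# decode the token, strip the "AWS:" prefix, and pipe the password to helm
terraform output --raw token | base64 -d | cut -d: -f2- | \
  helm registry login --username AWS --password-stdin <ecr_aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com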

Need more context

These options are not mutually exclusive:

  • run terraform with TF_LOG=DEBUG (environment variable)
  • run terraform with HELM_DEBUG=1 (environment variable)
  • run the helm command with --debug (examples follow this list)
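
For example, using the release from earlier in this post:

# verbose terraform + provider logs
TF_LOG=DEBUG terraform apply

# helm's own debug output through the provider
HELM_DEBUG=1 terraform apply

# debug flag on the equivalent CLI command
helm install my-release oci://<ecr_aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com/my_chart --version 1.2.3 --debug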

Depending on the information in the output, various fixes can be put in place. Rinse and repeat until all the errors are resolved.

One gotcha: the aws_ecr_authorization_token data source needs an explicit region and provider if the ECR registry lives in a different AWS account/region than the one your other resources are defined in.
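
A sketch of what that looks like (the alias name is arbitrary; credentials depend on your setup):

# provider alias pointed at the account/region where ECR lives
provider "aws" {
  alias  = "ecr"
  region = "<ecr_aws_region>"
  # plus whatever assume_role/credentials that account requires
}

data "aws_ecr_authorization_token" "token" {
  provider    = aws.ecr
  registry_id = "<ecr_aws_account_id>"
}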

Conclusion

This was a high-level walkthrough of setting up a helm release with terraform from a private OCI repository, along with some guidance on debugging the errors you'll run into.

If you found this helpful, give it a clap; it would mean the world to me. Please share it with whoever needs it, and I’d appreciate it if you want to buy me a coffee.
