How To Generate JSON With Terraform Without Using Heredoc Strings

2023-01-08 · 4 min read · Darren Johnson

This is something I’ve only recently discovered but thought it was worth sharing here.

I have a few configurations where I need to pass JSON to Terraform to process. The resource where I do this most often is azurerm_virtual_machine_extension. The example in the HashiCorp documentation still uses heredoc strings, which is probably why I hadn’t picked up on the alternative until now.

I recently discovered a HashiCorp article where they tell you that by using the jsonencode and yamlencode functions “Terraform can be responsible for guaranteeing valid JSON or YAML syntax”. This sounded good to me as JSON syntax is not very forgiving, and it is designed to be read and processed by machines (not humans). If you put a comma or a bracket in the wrong place, you will soon be in a whole world of pain. The same is true for YAML and its indentation.

Let me show you an example of how I used to do this, and how I do it now.

Here is a block from a configuration I use to deploy a ‘Microsoft.Azure.Diagnostics’ extension. The JSON requires dynamic values derived from other parts of the configuration, in this case from a storage account data source, data.azurerm_storage_account.vm_diags.

  protected_settings = <<PROTECTED_SETTINGS
    {
      "storageAccountName": "${data.azurerm_storage_account.vm_diags.name}",
      "storageAccountKey": "${data.azurerm_storage_account.vm_diags.primary_access_key}"
    }
PROTECTED_SETTINGS

You can see that the JSON block uses interpolation syntax to evaluate the data source expressions and convert the results into strings. Every key and every string value has to be wrapped in double quotes, so the dynamic values can only be embedded as ${...} interpolation sequences, which adds extra complexity.

Conversely, with the jsonencode function you write standard HCL syntax. It needs neither interpolation sequences nor double-quoted keys, which makes it simpler to read and interpret.

protected_settings = jsonencode({
  storageAccountName = data.azurerm_storage_account.vm_diags.name,
  storageAccountKey  = data.azurerm_storage_account.vm_diags.primary_access_key
})
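To see what “guaranteeing valid JSON” means in practice, here is a small, self-contained sketch. The local names and the sample values are made up for illustration and are not part of the real configuration. The example key deliberately contains a double quote and a backslash: dropped into the heredoc version, those characters would produce invalid JSON, whereas jsonencode escapes them. Decoding the result again with jsondecode is one way to confirm that the value survives the round trip.

```hcl
# Hypothetical values for illustration only. In the real configuration
# these come from data.azurerm_storage_account.vm_diags.
locals {
  example_account_name = "examplevmdiags"
  example_account_key  = "abc\"def\\ghi=="

  # Same shape as the protected_settings expression above.
  protected_settings_json = jsonencode({
    storageAccountName = local.example_account_name
    storageAccountKey  = local.example_account_key
  })
}

# The encoded string, with the quote and backslash escaped by Terraform.
output "protected_settings_json" {
  value = local.protected_settings_json
}

# true when decoding the JSON gives back the original key unchanged.
output "key_survives_round_trip" {
  value = jsondecode(local.protected_settings_json).storageAccountKey == local.example_account_key
}
```

This needs no provider, so it should be possible to run it with terraform apply in an empty folder and inspect the two outputs.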

I don’t let Terraform create YAML files for me as I like to control the order of the entries within the file for readability reasons. I also think the Terraform syntax required to handle the indentation and lists is more complicated than the YAML syntax, which, for me, defeats the purpose of using Terraform to generate it.

Below is an edited example of an Azure DevOps Pipeline written in YAML to show the different indentation levels involved. You can see it is 20 lines long and is fairly simple to read and understand.

name: Run$(Rev:rr)
pool:
  name: Azure Pipelines
  vmImage: ubuntu-latest
trigger: none
variables:
  system.debug: false
stages:
  - stage: build_stage
    displayName: Build Stage
    jobs:
      - job: first_build_job
        displayName: First Build Job
        steps:
          - task: PublishPipelineArtifact@1
            displayName: Publish Pipeline Artifact
            inputs:
              targetPath: $(Build.ArtifactStagingDirectory)
              artifact: $(buildArtifactName)
              publishLocation: pipeline

Now compare this to the equivalent HCL syntax required to create a sample-pipeline.yaml file in the current folder. The content section of the file requires 32 lines of code and the syntax adds extra complexity.

resource "local_file" "sample_pipeline" {
  filename = "./sample-pipeline.yaml"
  content = yamlencode({
    "name" = "Run$(Rev:rr)"
    "pool" = {
      "name"    = "Azure Pipelines"
      "vmImage" = "ubuntu-latest"
    }
    "trigger" = "none"
    "variables" = {
      "system.debug" = false
    }
    "stages" = [
      {
        "stage"       = "build_stage"
        "displayName" = "Build Stage"
        "jobs" = [
          {
            "job"         = "first_build_job"
            "displayName" = "First Build Job"
            "steps" = [
              {
                "task"        = "PublishPipelineArtifact@1"
                "displayName" = "Publish Pipeline Artifact"
                "inputs" = {
                  "targetPath"      = "$(Build.ArtifactStagingDirectory)"
                  "artifact"        = "$(buildArtifactName)"
                  "publishLocation" = "pipeline"
                }
              },
            ]
          },
        ]
      },
    ]
  })
}

When rendering the YAML, Terraform also sorts the keys alphabetically and wraps every key and string value in double quotes, resulting in a sample-pipeline.yaml file that looks different:

"name": "Run$(Rev:rr)"
"pool":
  "name": "Azure Pipelines"
  "vmImage": "ubuntu-latest"
"stages":
- "displayName": "Build Stage"
  "jobs":
  - "displayName": "First Build Job"
    "job": "first_build_job"
    "steps":
    - "displayName": "Publish Pipeline Artifact"
      "inputs":
        "artifact": "$(buildArtifactName)"
        "publishLocation": "pipeline"
        "targetPath": "$(Build.ArtifactStagingDirectory)"
      "task": "PublishPipelineArtifact@1"
  "stage": "build_stage"
"trigger": "none"
"variables":
  "system.debug": false

For this reason I prefer to author my YAML files in VS Code and then use an online service such as YAML Lint to check the validity of the YAML file. If you do use YAML Lint, be sure to uncheck the Reformat (strips comments) option so your file order doesn’t change.
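If you would rather keep a validity check inside Terraform as well, one option is to read the hand-written file back with the built-in file and yamldecode functions. This is only a sketch: it assumes sample-pipeline.yaml sits next to the configuration, and the local and output names are made up for illustration. Terraform reports a parse error if the file is not valid YAML, and the file itself, including its key order and comments, is never rewritten. Bear in mind that yamldecode only supports a subset of YAML, so it does not replace a full linter.

```hcl
locals {
  # Fails with a parse error during terraform plan if the file is not valid YAML.
  # Assumes the hand-written pipeline file lives in the same folder as this configuration.
  pipeline = yamldecode(file("${path.module}/sample-pipeline.yaml"))
}

# Hypothetical output: lists the stage names found in the decoded file.
output "pipeline_stage_names" {
  value = [for s in local.pipeline.stages : s.stage]
}
```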

Key Takeaway: Use the Terraform jsonencode function to generate JSON where required. The yamlencode function, however, is not as well suited to generating YAML files that people need to read.