When talking with clients and colleagues, I’m always keen to suggest tools that may help make their lives easier. There is one tool I often recommend that many people have not heard of - the pre-commit framework. In this article we will take a look at this tool, how to use it and understand how it can help us.

The Challenge

When version controlling our code, we want to ensure that it is functional, secure, and aligned with coding and formatting standards. A common way of addressing this is to run checks as part of merge/pull request pipelines. Where code doesn’t comply, these checks can fail the pipeline, preventing it from making its way through to trunk branches. This is a great approach (and is still something you’d want in place regardless), but it does present two key challenges:

  • Speed - Waiting for a CI/CD agent to spin up, accept your job, perform tests and then fail wastes valuable time and results in slower feedback.
  • Security - Let’s say you inadvertently commit a secret - by the time the pipeline detects this, the commit already exists and has been pushed to the remote. Removing secrets (and other sensitive values) from Git history is possible, but is not completely straightforward.

Wouldn’t it be great if we could catch any problems before a commit actually takes place?


Pre-Commit Hooks

A pre-commit hook is a powerful feature that is natively built into Git. As the name suggests, it is an action that can be automatically executed before a commit takes place. Pre-commit hooks are one part of a larger “hook ecosystem” within Git, which allows custom scripts to be triggered at various points in the Git workflow.

These hooks reside in the .git/hooks directory of every Git repository and can be written in any scripting language, as long as they’re executable by the system. The main types of Git hooks are shown below, and can be either client-side or server-side (dependent on support within the platform):

  • Pre-commit - These run before a commit is created.
  • Prepare-commit-msg - These run before the commit message editor is fired up but after the default message is created.
  • Commit-msg - Used to validate the commit message.
  • Post-commit - These run after a commit is successfully created.
  • Pre-rebase - These run before you rebase anything.
  • Post-rewrite - These run after commands that replace commits, like git commit --amend and git rebase.
  • Pre-push - These run during git push, after the remote refs have been updated but before any objects have been transferred.
  • Post-checkout - These run after you check out a branch.
  • Post-merge - These run after a successful merge command.
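
To make the mechanism concrete, here is a minimal sketch of a commit-msg hook written in Python. Git passes the path of the file holding the proposed message as the first argument, and a non-zero exit code aborts the commit. The 72-character limit and the check_message function name are assumptions of mine for illustration, not rules that Git enforces.

```python
#!/usr/bin/env python3
"""Sketch of a commit-msg hook; save as .git/hooks/commit-msg and make it executable."""
import sys

MAX_SUBJECT_LENGTH = 72  # assumed team convention, not enforced by Git itself


def check_message(message: str) -> list[str]:
    """Return a list of problems found in the commit message (empty if none)."""
    problems = []
    # Ignore the comment lines Git may add to the message template.
    lines = [line for line in message.splitlines() if not line.startswith("#")]
    subject = lines[0].strip() if lines else ""
    if not subject:
        problems.append("subject line is empty")
    elif len(subject) > MAX_SUBJECT_LENGTH:
        problems.append(f"subject line exceeds {MAX_SUBJECT_LENGTH} characters")
    return problems


def main(argv: list[str]) -> int:
    # argv[1] is the path to the file containing the proposed commit message.
    with open(argv[1], encoding="utf-8") as handle:
        problems = check_message(handle.read())
    for problem in problems:
        print(f"commit-msg: {problem}", file=sys.stderr)
    return 1 if problems else 0  # a non-zero exit aborts the commit


# Only run when a message path has been supplied, as Git does for this hook.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

The validation lives in a plain function so it can be tested without Git; the hook itself is just the thin wrapper that reads the file and sets the exit code.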

We are going to be focussing on the pre-commit hooks. Some key points on them:

  1. Built-in functionality - Pre-commit hooks are a standard part of Git, not a third-party add-on.
  2. Local execution - They run on the developer’s local machine, though some platforms support server-side hooks as well.
  3. Customisable - Developers can create, modify, or disable hooks as needed.
  4. Versatile - They can be used for various tasks like code linting, running tests, or checking for sensitive data.

While pre-commit hooks are powerful on their own, managing them across a team or multiple projects can be challenging. This is where frameworks like the one we are discussing today come in. They offer a more structured and shareable approach to implementing these checks.


Pre-Commit Framework

The pre-commit framework is a Python tool that helps to simplify the management and execution of various Git hook scripts. It provides a structured way to define, share, and maintain these pre-commit hooks across projects and teams. Some reasons why I like the pre-commit framework:

  1. Ease of use - Pre-commit allows you to define hooks in a simple YAML configuration file, making them easy to set up and maintain, without necessarily having to write your own scripts.
  2. Language agnostic - It supports hooks written in any programming language, giving you flexibility in your tooling choices.
  3. Shareable configurations - The .pre-commit-config.yaml file (more on that shortly) can be version controlled and shared among team members, ensuring consistency across the project.
  4. Extensive ecosystem - There are lots of different pre-commit hooks available within the community allowing you to easily improve your code quality/security with minimal effort.
  5. Quick execution - Hooks run as you are creating local commits, so you don’t need to wait for CI/CD agents to spin up and execute.
  6. Selective execution - You can choose to run all hooks or only specific ones (or none at all), giving you control over the validation process. Equally, this is why it is not a replacement for pipeline validation - something we will cover shortly.

By using the pre-commit framework, teams can standardise their pre-commit checks, catch issues early in the development process, and maintain high code quality standards with minimal friction.


Installing Pre-Commit

Installing pre-commit is a breeze. Just ensure that you have a supported Python version and pip installed, then simply execute:

  • pip install pre-commit

Once installed, pre-commit can be easily invoked with the pre-commit command:

pre-commit -V
pre-commit 3.8.0

Creating A Configuration File

Pre-commit relies on the presence of a specially named file to determine what actions to take. The file name is .pre-commit-config.yaml and it should reside within the root of your repository. Below we can see an example of such a file that we may want to use with our Terraform projects:

repos:
- repo: https://github.com/antonbabenko/pre-commit-terraform
  rev: v1.96.1
  hooks:
    - id: terraform_fmt
    - id: terraform_validate
    - id: terraform_checkov

In the configuration file above, we are specifying that we want to use three checks (terraform_fmt, terraform_validate and terraform_checkov) from the pre-commit hooks that Anton Babenko has created. For stability, I’ve pinned it to a specific git tag. A branch reference such as master/main is also possible, but pre-commit warns about mutable references like these because they are not updated after the first install, so a tag or commit SHA is the safer choice.
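
To show the shape of the file more explicitly, the sketch below writes the same configuration as the Python structure a YAML parser would produce, then walks it. The MUTABLE_REVS set and both function names are assumptions for illustration; the check is a simple heuristic and not something the pre-commit tool exposes.

```python
# The configuration above, written as the structure a YAML parser would produce.
config = {
    "repos": [
        {
            "repo": "https://github.com/antonbabenko/pre-commit-terraform",
            "rev": "v1.96.1",
            "hooks": [
                {"id": "terraform_fmt"},
                {"id": "terraform_validate"},
                {"id": "terraform_checkov"},
            ],
        }
    ]
}

# Assumed list of common branch-style references; extend it to suit your repositories.
MUTABLE_REVS = {"main", "master", "HEAD", "develop"}


def unpinned_repos(config: dict) -> list[str]:
    """Return the URLs of repos whose rev looks like a moving reference."""
    return [entry["repo"] for entry in config["repos"] if entry["rev"] in MUTABLE_REVS]


def hook_ids(config: dict) -> list[str]:
    """Flatten the hook ids across every repo entry, in file order."""
    return [hook["id"] for entry in config["repos"] for hook in entry["hooks"]]
```

Each entry in the repos list names a source repository, the revision to use and the hooks to enable from it, so further checks are added by appending entries to that list.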


Installing the Pre-Commit Checks

Once we’ve defined our configuration file, we still aren’t actually using the checks yet. Predominantly for security reasons, we have to manually register these with the pre-commit tool. Why security reasons? Well, imagine if someone was able to just place a malicious .pre-commit-config.yaml into your repository and all your clients executed it automatically - it wouldn’t be great!

This in itself is one of the downsides to pre-commit - although you may define a configuration file in your repository, you can’t guarantee that developers are actually going to install and use the functionality. This is why it is complementary to pipeline checks, and not a replacement. There are, however, various ways this can be automated so that developers don’t have to remember to run the command, but they typically come at the cost of security, so the right choice will depend on your organisation, its tolerance to risk and what you want to achieve.

To install the checks, we simply have to run:

pre-commit install
pre-commit installed at .git/hooks/pre-commit

Depending on the checks you are using, you may require additional software to be installed. For example, with the ones I’m using in this example, it is not going to work if I don’t already have Terraform and Checkov installed.
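
A small preflight script can make both requirements visible before the first commit fails. The sketch below checks that a hook file exists at the path reported by pre-commit install and that the external tools are on the PATH. It assumes the default hooks location (no custom core.hooksPath and no linked worktree), and the function names are hypothetical.

```python
import os
import shutil
from pathlib import Path

REQUIRED_TOOLS = ["terraform", "checkov"]  # executables the example hooks rely on


def missing_tools(tools: list[str]) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def hook_installed(repo_root: str) -> bool:
    """True if an executable pre-commit hook file exists in the default hooks directory."""
    hook = Path(repo_root) / ".git" / "hooks" / "pre-commit"
    return hook.is_file() and os.access(hook, os.X_OK)


def preflight(repo_root: str) -> list[str]:
    """Collect human-readable problems; an empty list means everything was found."""
    problems = [f"{tool} is not on PATH" for tool in missing_tools(REQUIRED_TOOLS)]
    if not hook_installed(repo_root):
        problems.append("no pre-commit hook installed; run 'pre-commit install'")
    return problems
```

Note that this only confirms a hook file is present, not that it was written by the pre-commit framework, so treat it as a convenience rather than a guarantee.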

Now that we’ve installed our pre-commit checks, let’s see them in action.


Pre-Commit Usage

Once the configuration is registered with pre-commit install, there is nothing special you have to do - and this is the great thing about pre-commit hooks! Just by running a git commit command, you’re going to automatically be protected from your own mistakes!

Let’s say I create the following in my Terraform configuration file:

resource "azurerm_dns_a_record" "example" {
  name = "example"
  zone_name = azurerm_dns_zone.this.name
  resource_group_name = azurerm_resource_group.this.name
  ttl = 300
  target_resource_id = azurerm_static_web_app.this.id
}

Looks like valid code, but it hasn’t been nicely formatted with terraform fmt. Recall that fmt was one of the checks that we installed - what happens when we try to commit?

[Screenshot: Pre-Commit Fail]

We notice a few interesting things here:

  1. Terraform fmt “failed” - We were expecting this. We deliberately included some unformatted Terraform code, so it is right this check fails. One interesting thing to note however is that it says “files were modified by this hook”. The Terraform fmt command will have actually fixed our indentation issues so we don’t have to. However, because the file has now been modified again, we will have to manually run a new git add command to restage the changed file. Just keep this in mind.
  2. Checkov error - As mentioned, the checks you choose may have dependencies on other software. In this instance, I need checkov installed, but as you can see from the above, it isn’t.
  3. The commit did not take place - If we check the git log, the commit did not take place. When pre-commit checks are in use, the commit won’t occur should they fail. This is exactly what we want - it keeps our commit history cleaner, and prevents hard-to-remove secrets from making their way in there (which we aren’t checking for just yet).
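
The “files were modified by this hook” behaviour from the first point can be modelled in a few lines: record the file’s content, run the fixer, and report a failure if the content changed. This is a simplified model and not the framework’s actual implementation; strip_trailing_whitespace is a hypothetical stand-in for a real formatter such as terraform fmt.

```python
import hashlib
from pathlib import Path
from typing import Callable


def digest(path: Path) -> str:
    """Hash the file's bytes so that changes can be detected cheaply."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_fixer(path: Path, fixer: Callable[[str], str]) -> int:
    """Apply the fixer to the file; return 1 if it changed the file, otherwise 0."""
    before = digest(path)
    path.write_text(fixer(path.read_text()))
    return 1 if digest(path) != before else 0


def strip_trailing_whitespace(text: str) -> str:
    """Hypothetical formatter standing in for a real tool such as terraform fmt."""
    return "".join(line.rstrip() + "\n" for line in text.splitlines())
```

Running run_fixer twice on the same badly formatted file returns 1 and then 0, which mirrors the fail, git add, commit again cycle described above.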

Let’s restage our changed main.tf file, install checkov and try the commit again…

[Screenshot: Pre-Commit Pass]

Great - our checks passed and our commit took place!

Now let’s try adding an insecure resource configuration to see what the Checkov checks catch:

resource "azurerm_storage_account" "insecure_example" {
  name                          = "insecurestorageaccount"
  resource_group_name           = azurerm_resource_group.this.name
  location                      = azurerm_resource_group.this.location
  account_tier                  = "Standard"
  account_replication_type      = "LRS"
  public_network_access_enabled = true
  min_tls_version               = "TLS1_0"
}

Trying our commit again…

[Screenshot: Pre-Commit Fail]

This time we can see Terraform fmt and validate pass, but Checkov errors due to our insecurely configured Azure Storage Account. We aren’t going to get into the details of Checkov here (though likely will in a future article), but you can see how this is great for layering in fast checks and guardrails to your coding.
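
To give a flavour of the kind of rule involved, here is a toy policy check over the two settings I deliberately weakened in the resource above. It is not how Checkov is implemented, and the rule wording, the WEAK_TLS_VERSIONS set and the function name are my own assumptions.

```python
# The relevant attributes of the storage account above, as a plain dictionary.
storage_account = {
    "name": "insecurestorageaccount",
    "public_network_access_enabled": True,
    "min_tls_version": "TLS1_0",
}

WEAK_TLS_VERSIONS = {"TLS1_0", "TLS1_1"}  # assumed to be unacceptable for this sketch


def policy_violations(resource: dict) -> list[str]:
    """Return human-readable findings for two illustrative rules."""
    findings = []
    if resource.get("public_network_access_enabled") is True:
        findings.append("public network access is enabled")
    tls_version = resource.get("min_tls_version")
    if tls_version in WEAK_TLS_VERSIONS:
        findings.append(f"minimum TLS version is {tls_version}")
    return findings
```

A real policy tool ships a large library of rules like these and evaluates them against parsed Terraform, which is why it is worth using one rather than writing your own.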


Adding Some More Checks

With our pre-commit config, we aren’t limited to just pulling checks from a single repository. Let’s expand on our previous configuration file to also include some checks for private keys and secrets:

repos:
- repo: https://github.com/antonbabenko/pre-commit-terraform
  rev: v1.96.1
  hooks:
    - id: terraform_fmt
    - id: terraform_validate
    - id: terraform_checkov
- repo: https://github.com/Yelp/detect-secrets
  rev: v1.5.0
  hooks:
    - id: detect-secrets

Here we are adding a second entry to our “repos” list (denoted by the YAML dash - ). The name is pretty self-explanatory for this one - it tries to identify secrets such as high-entropy strings that are likely to be passwords, SSH private keys and so on. Let’s generate a private key and try to commit it…

[Screenshot: Pre-Commit Fail]

Perfect, it’s prevented us from doing something silly.
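
For a sense of what “high-entropy” means in practice, the sketch below computes the Shannon entropy of a string from its character frequencies and flags long, random-looking tokens. It illustrates the general idea only and is not detect-secrets’ code; the threshold and min_length values are arbitrary assumptions.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Bits of entropy per character, based on character frequencies."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(token: str, threshold: float = 4.0, min_length: int = 20) -> bool:
    """Flag tokens that are both long and drawn from many different characters."""
    return len(token) >= min_length and shannon_entropy(token) > threshold
```

Repetitive strings score close to zero while random-looking keys score much higher, which is why a fixed threshold catches many generated credentials but can also produce false positives on things like hashes.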


Conclusion

There are lots of different pre-commit checks available on the Internet, and you can always create your own if there is nothing that meets your needs. They are a powerful and versatile way of getting early feedback and preventing you from making mistakes. They are not a replacement for adequate checks in your automation pipelines, but they are an ideal complementary tool and one I’d recommend everyone take a look at.