Preventing Key Leaks In Git Commits

When building tools that authenticate against other APIs, more often than not I need to manage private keys and secrets. The challenge is that it’s very easy to forget that a key is sitting somewhere in a configuration file and accidentally check it in to the repository. With the proliferation of secret-scanning tools like trufflehog, that’s generally not a position you want to be in.

A lot of services are being proactive about it: when a leaked key is detected, it is automatically revoked (notice how someone attempted to use it within minutes of the leak). That said, this is still not super-common, and while the issue is being addressed from multiple angles, the easiest solution for me was to figure out a way to not accidentally commit security tokens to begin with.

Lucky for me, I learned about a concept called Git attributes. These are essentially settings that can be applied on a per-path basis within a repository. This is extremely helpful when one needs to manage check-out and commit strategies. Similarly, it’s extremely helpful when I need to filter content before committing it to a GitHub repository (or any Git repository, for that matter).

For this particular example, I have a .cfg file (structured like any INI file) that stores API keys for experimentation, and that I read from inside a Python application.

[Keys]
MyKey = 0X03919887491992D34D
MySecret = 0X043882894377S3CR37
MyToken = 70K3N93499993754010AD
MyTokenSecret = S3CR3770K3N48982773
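
For context, here is a minimal sketch of how a Python application might read this section with the standard configparser module. The load_keys helper and the SAMPLE_CFG constant are names I made up for illustration, and the sketch parses a string rather than a real file path.

```python
import configparser

# Placeholder contents mirroring the [Keys] section above; a real application
# would read the .cfg file from disk instead of using an inline string.
SAMPLE_CFG = """\
[Keys]
MyKey = 0X03919887491992D34D
MySecret = 0X043882894377S3CR37
MyToken = 70K3N93499993754010AD
MyTokenSecret = S3CR3770K3N48982773
"""


def load_keys(cfg_text: str) -> dict:
    """Return the options of the [Keys] section as a plain dictionary."""
    parser = configparser.ConfigParser(interpolation=None)
    # configparser lowercases option names by default; keep them as written.
    parser.optionxform = str
    parser.read_string(cfg_text)
    return dict(parser["Keys"])


keys = load_keys(SAMPLE_CFG)
print(sorted(keys))  # option names found in the [Keys] section
```

Once the values have been blanked out, the same code still parses the file, and each option simply comes back as an empty string.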

Before committing the code, I want to always remove these keys from the file. I could write a pre-commit hook, or I could create a custom filter.
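
For comparison, the pre-commit hook route could look roughly like the sketch below. This is not what I ended up using, and the function names are hypothetical. It only refuses the commit when a staged .cfg file still contains a non-empty value for one of the key names from the sample above; it does not rewrite anything.

```python
import re
import subprocess

# Option names treated as secrets; taken from the sample [Keys] section.
SECRET_NAMES = ("MyKey", "MySecret", "MyToken", "MyTokenSecret")
# Matches "Name = <something non-blank>" on a single line.
SECRET_LINE = re.compile(
    r"^[ \t]*(%s)[ \t]*=[ \t]*\S" % "|".join(SECRET_NAMES), re.MULTILINE
)


def find_secrets(text: str) -> list:
    """Return the names of secret options that still have a non-empty value."""
    return [match.group(1) for match in SECRET_LINE.finditer(text)]


def staged_cfg_files() -> list:
    """List staged .cfg paths that were added, copied, or modified."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [path for path in out.splitlines() if path.endswith(".cfg")]


def staged_content(path: str) -> str:
    """Read the staged (index) version of a file."""
    return subprocess.run(
        ["git", "show", ":" + path], capture_output=True, text=True, check=True
    ).stdout


def main() -> int:
    """Return 1 if any staged .cfg file still holds a secret, else 0."""
    failed = False
    for path in staged_cfg_files():
        names = find_secrets(staged_content(path))
        if names:
            failed = True
            print(f"{path}: remove values for {', '.join(names)} before committing")
    return 1 if failed else 0
```

To try it, the script would be saved as an executable .git/hooks/pre-commit file that ends with raise SystemExit(main()). A non-zero exit status from that hook makes Git abort the commit.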

To create a filter, I opened the .git folder, and added a new entry in the config file:

[filter "securekeys"]
	clean = "gsed -i \"s/\\(MyKey *= *\\).*/\\1/;s/\\(MySecret *= *\\).*/\\1/;s/\\(MyToken *= *\\).*/\\1/;s/\\(MyTokenSecret *= *\\).*/\\1/\" %f"

This defines a filter called securekeys. Its clean command is what Git runs when content is staged (i.e. with git add), for every file the filter is assigned to.

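Within the clean definition, I am using gsed (GNU sed, which I installed separately because macOS does not ship with it) to blank out the values that follow specific key names in the config file. Git substitutes %f with the path of the file being processed, and the -i flag makes sed rewrite that file in place. You can use any other tool or script here that you deem useful.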
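
As one example of a script alternative, here is a minimal Python sketch of the same substitution. Git’s documentation describes a clean filter as a command that receives the file content on standard input and writes the converted content to standard output, so this version is written in that stream-based form rather than editing the file in place. The names strip_secrets and main are hypothetical, and unlike the sed expression the pattern is anchored to the start of the line.

```python
import re
import sys

# Option names to blank out; taken from the sample [Keys] section.
SECRET_NAMES = ("MyKey", "MySecret", "MyToken", "MyTokenSecret")
# Group 1 captures "Name = " so that only the value after it is dropped.
PATTERN = re.compile(
    r"^((?:%s) *= *).*$" % "|".join(SECRET_NAMES), re.MULTILINE
)


def strip_secrets(text: str) -> str:
    """Keep 'Name = ' and drop whatever follows it on the same line."""
    return PATTERN.sub(r"\1", text)


def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Act as a clean filter: read from stdin, write cleaned text to stdout."""
    stdout.write(strip_secrets(stdin.read()))


# When saved as a standalone filter script, finish the file with: main()
```

Assuming the script is saved somewhere outside the repository and ends by calling main(), the filter entry would become something like clean = "python3 /path/to/strip_secrets.py" (a placeholder path). One difference from the in-place gsed command is that a stream-based filter does not touch the file on disk, so the keys stay in the working copy and only the staged content is cleaned.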

Last but not least, I went to .git/info and edited the attributes file to include this line:

*.cfg   filter=securekeys

This ensures that the securekeys filter is only executed for .cfg files that need to be cleaned. From now on, I don’t have to worry about accidentally committing my keys.

There is an important caveat to this method: every time a command like git add . is executed, the keys are removed from the configuration file itself, which is a bit of a pain if you need the tokens for local development. To work around this, I keep the production tokens in a separate file that is listed in .gitignore, so it never ends up in the repo. Whenever I need to resume local development, I just copy the tokens back from that file.
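
Copying the tokens back by hand gets old quickly, so the step could be scripted. The sketch below is one way to do it with configparser, under the assumption that the ignored file uses the same [Keys] layout as the configuration file. restore_keys is a hypothetical helper, and rewriting the file through configparser drops any comments it contained.

```python
import configparser
import io


def restore_keys(cfg_text: str, secrets_text: str, section: str = "Keys") -> str:
    """Copy every option of `section` from the secrets text into the config text."""
    cfg = configparser.ConfigParser(interpolation=None)
    secrets = configparser.ConfigParser(interpolation=None)
    for parser in (cfg, secrets):
        parser.optionxform = str  # preserve the original option-name casing
    cfg.read_string(cfg_text)
    secrets.read_string(secrets_text)

    if not cfg.has_section(section):
        cfg.add_section(section)
    for name, value in secrets[section].items():
        cfg.set(section, name, value)

    # Serialize the merged configuration back to INI text.
    buffer = io.StringIO()
    cfg.write(buffer)
    return buffer.getvalue()


cleaned = "[Keys]\nMyKey = \nMySecret = \n"
ignored = "[Keys]\nMyKey = 0X03919887491992D34D\nMySecret = 0X043882894377S3CR37\n"
print(restore_keys(cleaned, ignored))
```

In practice the two strings would come from reading the configuration file and the ignored file from disk, and the result would be written back over the configuration file.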

Be aware that tools like GitHub Desktop or Visual Studio Code will run git add . for you (notice how in those tools you only ever commit and pull, and never run git add . yourself). If you are running local tests, make sure to disable automatic file adds or close the tools in question, unless you want your tokens removed from the configuration file.