
How can it be solved with automated secrets detection?

Code reviews fail to reduce the risk of secrets exposure

Code reviews are great for catching logic flaws and enforcing good coding practices. But they are not an adequate way to detect secrets, for two main reasons:

  • Reviews generally consider only the net difference between the current and proposed states, not the entire history of changes. If one commit adds a secret and a later commit deletes it, the net effect is zero and reviewers have nothing to look at. But the secret is still in the history, so the vulnerability is still there.

  • Reviewers prefer to focus on errors that cannot be automatically detected, like design flaws. As a general principle, security automation should be implemented wherever it can be, so that humans focus on where they bring the most value.
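
The first point can be illustrated with a small sketch. It assumes a toy history of one file and a made-up token format (`EXAMPLEKEY_` followed by 16 hex characters); neither comes from a real provider or from any real scanner. It compares what a net diff shows with what a commit-by-commit scan shows.

```python
import difflib
import re

# Hypothetical token format, used for illustration only.
KEY_PATTERN = re.compile(r"EXAMPLEKEY_[0-9a-f]{16}")

# Toy history of a single file: the content after each commit.
history = [
    "host = 'db.internal'\n",                                         # base branch
    "host = 'db.internal'\nkey = 'EXAMPLEKEY_0123456789abcdef'\n",    # commit 1 adds a secret
    "host = 'db.internal'\n",                                         # commit 2 deletes it
]

def added_lines(before, after):
    """Return the lines that appear in `after` but not in `before`."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    return [line[2:] for line in diff if line.startswith("+ ")]

def secrets_in(lines):
    """Return every substring of `lines` that matches the token pattern."""
    return [m.group(0) for line in lines for m in KEY_PATTERN.finditer(line)]

# What a reviewer sees: the base state compared with the final state.
net_findings = secrets_in(added_lines(history[0], history[-1]))

# What a history-aware scan sees: the diff introduced by each commit.
history_findings = []
for before, after in zip(history, history[1:]):
    history_findings.extend(secrets_in(added_lines(before, after)))

print("net diff:", net_findings)
print("per-commit scan:", history_findings)
```

The net diff is empty because the first and last states are identical, while the per-commit scan still reports the key that was added in commit 1.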

Detection algorithms do it better

Detecting secrets in source code is like finding needles in a haystack: there are a lot more sticks than there are needles, and you don’t know how many needles might be in the haystack. In the case of secrets detection, you don’t even know what all the needles look like!

A high-performing automated detection system achieves two things:

  • A low number of false alerts. We call this high precision. Precision answers the question: "Of everything flagged as a secret, what percentage are actual secrets?" The question matters because security teams are often overwhelmed by too many alerts.
  • A low number of missed secrets. We call this high recall. Recall answers the question: "Of all the secrets that exist, what percentage did you detect?" Because a single undetected credential can have a big impact on an organization, some organizations prefer to triage more false alerts in exchange for not missing a secret.
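
As a worked example of the two definitions above, here is a minimal sketch. The function name and the counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Compute precision and recall from raw alert counts.

    true_positives:  alerts that were real secrets
    false_positives: alerts that were not secrets (false alerts)
    false_negatives: real secrets that raised no alert (missed secrets)
    """
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall

# Hypothetical scan: 100 alerts raised, 90 of them real, 30 real secrets missed.
precision, recall = precision_recall(
    true_positives=90, false_positives=10, false_negatives=30
)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

With these counts, 90 of the 100 alerts are real (precision 0.90) and 90 of the 120 existing secrets are found (recall 0.75).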

Capturing as many secrets as possible without raising too many false alerts is a difficult balancing act, and one that GitGuardian takes care of for its users. GitGuardian builds and maintains a secrets detection engine that covers more than 350 specific types of secrets, in addition to supporting generic and custom credentials.
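
To make the difference between specific and generic detection concrete, here is a rough sketch. It is not GitGuardian's engine. It assumes a made-up token format for the specific detector and uses a Shannon-entropy threshold, a common heuristic, for the generic one. All names and thresholds are hypothetical.

```python
import math
import re
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy of `text`, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical specific detector: a made-up token format.
SPECIFIC = re.compile(r"EXAMPLEKEY_[0-9a-f]{16}")
# Generic candidates: quoted strings of at least 16 token-like characters.
CANDIDATE = re.compile(r"""["']([A-Za-z0-9+/_\-]{16,})["']""")

def scan(line, entropy_threshold=3.5):
    """Return (kind, value) findings for one line of source code."""
    findings = [("specific", m.group(0)) for m in SPECIFIC.finditer(line)]
    for m in CANDIDATE.finditer(line):
        value = m.group(1)
        if SPECIFIC.fullmatch(value):
            continue  # already reported by the specific detector
        if shannon_entropy(value) >= entropy_threshold:
            findings.append(("generic", value))
    return findings

print(scan('key = "EXAMPLEKEY_0123456789abcdef"'))
print(scan('pw = "q8Zr2LxV7mNp4KtY9sBd"'))
print(scan('name = "aaaaaaaaaaaaaaaaaaaa"'))
```

In this sketch, `entropy_threshold` is the knob that trades precision for recall: lowering it flags more strings and misses fewer secrets but raises more false alerts, and raising it does the opposite.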