Scaling (legacy)
You are now on a GitGuardian legacy architecture page.
Looking for the GitGuardian new architecture Scaling page? Please visit the Scaling page.
For information on the new architecture, and to determine whether you are using the new or the legacy GitGuardian architecture, see the New GitGuardian Architecture page.
Application topology
The GitGuardian application consists of several Kubernetes resources.
On KOTS-based installations, you can configure replicas, as well as CPU and memory requests/limits, for the main deployments/pods.
For detailed insights into deployment/pod names, types, and their usage, visit the GitGuardian Application Topology page.
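If you want to see these resources directly on your cluster, you can list them with kubectl. This is a minimal sketch; it assumes the application is installed in a namespace named `gitguardian`, which may differ in your installation.

```bash
# List the GitGuardian deployments and their current replica counts
# (namespace "gitguardian" is an assumption; adjust to your install).
kubectl get deployments -n gitguardian

# List the pods backing those deployments.
kubectl get pods -n gitguardian
```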
Configure scaling settings
Navigate to Config > Advanced Options in the KOTS Admin Console to access the worker scaling options:
- Frontend: Scale `gitguardian-app` workers.
- Worker replicas: Scale `gitguardian-worker` workers.
- Scanning worker replicas: Scale `gitguardian-scanner` workers.
- Email worker replicas: Scale `gitguardian-email` workers.
Note: Changing these values doesn't affect the rollout upgrade strategy.
Note: Workers are configured to spread across nodes when multiple nodes are available. If you have configured your cluster for high availability, do not use fewer than 2 workers of each type.
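After saving new values in the KOTS Admin Console, you can confirm that the deployments were scaled as expected with kubectl. A minimal sketch, assuming the `gitguardian` namespace used above; the deployment names are those listed in the worker scaling options.

```bash
# Compare desired vs. ready replicas for a given deployment, e.g. the workers
# (namespace "gitguardian" is an assumption; adjust to your install).
kubectl get deployment gitguardian-worker -n gitguardian \
  -o jsonpath='{.spec.replicas} desired / {.status.readyReplicas} ready{"\n"}'

# In a multi-node cluster, check that worker pods are spread across nodes.
kubectl get pods -n gitguardian -o wide
```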
Scaling for historical scans of repositories up to 15GB in size
When you add a large number of sources, consider temporarily increasing the number of pods for the duration of the initial historical scan. Afterward, you can decrease those pods' replicas and resources.
When performing a historical scan, GitGuardian clones the git repository onto the pod's ephemeral storage and traverses all branches and commits in search of potential Policy Breaks. The full scan is performed by a single pod and can last from a few seconds to many hours depending on the repository size.
The more pods you add, the more historical scans can run concurrently. When sizing your nodes, keep in mind that each pod must have enough ephemeral storage and memory to run.
The following sizing has been tested for 7,000 repositories of up to 15 GB each, with 16 scanner pods:
| Component | Required Capacity | Count |
|---|---|---|
| Compute nodes | 8 vCPU, 64 GB RAM, 50 GB ephemeral disk space, 500 GB persistent disk space | 6 |
| PostgreSQL Master | 16 vCPU, 64 GB memory, 300 GB disk space | 1 |
| PostgreSQL Read Replica | 8 vCPU, 32 GB memory, 300 GB disk space | 1 |
| Scanner pods | Memory request and limit: 16 GB | 16 |
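Before launching the initial historical scans, you may want to sanity-check your nodes and scanner pods against this sizing. A rough sketch using standard kubectl commands, again assuming the `gitguardian` namespace and the `gitguardian-scanner` deployment name from the scaling options above:

```bash
# Check each node's allocatable CPU, memory and ephemeral storage.
kubectl describe nodes | grep -A 7 "Allocatable:"

# Confirm the memory request/limit actually applied to the scanner pods
# (namespace "gitguardian" is an assumption; adjust to your install).
kubectl get deployment gitguardian-scanner -n gitguardian \
  -o jsonpath='{.spec.template.spec.containers[*].resources}{"\n"}'
```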
Scaling real-time scans
Real-time scans are triggered by `Push` events sent by the VCS to GitGuardian. These scans usually complete in under a second and should always finish in under 3 seconds. To handle peaks of pushes, however, you may want to increase the number of worker pods that process real-time scans.
We successfully tested peaks of 2,000 pushes per hour with 8 worker pod replicas, without changing the default resource settings.
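As a rough sanity check on these figures, the arithmetic below shows the average load per worker pod implied by that peak, which leaves ample headroom for scans that finish in under 3 seconds. The kubectl command is a sketch assuming the `gitguardian` namespace used above.

```bash
# 2000 pushes/hour spread over 8 worker pods:
echo $(( 2000 / 8 )) "pushes per hour per worker pod"          # 250
echo $(( 3600 / (2000 / 8) )) "seconds between pushes per pod" # ~14

# During a push peak, watch the worker pods to confirm they keep up
# (namespace "gitguardian" is an assumption; adjust to your install).
kubectl get pods -n gitguardian --watch
```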