Scaling
Looking for the GitGuardian legacy application Scaling page? Please visit the Scaling (legacy) page.
For information on the new application, as well as determining whether you are using the New or Legacy GitGuardian application, explore the New GitGuardian Architecture page.
Application topology
The GitGuardian application consists of several Kubernetes resources.
Helm-based installation facilitates the configuration of all deployments, while also offering enhanced customization options for the following:
- Creation of new classes of workers
- Customization of other resource types such as ephemeral storage, huge pages, etc.
- Provision of nodeSelector, toleration, and more
| Kind | Deployment Name | Usage |
|---|---|---|
| Front | nginx | Dashboard frontend and proxy for the backend |
| Backend | app_exporter | (optional) Open Metrics exporter for applicative metrics |
| Backend | hook | VCS webhook events receiver |
| Backend | internal_api | Backend for the Dashboard (previously gitguardian-app in legacy) |
| Backend | internal_api_long | Backend for the Dashboard (no timeout) |
| Backend | public_api | Public API and GGshield scans |
| Scheduler | beat | Celery Beat task scheduler |
| Worker | email | Workers for queues: email, notifier |
| Worker | long | Workers for long tasks: check/install health, asynchronous cleanup tasks, ... |
| Worker | scanners | Workers for historical scans |
| Worker | worker | Workers for queues: celery (default), check_run, realtime, realtime_retry |
| Job | pre-deploy | Pre-deployment job performing database migrations |
| Job | post-deploy | Post-deployment job performing long data migrations |
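As an illustrative sketch of the customization options listed above, the values below pin the `internal_api` backend to dedicated nodes. This assumes `webapps.[name]` accepts the same `nodeSelector` and `tolerations` keys shown for workers later on this page; the node label and taint names are hypothetical:

```yaml
webapps:
  internal_api:
    replicas: 3
    nodeSelector:                     # assumption: same key as for celeryWorkers
      workload: gitguardian-backend   # hypothetical node label
    tolerations:
      - key: gitguardian-backend      # hypothetical taint
        operator: Equal
        value: 'true'
        effect: NoSchedule
```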
Configure scaling settings
Each deployment can be configured using the `replicas` property. For web pods, use the `webapps.[name].replicas` property; for async workers, use `celeryWorkers.[name].replicas`.
You can also configure `resources` requests and limits.
Example
```yaml
migration:
  # Set resources for pre-deploy and post-deploy jobs
  resources:
    limits:
      cpu: 1000m
      memory: 500Mi
front:
  nginx:
    # Set resources for nginx init containers
    init:
      resources:
        limits:
          cpu: 1000m
          memory: 500Mi
    replicas: 2
    resources:
      limits:
        memory: 1Gi
webapps:
  public_api:
    replicas: 5
    resources:
      requests:
        cpu: 200m
        memory: 500Mi
      limits:
        memory: 4Gi
celeryWorkers:
  scanners:
    replicas: 8
    resources:
      requests:
        cpu: 200m
        memory: 4Gi
      limits:
        memory: 16Gi
```
Scaling for historical scans of repositories up to 15GB in size
When you add a high number of sources, consider temporarily increasing the number of pods for the duration of the initial historical scan. Afterward, you can scale those pods' replicas and resources back down.
When performing a historical scan, GitGuardian clones the git repository onto the pod's ephemeral storage and traverses all branches and commits in search of potential Policy Breaks. The full scan is done by a single pod and can last from a few seconds to many hours, depending on the repository size.
The more pods you add, the more historical scans can run concurrently. When sizing your nodes, keep in mind that each pod must have enough ephemeral storage and memory to run.
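As a sketch, a temporary scale-up of the `scanners` workers for the initial scan could look like the values below (the replica count and memory figures are placeholders; adjust them to your cluster capacity):

```yaml
celeryWorkers:
  scanners:
    replicas: 16   # temporary bump for the initial historical scan
    resources:
      requests:
        memory: 16Gi
      limits:
        memory: 16Gi
```

Once the initial scans complete, lower `replicas` again and apply the change with a standard Helm upgrade.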
The following sizing has been tested for 7000 repositories up to 15GB with 16 pods:
| Component | Required Capacity | Count |
|---|---|---|
| Compute nodes | 8 vCPU, 64 GB RAM, 50 GB ephemeral disk space, 500 GB persistent disk space | 6 |
| PostgreSQL Master | 16 vCPU, 64 GB memory, 300 GB disk space | 1 |
| PostgreSQL Read Replica | 8 vCPU, 32 GB memory, 300 GB disk space | 1 |
| scanners pods | Memory request and limit: 16 GB | 16 |
Additional tuning
On Helm-based installations, additional configuration of the `scanners` workers can bring more performance and stability.
In the following example, we specify that `scanners` workers only run on "On Demand" VMs with NVMe disks, and that the pods' ephemeral storage uses these disks:
```yaml
celeryWorkers:
  scanners:
    replicas: 16
    localStoragePath: /nvme/disk # Used for pods' ephemeral storage
    nodeSelector: # Must run on "On Demand" nodes with NVMe disks
      eks.amazonaws.com/capacityType: ON_DEMAND
      local-nvme-ready: 'true'
    tolerations:
      - key: worker-highdisk
        operator: Equal
        value: 'true'
        effect: NoSchedule
    resources:
      requests:
        cpu: 200m
        memory: 16Gi
      limits:
        memory: 16Gi
```
Scaling real-time scans
Real-time scans are triggered by `Push` events sent by the VCS to GitGuardian. These scans usually complete in under a second and should always take less than 3 seconds. To handle peaks of pushes, however, you may want to increase the number of `worker` pods that process real-time scans.
We successfully tested peaks of 2000 pushes per hour with 8 `worker` pod replicas, without changing the default resources settings.
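As a sketch, the tested setup above maps to the following values, using the `worker` deployment from the topology table (default resources are left untouched):

```yaml
celeryWorkers:
  worker:
    replicas: 8   # tested with peaks of 2000 pushes per hour
```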