Scaling
Application topology
The GitGuardian application consists of several Kubernetes resources. Here are key aspects based on the installation type:
KOTS-based Installations: Allows configuration of replicas, CPU, and memory requests/limits for main deployments/pods.
Helm-based Installations: Provides more comprehensive customization, including:
- Creation of new classes of workers.
- Customization of other resource types such as ephemeral storage, huge pages, etc.
- Setting nodeSelector, tolerations, and additional configurations.
For detailed insights into deployment/pod names, types, and their usage, visit the GitGuardian Application Topology page.
Scaling for GitGuardian: Historical Scans, Real-Time Scans, Public API and ML Secret Engine
When using GitGuardian for monitoring repositories, it's crucial to scale the resources appropriately for historical scans, real-time scans, and public API requests. This ensures efficient and timely processing regardless of the load. These recommendations are only for existing cluster installations.
General Guidelines
When you add a high number of sources, consider temporarily increasing the number of pods for the time of the initial historical scan. Afterward, you can decrease those pods' replicas and resources.
When performing a historical scan, GitGuardian clones the git repository onto the pod's ephemeral storage and traverses all branches and commits in search of potential secrets. Each full scan is done by a single Pod and can last from a few seconds to many hours depending on the repository size. The more pods you add, the more historical scans can run concurrently. When sizing your nodes, keep in mind that each Pod must have enough ephemeral storage and memory to run. To improve performance and reduce scan times, use SSD disks for the ephemeral storage.
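As a rough illustration of why extra pods help only up to a point, the sketch below estimates the total wall-clock time for a batch of historical scans when each scan runs on exactly one pod. The scan durations are hypothetical, and the longest-first greedy assignment is an assumption made for the example, not a description of how GitGuardian actually schedules scans.

```python
import heapq


def estimate_makespan(scan_minutes, pod_count):
    """Return total wall-clock minutes when each scan runs on exactly one pod."""
    if pod_count < 1:
        raise ValueError("pod_count must be at least 1")
    # Each heap entry is the time at which one pod becomes free again.
    free_at = [0.0] * pod_count
    heapq.heapify(free_at)
    finish = 0.0
    # Assumption: longest scans are started first (simple greedy heuristic).
    for minutes in sorted(scan_minutes, reverse=True):
        start = heapq.heappop(free_at)
        end = start + minutes
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


# Hypothetical scan durations in minutes, for illustration only.
durations = [120, 45, 30, 30, 10, 5, 5, 1]

for pods in (1, 2, 4, 8):
    print(pods, "pods ->", estimate_makespan(durations, pods), "minutes")
```

With these made-up numbers, going from 1 to 2 pods roughly halves the total time, but beyond that the single 120-minute repository dominates, because one scan cannot be split across pods.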
Real-time scans are triggered by Push events sent by the VCS to GitGuardian. These scans typically complete in under a second and should always take less than 3 seconds. To handle peaks of pushes, you may want to increase the number of worker-worker Pods that process real-time scans.
The public API, used mainly by ggshield, is deployed under the webapp-public_api pod. This pod is essential for enabling interactions between ggshield and GitGuardian. To ensure the public API can handle the expected traffic, you may need to adjust the number of webapp-public_api Pods.
The webapp-internal_api pods handle internal requests for the Dashboard, while the webapp-internal_api_long pods manage longer-running operations, ensuring reliable performance and preventing timeouts during extended tasks.
Adjust the number of pods and node capacity according to the size and number of repositories, expected volume of push events, and expected volume of API requests to ensure efficient and effective scanning and interactions.
Small
Core System Components
For up to 2000 repositories, handling up to 500 pushes per hour, and up to 1000 API requests per hour:
Component | Required Capacity | Count |
---|---|---|
Kubernetes compute nodes | 4 vCPU 16 GB memory 50 GB ephemeral disk space, 10 GB persistent disk space | 3 |
PostgreSQL Master | 4 vCPU 8 GB memory 200 GB disk space | 1 |
PostgreSQL Read Replica | 2 vCPU 4 GB memory 200 GB disk space | 1 |
Redis | 2 vCPU 2 GB memory 20 GB disk space | 1 |
Total | 20 vCPU 62 GB memory 150 GB ephemeral disk space, 450 GB persistent disk space | 6 |
If you plan to use global ephemeral storage, add 20 GB to the persistent disk space on each of your Kubernetes compute nodes.
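The Total row can be reproduced with a small calculation. The sketch below uses the Small figures from the table above and assumes that the PostgreSQL and Redis disk space counts as persistent storage, which is what makes the totals add up. The `Component` class and `totals` function are hypothetical helpers written for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    vcpu: int
    memory_gb: int
    ephemeral_gb: int
    persistent_gb: int
    count: int = 1


def totals(components):
    """Return (vCPU, memory GB, ephemeral GB, persistent GB, machine count)."""
    return (
        sum(c.vcpu * c.count for c in components),
        sum(c.memory_gb * c.count for c in components),
        sum(c.ephemeral_gb * c.count for c in components),
        sum(c.persistent_gb * c.count for c in components),
        sum(c.count for c in components),
    )


SMALL = [
    Component(4, 16, 50, 10, count=3),  # Kubernetes compute nodes
    Component(4, 8, 0, 200),            # PostgreSQL Master
    Component(2, 4, 0, 200),            # PostgreSQL Read Replica
    Component(2, 2, 0, 20),             # Redis
]

print(totals(SMALL))  # (20, 62, 150, 450, 6)
```

The same function can be fed the Medium and Large figures to check their Total rows, or adjusted figures if you plan to add global ephemeral storage.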
Historical Scans (up to 5GB in size)
Component | Required Capacity | Count |
---|---|---|
worker-scanners Pods | Memory request and limit: 6 GB | 4 |
Real-Time Scans (up to 500 pushes/h)
Component | Required Capacity | Count |
---|---|---|
worker-worker Pods | Default resource settings | 2 |
Public API (up to 1k requests/h)
Component | Required Capacity | Count |
---|---|---|
webapp-public_api Pods | Default resource settings | 2 |
nginx (dashboard and API) Pods | Default resource settings | 2 |
Dashboard (up to 200 active users)
Component | Required Capacity | Count |
---|---|---|
webapp-internal_api Pods | Default resource settings | 2 |
webapp-internal_api_long Pods | Default resource settings | 2 |
Machine learning (up to 10k events/h)
Component | Required Capacity | Count |
---|---|---|
ml-secret-engine Pods | Default resource settings | 1 |
worker-ml-api-priority Pods | Default resource settings | 1 |
Medium
Core System Components
For up to 10000 repositories, handling up to 1000 pushes per hour, and up to 25000 API requests per hour:
Component | Required Capacity | Count |
---|---|---|
Kubernetes compute nodes | 8 vCPU 32 GB memory 50 GB ephemeral disk space, 10 GB persistent disk space | 5 |
PostgreSQL Master | 8 vCPU 32 GB memory 250 GB disk space | 1 |
PostgreSQL Read Replica | 4 vCPU 16 GB memory 250 GB disk space | 1 |
Redis | 4 vCPU 8 GB memory 40 GB disk space | 1 |
Total | 56 vCPU 216 GB memory 250 GB ephemeral disk space, 590 GB persistent disk space | 8 |
If you plan to use global ephemeral storage, add 120 GB to the persistent disk space on each of your Kubernetes compute nodes.
Historical Scans (up to 10GB in size)
Component | Required Capacity | Count |
---|---|---|
worker-scanners Pods | Memory request and limit: 11 GB | 12 |
Real-Time Scans (up to 1k pushes/h)
Component | Required Capacity | Count |
---|---|---|
worker-worker Pods | Default resource settings | 4 |
Public API (up to 25k requests/h)
Component | Required Capacity | Count |
---|---|---|
webapp-public_api Pods | Default resource settings | 4 |
nginx (dashboard and API) Pods | Default resource settings | 2 |
Dashboard (up to 500 active users)
Component | Required Capacity | Count |
---|---|---|
webapp-internal_api Pods | Default resource settings | 4 |
webapp-internal_api_long Pods | Default resource settings | 2 |
Machine learning (up to 10k events/h)
Component | Required Capacity | Count |
---|---|---|
ml-secret-engine Pods | Default resource settings | 1 |
worker-ml-api-priority Pods | Default resource settings | 2 |
Large
Core System Components
For up to 40000 repositories, handling up to 2000 pushes per hour, and up to 50000 API requests per hour:
Component | Required Capacity | Count |
---|---|---|
Kubernetes compute nodes | 8 vCPU 64 GB memory 50 GB ephemeral disk space, 10 GB persistent disk space | 7 |
PostgreSQL Master | 16 vCPU 64 GB memory 300 GB disk space | 1 |
PostgreSQL Read Replica | 8 vCPU 32 GB memory 300 GB disk space | 1 |
Redis | 8 vCPU 16 GB memory 100 GB disk space | 1 |
Total | 88 vCPU 560 GB memory 350 GB ephemeral disk space, 770 GB persistent disk space | 10 |
If you plan to use global ephemeral storage, add 160 GB to the persistent disk space on each of your Kubernetes compute nodes.
Historical Scans (up to 15GB in size)
Component | Required Capacity | Count |
---|---|---|
worker-scanners Pods | Memory request and limit: 16 GB | 16 |
Real-Time Scans (up to 2k pushes/h)
Component | Required Capacity | Count |
---|---|---|
worker-worker Pods | Default resource settings | 8 |
Public API (up to 50k requests/h)
Component | Required Capacity | Count |
---|---|---|
webapp-public_api Pods | Default resource settings | 6 |
nginx (dashboard and API) Pods | Default resource settings | 2 |
Dashboard (up to 1k active users)
Component | Required Capacity | Count |
---|---|---|
webapp-internal_api Pods | Default resource settings | 6 |
webapp-internal_api_long Pods | Default resource settings | 2 |
Machine learning (up to 20k events/h)
Component | Required Capacity | Count |
---|---|---|
ml-secret-engine Pods | Default resource settings | 2 |
worker-ml-api-priority Pods | Default resource settings | 4 |
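To decide which of the three profiles to start from, one option is to pick the smallest profile whose stated limits cover all of your expected volumes. The sketch below encodes the repository, push, and API request limits quoted at the top of each profile. The `pick_tier` helper is hypothetical, and treating the three limits as independent hard thresholds is a simplifying assumption.

```python
# (name, max repositories, max pushes per hour, max API requests per hour)
TIERS = [
    ("Small", 2_000, 500, 1_000),
    ("Medium", 10_000, 1_000, 25_000),
    ("Large", 40_000, 2_000, 50_000),
]


def pick_tier(repositories, pushes_per_hour, api_requests_per_hour):
    """Return the smallest profile covering all three volumes, or None."""
    for name, max_repos, max_pushes, max_requests in TIERS:
        if (
            repositories <= max_repos
            and pushes_per_hour <= max_pushes
            and api_requests_per_hour <= max_requests
        ):
            return name
    return None  # Beyond the Large profile: sizing needs a dedicated review.


print(pick_tier(1_500, 400, 5_000))  # Medium, because of the API volume
```

A workload that fits Small on repositories and pushes can still need the Medium profile if only its API traffic is higher, as in the usage line above.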
Configure scaling settings
Autoscaling
As of version 2024.9.0, the GitGuardian application supports Kubernetes autoscaling 🚀.
Autoscaling allows for dynamic scaling of worker pods based on Celery task queue length as an external metric for scaling decisions, improving efficiency and performance while optimizing resource costs.
To enable autoscaling based on Celery queue lengths, first enable application metrics following this guide. Then set up external metrics that Kubernetes can use to make scaling decisions, which involves configuring the Prometheus adapter to expose the length of Celery queues as external metrics. To achieve this, you must:
- Install the Prometheus adapter: it is required to expose external metrics in Kubernetes. You can install it using Helm. Ensure it is configured to connect to your Prometheus server.
- Configure the Prometheus adapter to expose Celery queue lengths as external metrics. This is done by setting up a custom rule in the Prometheus adapter configuration.
The following rule should be added to your Prometheus Adapter Helm values to expose Celery queue lengths:
rules:
  external:
    - seriesQuery: '{__name__="gim_celery_queue_length",queue_name!=""}'
      metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (queue_name)
      resources:
        namespaced: true
        overrides:
          namespace:
            resource: namespace
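To make the `metricsQuery` template easier to read, the sketch below mimics how the placeholders are filled in: `<<.Series>>` becomes the metric name and `<<.LabelMatchers>>` becomes a comma-separated list of label matchers. This is a plain string substitution written for illustration, not the adapter's actual code. The namespace value `gitguardian` is an assumption, and the order in which the real adapter emits matchers may differ.

```python
def render_metrics_query(template, series, label_matchers):
    """Fill the two placeholders used by the rule above (illustrative only)."""
    matchers = ",".join(f'{key}="{value}"' for key, value in label_matchers.items())
    return template.replace("<<.Series>>", series).replace(
        "<<.LabelMatchers>>", matchers
    )


template = "sum(<<.Series>>{<<.LabelMatchers>>}) by (queue_name)"

query = render_metrics_query(
    template,
    series="gim_celery_queue_length",
    # Hypothetical label values: one queue in an assumed "gitguardian" namespace.
    label_matchers={"queue_name": "realtime_ods", "namespace": "gitguardian"},
)
print(query)
```

The resulting PromQL expression sums the queue length series for the selected queue, which is the value Kubernetes then compares against the scaling target.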
If you use Machine Learning, you will also need this rule:
rules:
  external:
    - seriesQuery: '{__name__="bentoml_service_request_in_progress",exported_endpoint!=""}'
      resources:
        namespaced: false
      metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)
Autoscaling with KEDA (Only for Helm)
As an alternative to the Prometheus adapter, you can use the KEDA controller to enable autoscaling. You can install it using Helm. As a prerequisite, you need to enable application metrics following this guide.
You must configure your Helm values to allow KEDA to connect to your Prometheus server:
autoscaling:
  keda:
    prometheus:
      metadata:
        serverAddress: http://<prometheus-host>:9090
        # Optional. Custom headers to include in query
        customHeaders: X-Client-Id=cid,X-Tenant-Id=tid,X-Organization-Id=oid
        # Optional. Specify authentication mode (basic, bearer, tls)
        authModes: bearer
      # Optional. Specify TriggerAuthentication resource to use when authModes is specified.
      authenticationRef:
        name: keda-prom-creds
A ScaledObject and an HPA will be created in the GitGuardian namespace.
Autoscaling Behavior
The following behavior will be applied:
- Scaling Up: If the length of a Celery queue exceeds 10 tasks per current worker replica, the number of replicas will be increased, provided the current number of replicas is below the specified maximum limit.
- Scaling Down: If the number of tasks per current worker replica remains below 10 for a continuous period of 5 minutes, the number of replicas will be decreased, provided the current number of replicas is above the specified minimum limit.
Using KEDA, when the Celery queue is empty, the worker will transition to an idle state, resulting in the number of replicas being scaled down to zero.
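The rules above can be approximated with a short calculation. The sketch below assumes the replica count is derived as the queue length divided by the target of 10 tasks per replica, rounded up, then clamped to the configured minimum and maximum. That formula is the usual Kubernetes HPA behavior for an average-value target, but its use here is an assumption. The sketch also ignores the 5-minute scale-down window, and `desired_replicas` is a hypothetical helper.

```python
import math


def desired_replicas(queue_length, min_replicas, max_replicas, target_per_replica=10):
    """Approximate replica count for a given Celery queue length."""
    # Enough replicas so that each one handles at most `target_per_replica` tasks.
    wanted = math.ceil(queue_length / target_per_replica)
    # Never go below the minimum or above the maximum.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(45, min_replicas=1, max_replicas=10))   # 5
print(desired_replicas(500, min_replicas=1, max_replicas=10))  # 10
print(desired_replicas(0, min_replicas=1, max_replicas=10))    # 1
print(desired_replicas(0, min_replicas=0, max_replicas=10))    # 0, as with KEDA
```

The last call models the KEDA case, where an empty queue lets the worker scale down to zero replicas.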
KOTS-based installation
Ensure that you update the Kubernetes Application RBAC by adding the patch permission to the servicemonitors resource.
Navigate to Config > Scaling in the KOTS Admin Console to access the worker scaling options.
- Front replicas: Scale nginx pods.
- API replicas: Scale api pods.
- Workers replicas: Scale workers pods (including the scanners pods).
Changing these values doesn't affect the rollout upgrade strategy.
Workers are configured to spread across nodes if there are multiple nodes.
If you have configured your cluster for high availability, do not use fewer than 2 workers of each type.
For each worker, you can enable autoscaling by ticking the option Enable Horizontal Pod Autoscaling, then you will be able to specify the minimum and the maximum replicas.
By default, on a fresh install, tasks for Productivity tools (such as Slack, Jira Cloud, Confluence, ... starting with ods in the name) are assigned to generic scanners and workers:
- The ods_scan queue is assigned to generic scanners.
- The realtime_ods and realtime_retry_ods queues are assigned to generic workers.
You have the option to use dedicated workers for these features if you need to scale up those tasks.
Figure: example with a productivity tool worker in the KOTS Admin Console.
Helm-based installation
Customize Helm applications using your local-values.yaml file, submitted with the helm command.
Configure deployments with replicas: use webapps.[name].replicas for web pods, celeryWorkers.[name].replicas for async workers and secretEngine.replicas for Machine Learning Secret Engine. Additionally, set resources requests and limits as needed.
Example
migration:
  # Set resources for pre-deploy and post-deploy jobs
  resources:
    limits:
      cpu: 1000m
      memory: 500Mi
front:
  nginx:
    # Set resources for nginx init containers
    init:
      resources:
        limits:
          cpu: 1000m
          memory: 500Mi
    replicas: 2
    resources:
      limits:
        memory: 1Gi
webapps:
  public_api:
    replicas: 5
    resources:
      requests:
        cpu: 200m
        memory: 500Mi
      limits:
        memory: 4Gi
celeryWorkers:
  scanners:
    replicas: 8
    resources:
      requests:
        cpu: 200m
        memory: 4Gi
      limits:
        memory: 16Gi
secretEngine:
  replicas: 2
See the values reference documentation for further details.
For optimal performance, consider scaling the following pods to a minimum of 2 replicas each: webapp-hook, webapp-internal-api-long, webapp-public-api, worker-email, worker-long-tasks, and worker-worker.
You can enable workers autoscaling by setting the following Helm values:
celeryWorkers:
  worker:
    autoscaling:
      hpa:
        enabled: true
      keda:
        enabled: false
      minReplicas: 1
      maxReplicas: 10
You can enable Machine Learning Secret Engine autoscaling by setting the following Helm values:
secretEngine:
  autoscaling:
    hpa:
      enabled: true
    keda:
      enabled: false
    minReplicas: 1
    maxReplicas: 2
The autoscaling.hpa.enabled and autoscaling.keda.enabled Helm parameters are mutually exclusive: you must choose between HPA (using the Prometheus adapter) and the KEDA controller.
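Since the two options cannot be combined, a quick check before deploying can catch a values file that enables both. The sketch below walks a values structure, represented as a plain Python dictionary in the same shape as the Helm values shown above, and reports every `autoscaling` block where both `hpa.enabled` and `keda.enabled` are true. The `find_autoscaling_conflicts` helper is hypothetical and not part of the GitGuardian chart.

```python
def find_autoscaling_conflicts(values, path=()):
    """Return dotted paths of autoscaling blocks with both hpa and keda enabled."""
    conflicts = []
    if not isinstance(values, dict):
        return conflicts
    autoscaling = values.get("autoscaling")
    if isinstance(autoscaling, dict):
        hpa = autoscaling.get("hpa")
        keda = autoscaling.get("keda")
        hpa_on = isinstance(hpa, dict) and hpa.get("enabled") is True
        keda_on = isinstance(keda, dict) and keda.get("enabled") is True
        if hpa_on and keda_on:
            conflicts.append(".".join(path + ("autoscaling",)))
    # Recurse into nested sections such as celeryWorkers.<name>.
    for key, child in values.items():
        conflicts.extend(find_autoscaling_conflicts(child, path + (key,)))
    return conflicts


# Hypothetical values: the worker block is valid, the scanners block is not.
values = {
    "celeryWorkers": {
        "worker": {"autoscaling": {"hpa": {"enabled": True}, "keda": {"enabled": False}}},
        "scanners": {"autoscaling": {"hpa": {"enabled": True}, "keda": {"enabled": True}}},
    },
    "secretEngine": {"autoscaling": {"hpa": {"enabled": False}, "keda": {"enabled": True}}},
}

print(find_autoscaling_conflicts(values))  # ['celeryWorkers.scanners.autoscaling']
```

In practice the dictionary would come from parsing your local values file with a YAML library of your choice.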
Additional ephemeral storage tuning
Only available for Helm-based installations.
In certain scenarios, optimizing ephemeral storage configurations becomes essential for achieving better performance and stability, particularly for scanners workers. This section outlines additional configurations for fine-tuning ephemeral storage, focusing on leveraging "On Demand" nodes with nvme disks and integrating Generic Ephemeral Inline Volumes.
"On Demand" nodes with nvme disks
In the following example, we specify that scanners workers only use "On Demand" VMs with nvme disks and that the pods' ephemeral storage will use these disks:
celeryWorkers:
  scanners:
    replicas: 8
    localStoragePath: /nvme/disk # Used for pods ephemeral storage
    nodeSelector: # Must run on "On Demand" nodes with nvme disks
      eks.amazonaws.com/capacityType: ON_DEMAND
      local-nvme-ready: 'true'
    tolerations:
      - key: worker-highdisk
        operator: Equal
        value: 'true'
        effect: NoSchedule
    resources:
      requests:
        cpu: 200m
        memory: 16Gi
      limits:
        memory: 16Gi
Generic Ephemeral Inline Volumes
In the following example, we leverage Kubernetes' Generic Ephemeral Inline Volumes within Helm charts. This feature provides dynamic provisioning and reclamation of storage, which is particularly beneficial when the limits on ephemeral storage are small. Note that it is supported starting with Kubernetes 1.23.
celeryWorkers:
  scanners:
    replicas: 8
    resources:
      requests:
        cpu: 200m
        memory: 4Gi
      limits:
        memory: 16Gi
    # -- Worker ephemeral storage
    ephemeralStorage:
      enabled: true
      size: 2Gi
Node Affinity Scheduling
Use the nodeSelector parameter in Helm values to schedule worker pods on specific nodes, ensuring they run in designated zones or meet specific criteria.
celeryWorkers:
  long:
    nodeSelector:
      topology.kubernetes.io/zone: eu-central-1c
  scanners:
    nodeSelector:
      topology.kubernetes.io/zone: eu-central-1c
  worker:
    nodeSelector:
      topology.kubernetes.io/zone: eu-central-1c