
Scaling

Application topology

The GitGuardian application consists of several Kubernetes resources. The available configuration depends on the installation type:

KOTS-based Installations: Allows configuration of replicas, CPU, and memory requests/limits for main deployments/pods.

Helm-based Installations: Provides more comprehensive customization, including:

  • Creation of new classes of workers.
  • Customization of other resource types such as ephemeral storage, huge pages, etc.
  • Setting nodeSelector, tolerations, and additional configurations.

For detailed insights into deployment/pod names, types, and their usage, visit the GitGuardian Application Topology page.

Scaling for GitGuardian: Historical Scans, Real-Time Scans, Public API and ML Secret Engine

When using GitGuardian for monitoring repositories, it's crucial to scale the resources appropriately for historical scans, real-time scans, and public API requests. This ensures efficient and timely processing regardless of the load. These recommendations are only for existing cluster installations.

General Guidelines

Performing your first Historical Scans

When you add a large number of sources, consider temporarily increasing the number of pods for the duration of the initial historical scan. Afterward, you can reduce the replicas and resources of those pods.

When performing a historical scan, GitGuardian clones the git repository onto the pod's ephemeral storage and traverses all branches and commits in search of potential secrets. The full scan is done by a single Pod and can last from a few seconds to many hours depending on the repository size. The more pods you add, the more historical scans can run concurrently. When sizing your nodes, keep in mind that each Pod must have enough ephemeral storage and memory to run. To improve performance and reduce scan times, it is recommended to use SSD disks for the ephemeral storage.

Real-time scans are triggered by push events sent by the VCS to GitGuardian. These scans typically complete in under a second and should always take less than 3 seconds. To handle peaks of pushes, you may want to increase the number of worker-worker Pods, which process real-time scans.

The public API, used mainly by ggshield, is deployed under the webapp-public_api pod. This pod is essential for enabling interactions between ggshield and GitGuardian. To ensure the public API can handle the expected traffic, you may need to adjust the number of webapp-public_api Pods.

The webapp-internal_api pods handle internal requests for the Dashboard, while the webapp-internal_api_long pods manage longer-running operations, ensuring reliable performance and preventing timeouts during extended tasks.

Adjust the number of pods and node capacity according to the size and number of repositories, expected volume of push events, and expected volume of API requests to ensure efficient and effective scanning and interactions.

Small

Core System Components

For up to 2000 repositories, handling up to 500 pushes per hour, and up to 1000 API requests per hour:

| Component | Required Capacity | Count |
|---|---|---|
| Kubernetes compute nodes | 4 vCPU, 16 GB memory, 50 GB ephemeral disk space, 10 GB persistent disk space | 3 |
| PostgreSQL Master | 4 vCPU, 8 GB memory, 200 GB disk space | 1 |
| PostgreSQL Read Replica | 2 vCPU, 4 GB memory, 200 GB disk space | 1 |
| Redis | 2 vCPU, 2 GB memory, 20 GB disk space | 1 |
| Total | 20 vCPU, 62 GB memory, 150 GB ephemeral disk space, 450 GB persistent disk space | 6 |

If you plan to use global ephemeral storage, add 20 GB to the persistent disk space on each of your Kubernetes compute nodes.
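As a quick check of the Total row for the Small profile, the figures follow from multiplying the per-node capacity by the node count and adding the PostgreSQL and Redis instances. The last line is an assumption about how the global ephemeral storage note applies (20 GB added on each of the 3 nodes):

```latex
\begin{aligned}
\text{vCPU} &= 3 \times 4 + 4 + 2 + 2 = 20 \\
\text{memory (GB)} &= 3 \times 16 + 8 + 4 + 2 = 62 \\
\text{ephemeral disk (GB)} &= 3 \times 50 = 150 \\
\text{persistent disk (GB)} &= 3 \times 10 + 200 + 200 + 20 = 450 \\
\text{persistent disk with global ephemeral storage (GB)} &= 450 + 3 \times 20 = 510
\end{aligned}
```

The same calculation applies to the Medium and Large profiles with their own node counts and capacities.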

Historical Scans (up to 5GB in size)

| Component | Required Capacity | Count |
|---|---|---|
| worker-scanners Pods | Memory request and limit: 6 GB | 4 |
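For a Helm-based installation, the table above might translate into values like the following. This is only a sketch: the celeryWorkers.scanners keys follow the Helm example later on this page, and writing 6 GB as 6Gi is an assumption.

```yaml
celeryWorkers:
  scanners:
    replicas: 4
    resources:
      requests:
        memory: 6Gi   # assumption: 6 GB expressed as 6Gi
      limits:
        memory: 6Gi   # request and limit are kept equal, as in the table
```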

Real-Time Scans (up to 500 pushes/h)

| Component | Required Capacity | Count |
|---|---|---|
| worker-worker Pods | Default resource settings | 2 |

Public API (up to 1k requests/h)

| Component | Required Capacity | Count |
|---|---|---|
| webapp-public_api Pods | Default resource settings | 2 |
| nginx (dashboard and API) Pods | Default resource settings | 2 |

Dashboard (up to 200 active users)

| Component | Required Capacity | Count |
|---|---|---|
| webapp-internal_api Pods | Default resource settings | 2 |
| webapp-internal_api_long Pods | Default resource settings | 2 |

Machine learning (up to 10k events/h)

| Component | Required Capacity | Count |
|---|---|---|
| ml-secret-engine Pods | Default resource settings | 1 |
| worker-ml-api-priority Pods | Default resource settings | 1 |

Medium

Core System Components

For up to 10000 repositories, handling up to 1000 pushes per hour, and up to 25000 API requests per hour:

| Component | Required Capacity | Count |
|---|---|---|
| Kubernetes compute nodes | 8 vCPU, 32 GB memory, 50 GB ephemeral disk space, 10 GB persistent disk space | 5 |
| PostgreSQL Master | 8 vCPU, 32 GB memory, 250 GB disk space | 1 |
| PostgreSQL Read Replica | 4 vCPU, 16 GB memory, 250 GB disk space | 1 |
| Redis | 4 vCPU, 8 GB memory, 40 GB disk space | 1 |
| Total | 56 vCPU, 216 GB memory, 250 GB ephemeral disk space, 590 GB persistent disk space | 8 |

If you plan to use global ephemeral storage, add 120 GB to the persistent disk space on each of your Kubernetes compute nodes.

Historical Scans (up to 10GB in size)

| Component | Required Capacity | Count |
|---|---|---|
| worker-scanners Pods | Memory request and limit: 11 GB | 12 |

Real-Time Scans (up to 1k pushes/h)

| Component | Required Capacity | Count |
|---|---|---|
| worker-worker Pods | Default resource settings | 4 |

Public API (up to 25k requests/h)

| Component | Required Capacity | Count |
|---|---|---|
| webapp-public_api Pods | Default resource settings | 4 |
| nginx (dashboard and API) Pods | Default resource settings | 2 |

Dashboard (up to 500 active users)

| Component | Required Capacity | Count |
|---|---|---|
| webapp-internal_api Pods | Default resource settings | 4 |
| webapp-internal_api_long Pods | Default resource settings | 2 |

Machine learning (up to 10k events/h)

| Component | Required Capacity | Count |
|---|---|---|
| ml-secret-engine Pods | Default resource settings | 1 |
| worker-ml-api-priority Pods | Default resource settings | 2 |

Large

Core System Components

For up to 40000 repositories, handling up to 2000 pushes per hour, and up to 50000 API requests per hour:

| Component | Required Capacity | Count |
|---|---|---|
| Kubernetes compute nodes | 8 vCPU, 64 GB memory, 50 GB ephemeral disk space, 10 GB persistent disk space | 7 |
| PostgreSQL Master | 16 vCPU, 64 GB memory, 300 GB disk space | 1 |
| PostgreSQL Read Replica | 8 vCPU, 32 GB memory, 300 GB disk space | 1 |
| Redis | 8 vCPU, 16 GB memory, 100 GB disk space | 1 |
| Total | 88 vCPU, 560 GB memory, 350 GB ephemeral disk space, 770 GB persistent disk space | 10 |

If you plan to use global ephemeral storage, add 160 GB to the persistent disk space on each of your Kubernetes compute nodes.

Historical Scans (up to 15GB in size)

| Component | Required Capacity | Count |
|---|---|---|
| worker-scanners Pods | Memory request and limit: 16 GB | 16 |

Real-Time Scans (up to 2k pushes/h)

| Component | Required Capacity | Count |
|---|---|---|
| worker-worker Pods | Default resource settings | 8 |

Public API (up to 50k requests/h)

| Component | Required Capacity | Count |
|---|---|---|
| webapp-public_api Pods | Default resource settings | 6 |
| nginx (dashboard and API) Pods | Default resource settings | 2 |

Dashboard (up to 1k active users)

| Component | Required Capacity | Count |
|---|---|---|
| webapp-internal_api Pods | Default resource settings | 6 |
| webapp-internal_api_long Pods | Default resource settings | 2 |

Machine learning (up to 20k events/h)

| Component | Required Capacity | Count |
|---|---|---|
| ml-secret-engine Pods | Default resource settings | 2 |
| worker-ml-api-priority Pods | Default resource settings | 4 |
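As an illustration, part of the Large profile could be expressed as Helm values along these lines. This is a sketch, not a complete configuration: it only uses value keys that appear in the Helm example later on this page, it assumes 16 GB maps to 16Gi, and it leaves out the internal API and ML API worker pods because their value keys are not shown here.

```yaml
celeryWorkers:
  scanners:
    replicas: 16          # Historical Scans table
    resources:
      requests:
        memory: 16Gi      # assumption: 16 GB expressed as 16Gi
      limits:
        memory: 16Gi
  worker:
    replicas: 8           # Real-Time Scans table
webapps:
  public_api:
    replicas: 6           # Public API table
front:
  nginx:
    replicas: 2
secretEngine:
  replicas: 2             # Machine learning table
```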

Configure scaling settings

Autoscaling

info

As of version 2024.9.0, the GitGuardian application supports Kubernetes autoscaling 🚀.

Autoscaling dynamically scales worker pods, using Celery task queue length as an external metric for scaling decisions. This improves efficiency and performance while optimizing resource costs.

To enable autoscaling based on Celery queue lengths, first enable application metrics by following this guide. Then set up external metrics that Kubernetes can use to make scaling decisions, which involves configuring Prometheus adapter to expose the length of Celery queues as external metrics. To achieve this, you must:

  1. Install Prometheus adapter: It is required to expose external metrics in Kubernetes. You can install it using Helm. Ensure it is configured to connect to your Prometheus server.
  2. Configure Prometheus adapter to expose Celery queue lengths as external metrics. This is done by setting up a custom rule in the Prometheus Adapter configuration.
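For step 1, the connection to Prometheus is typically set in the Prometheus Adapter Helm values. The fragment below is a minimal sketch assuming the prometheus-community prometheus-adapter chart; the service hostname is hypothetical and must be replaced with your own Prometheus address.

```yaml
prometheus:
  # Hypothetical in-cluster address of your Prometheus server
  url: http://prometheus-server.monitoring.svc
  port: 9090
```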

The following rule should be added to your Prometheus Adapter Helm values to expose Celery queue lengths:

rules:
  external:
    - seriesQuery: '{__name__="gim_celery_queue_length",queue_name!=""}'
      metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (queue_name)
      resources:
        namespaced: true
        overrides:
          namespace:
            resource: namespace

If you use Machine Learning, you will also need this rule:

rules:
  external:
    - seriesQuery: '{__name__="bentoml_service_request_in_progress",exported_endpoint!=""}'
      resources:
        namespaced: false
      metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)

Autoscaling with KEDA (Only for Helm)

As an alternative to Prometheus adapter, you can use the KEDA controller to enable autoscaling. You can install it using Helm. As a prerequisite, you need to enable application metrics by following this guide.

You must configure your Helm values to allow KEDA to connect to your Prometheus server:

autoscaling:
  keda:
    prometheus:
      metadata:
        serverAddress: http://<prometheus-host>:9090
        # Optional. Custom headers to include in query
        customHeaders: X-Client-Id=cid,X-Tenant-Id=tid,X-Organization-Id=oid
        # Optional. Specify authentication mode (basic, bearer, tls)
        authModes: bearer
      # Optional. Specify TriggerAuthentication resource to use when authModes is specified.
      authenticationRef:
        name: keda-prom-creds

A ScaledObject and an HPA will be created in the GitGuardian namespace.
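When authModes is set to bearer, the keda-prom-creds resource referenced above has to exist. A minimal sketch of such a KEDA TriggerAuthentication is shown below; the namespace, Secret name, and Secret key are hypothetical and only illustrate the shape of the resource.

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-prom-creds
  namespace: gitguardian              # hypothetical: use your GitGuardian namespace
spec:
  secretTargetRef:
    - parameter: bearerToken          # parameter read by the KEDA Prometheus scaler
      name: prometheus-bearer-token   # hypothetical Secret holding the token
      key: token                      # hypothetical key inside that Secret
```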

Autoscaling Behavior

The following behavior will be applied:

  • Scaling Up: If the length of a Celery queue exceeds 10 tasks per current worker replica, the number of replicas will be increased, provided the current number of replicas is below the specified maximum limit.
  • Scaling Down: If the number of tasks per current worker replica remains below 10 for a continuous period of 5 minutes, the number of replicas will be decreased, provided the current number of replicas is above the specified minimum limit.
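To make the two rules above concrete, here is a rough sketch of a Kubernetes HorizontalPodAutoscaler that would behave this way. It is not the resource generated by the GitGuardian chart: the object name, target deployment name, and queue name are hypothetical, and only the metric name, the threshold of 10, and the 5-minute window come from this page.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-example                # hypothetical
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker-example              # hypothetical worker deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: gim_celery_queue_length
          selector:
            matchLabels:
              queue_name: example_queue   # hypothetical queue name
        target:
          type: AverageValue
          averageValue: "10"          # 10 tasks per current replica
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300 # wait 5 minutes before scaling down
```

With an AverageValue target, the autoscaler aims for roughly the queue length divided by 10, rounded up: a queue of 45 tasks would call for 5 replicas, within the configured minimum and maximum.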


info

Using KEDA, when the Celery queue is empty, the worker will transition to an idle state, resulting in the number of replicas being scaled down to zero.

KOTS-based installation

caution

Ensure that you update the Kubernetes Application RBAC by adding the patch permission to the servicemonitors resource.
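The RBAC rule for this permission could look like the sketch below. The Role name and namespace are hypothetical, and the API group assumes the ServiceMonitor resource provided by the Prometheus Operator; adapt it to the Role or ClusterRole your installation actually uses.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: gitguardian-servicemonitors   # hypothetical
  namespace: gitguardian              # hypothetical
rules:
  - apiGroups: ["monitoring.coreos.com"]   # assumption: Prometheus Operator CRDs
    resources: ["servicemonitors"]
    verbs: ["patch"]
```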

In the KOTS Admin Console, navigate to Config > Scaling to access the worker scaling options.

  • Front replicas: Scale nginx pods.
  • API replicas: Scale api pods.
  • Workers replicas: Scale workers pods (including the scanners pods).

info

Changing these values doesn't affect the rollout upgrade strategy.

Workers are configured to spread across nodes if there are multiple nodes. If you have configured your cluster for high availability, do not use fewer than 2 workers of each type.

For each worker, you can enable autoscaling by ticking the Enable Horizontal Pod Autoscaling option; you can then specify the minimum and maximum number of replicas.

By default, on a fresh install, tasks for productivity tools (such as Slack, Jira Cloud, Confluence, ...), whose queues have ods in the name, are assigned to generic scanners and workers:

  • ods_scan queue is assigned to generic scanners
  • realtime_ods and realtime_retry_ods queues are assigned to generic workers

You have the option to use dedicated workers for these features if you need to scale up those tasks.

Example with productivity tool worker in KOTS Admin: ODS worker

Helm-based installation

Customize Helm applications using your local-values.yaml file, submitted with the helm command.

Configure deployments with replicas: use webapps.[name].replicas for web pods, celeryWorkers.[name].replicas for async workers, and secretEngine.replicas for the Machine Learning Secret Engine. Additionally, set resource requests and limits as needed.

Example

migration:
  # Set resources for pre-deploy and post-deploy jobs
  resources:
    limits:
      cpu: 1000m
      memory: 500Mi
front:
  nginx:
    # Set resources for nginx init containers
    init:
      resources:
        limits:
          cpu: 1000m
          memory: 500Mi
    replicas: 2
    resources:
      limits:
        memory: 1Gi
webapps:
  public_api:
    replicas: 5
    resources:
      requests:
        cpu: 200m
        memory: 500Mi
      limits:
        memory: 4Gi
celeryWorkers:
  scanners:
    replicas: 8
    resources:
      requests:
        cpu: 200m
        memory: 4Gi
      limits:
        memory: 16Gi
secretEngine:
  replicas: 2

See the values reference documentation for further details.

Scaling Recommendation

For optimal performance, consider scaling the following pods to a minimum of 2 replicas each: webapp-hook, webapp-internal-api-long, webapp-public-api, worker-email, worker-long-tasks, and worker-worker.

You can enable worker autoscaling by setting the following Helm values:

celeryWorkers:
  worker:
    autoscaling:
      hpa:
        enabled: true
      keda:
        enabled: false
      minReplicas: 1
      maxReplicas: 10

You can enable Machine Learning Secret Engine autoscaling by setting the following Helm values:

secretEngine:
  autoscaling:
    hpa:
      enabled: true
    keda:
      enabled: false
    minReplicas: 1
    maxReplicas: 2

caution

The autoscaling.hpa.enabled and autoscaling.keda.enabled Helm parameters are mutually exclusive: you must choose between HPA (using Prometheus adapter) and the KEDA controller.

Additional ephemeral storage tuning

info

Only available for Helm-based installations.

In certain scenarios, optimizing ephemeral storage configurations becomes essential for achieving better performance and stability, particularly for scanners workers. This section outlines additional configurations for fine-tuning ephemeral storage, focusing on leveraging "On Demand" nodes with nvme disks and integrating Generic Ephemeral Inline Volumes.

"On Demand" nodes with nvme disks

In the following example, we specify that scanners workers only run on "On Demand" VMs with nvme disks and that the pods' ephemeral storage uses these disks:

celeryWorkers:
  scanners:
    replicas: 8
    localStoragePath: /nvme/disk # Used for pods ephemeral storage
    nodeSelector: # Must run on "On Demand" nodes with nvme disks
      eks.amazonaws.com/capacityType: ON_DEMAND
      local-nvme-ready: 'true'
    tolerations:
      - key: worker-highdisk
        operator: Equal
        value: 'true'
        effect: NoSchedule
    resources:
      requests:
        cpu: 200m
        memory: 16Gi
      limits:
        memory: 16Gi
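For the nodeSelector and toleration above to match, the target nodes need the corresponding labels and taint. The fragment below sketches what such a node would look like; the node name is hypothetical, and in practice these labels and taints are usually set through your node group or provisioning configuration rather than by editing Node objects directly.

```yaml
apiVersion: v1
kind: Node
metadata:
  name: example-nvme-node             # hypothetical
  labels:
    eks.amazonaws.com/capacityType: ON_DEMAND
    local-nvme-ready: "true"          # matched by the nodeSelector
spec:
  taints:
    - key: worker-highdisk            # tolerated by the scanners pods
      value: "true"
      effect: NoSchedule
```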

Generic Ephemeral Inline Volumes

In the following example, we use Kubernetes Generic Ephemeral Inline Volumes through the Helm chart. This feature provides dynamic provisioning and reclamation of storage, which is particularly useful when ephemeral storage limits are small. Note that it is supported starting with Kubernetes 1.23 (learn more).

celeryWorkers:
  scanners:
    replicas: 8
    resources:
      requests:
        cpu: 200m
        memory: 4Gi
      limits:
        memory: 16Gi
    # -- Worker ephemeral storage
    ephemeralStorage:
      enabled: true
      size: 2Gi
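For background, this is what a generic ephemeral volume looks like in a plain Kubernetes pod spec. It is an illustrative sketch, not the manifest rendered by the GitGuardian chart: the pod name, image, volume name, and mount path are hypothetical, and the default storage class is assumed.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scanner-example               # hypothetical
spec:
  containers:
    - name: scanner
      image: example/scanner:latest   # hypothetical image
      volumeMounts:
        - name: scratch
          mountPath: /scratch         # hypothetical mount path
  volumes:
    - name: scratch
      ephemeral:
        # A PersistentVolumeClaim is created with the pod and deleted with it
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 2Gi
```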

Node Affinity Scheduling

Use the nodeSelector parameter in Helm values to schedule worker pods on specific nodes, ensuring they run in designated zones or meet specific criteria. Learn more

celeryWorkers:
  long:
    nodeSelector:
      topology.kubernetes.io/zone: eu-central-1c
  scanners:
    nodeSelector:
      topology.kubernetes.io/zone: eu-central-1c
  worker:
    nodeSelector:
      topology.kubernetes.io/zone: eu-central-1c