Skip to content

How to reproduce the baseline benchmarks

This guide shows how to reproduce the baseline measurements in a repeatable way, so you can compare changes and environments.

1) Start a benchmark-friendly local stack

The dev stack includes a small global rate limit (30 burst / 15 rps) for safety. For benchmarking, you can bypass it via a skip header.

Create a benchmark env file:

test -f api/.env || cp api/.env.example api/.env
cp api/.env api/.env.benchmark

Then update api/.env.benchmark:

  • Disable per-tenant rate limiting:
  • TENANT_RATE_LIMIT_ENABLED=false
  • Enable global rate limit bypass for local benchmarking only:
  • RATE_LIMIT_SKIP_ENABLED=true
  • RATE_LIMIT_SKIP_HEADER=X-RateLimit-Skip
  • RATE_LIMIT_ALLOW_DANGEROUS_DEV_BYPASSES=true
  • TRUSTED_PROXIES=0.0.0.0/0,::/0

Warning

Do not enable the global rate limit bypass in production. It is intended only for local/test environments.

Start the stack:

RECSYS_API_ENV_FILE=./api/.env.benchmark make cycle

2) Seed reproducible demo data/artifacts

RECSYS_API_ENV_FILE=./api/.env.benchmark ./scripts/demo.sh

3) Run the load test

X-RateLimit-Skip: true must be sent on requests when the bypass is enabled.

API_KEY_HEADER=X-RateLimit-Skip API_KEY=true \
BASE_URL=http://localhost:8000 ENDPOINT=/v1/recommend \
TENANT_ID=demo SURFACE=home K=20 \
REQUESTS=5000 CONCURRENCY=50 \
./scripts/loadtest.sh

Recording template

Use this table as a living record (commit it as a PR comment or internal doc when you run the benchmark):

Date Env Dataset Endpoint k c n rps p95 Notes
YYYY-MM-DD local docker demo /v1/recommend 20
YYYY-MM-DD staging/prod real /v1/recommend 20