How to reproduce the baseline benchmarks
This guide shows how to reproduce the baseline measurements in a repeatable way, so you can compare changes and environments.
1) Start a benchmark-friendly local stack
The dev stack applies a low global rate limit (burst of 30, 15 requests per second) as a safety measure. For benchmarking, you can bypass it with a skip header.
Create a benchmark env file:
test -f api/.env || cp api/.env.example api/.env
cp api/.env api/.env.benchmark
Then update api/.env.benchmark:
- Disable per-tenant rate limiting:
TENANT_RATE_LIMIT_ENABLED=false
- Enable global rate limit bypass for local benchmarking only:
RATE_LIMIT_SKIP_ENABLED=true
RATE_LIMIT_SKIP_HEADER=X-RateLimit-Skip
RATE_LIMIT_ALLOW_DANGEROUS_DEV_BYPASSES=true
TRUSTED_PROXIES=0.0.0.0/0,::/0
Warning
Do not enable the global rate limit bypass in production. It is intended only for local/test environments.
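The settings listed above can also be applied by script, so the benchmark env file is rebuilt the same way on every run. The following is a minimal sketch and not part of the repository: `set_env` and `apply_benchmark_env` are hypothetical helper names, and only the variable names and values are taken from this guide.

```shell
# Hypothetical helpers (not part of the repo) for editing an env file in place.

# Set KEY=VALUE: replace the existing KEY= line, or append one if it is missing.
set_env() {
  local file="$1" key="$2" value="$3"
  if grep -q "^${key}=" "$file"; then
    # Write through a temp file so the same command works with GNU and BSD sed.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Apply the benchmark-only settings from this guide to the given env file.
apply_benchmark_env() {
  local file="$1"
  set_env "$file" TENANT_RATE_LIMIT_ENABLED false
  set_env "$file" RATE_LIMIT_SKIP_ENABLED true
  set_env "$file" RATE_LIMIT_SKIP_HEADER X-RateLimit-Skip
  set_env "$file" RATE_LIMIT_ALLOW_DANGEROUS_DEV_BYPASSES true
  set_env "$file" TRUSTED_PROXIES "0.0.0.0/0,::/0"
}

# Only touch the benchmark env file, and only if it has been created.
if [ -f api/.env.benchmark ]; then
  apply_benchmark_env api/.env.benchmark
fi
```

The sketch assumes values contain no `|`, `&`, or backslash characters, which holds for the five settings here but not for arbitrary values.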
Start the stack:
RECSYS_API_ENV_FILE=./api/.env.benchmark make cycle
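If `make cycle` returns before the API accepts connections, the seeding step can fail on its first request. As a hedged sketch, the hypothetical helper `wait_for_http` below polls a URL until the server answers or a timeout expires. It assumes the API listens on http://localhost:8000, the base URL used by the load test in this guide, and it checks reachability only, not application health.

```shell
# Hypothetical helper (not part of the repo): wait until a URL answers over HTTP.
# Usage: wait_for_http URL [TIMEOUT_SECONDS]
wait_for_http() {
  local url="$1" timeout="${2:-60}" waited=0
  # Without -f, curl exits 0 on any HTTP response (including 4xx/5xx),
  # so this loop ends as soon as the server is reachable.
  until curl -s -o /dev/null "$url"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for $url" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

For example, `wait_for_http http://localhost:8000 60` can be run between starting the stack and seeding the demo data.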
2) Seed reproducible demo data/artifacts
RECSYS_API_ENV_FILE=./api/.env.benchmark ./scripts/demo.sh
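3) Run the load test
When the bypass is enabled, every request must include the header X-RateLimit-Skip: true. The command below passes it to the load test script through API_KEY_HEADER and API_KEY: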
API_KEY_HEADER=X-RateLimit-Skip API_KEY=true \
BASE_URL=http://localhost:8000 ENDPOINT=/v1/recommend \
TENANT_ID=demo SURFACE=home K=20 \
REQUESTS=5000 CONCURRENCY=50 \
./scripts/loadtest.sh
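As an optional extension of the single run above, the same command can be repeated at several concurrency levels to see how throughput and latency change with load. The sketch below is an assumption-laden wrapper, not part of the repository: `run_sweep`, the `LOADTEST` override, and the `bench-results` output directory are hypothetical names, while the environment variables are the ones used in this guide.

```shell
# Hypothetical wrapper (not part of the repo): run the load test at several
# concurrency levels and keep each run's raw output.
LOADTEST="${LOADTEST:-./scripts/loadtest.sh}"   # script path from this guide
OUT_DIR="${OUT_DIR:-bench-results}"             # assumed output directory

# Usage: run_sweep CONCURRENCY [CONCURRENCY...]
run_sweep() {
  mkdir -p "$OUT_DIR"
  local c
  for c in "$@"; do
    echo "== CONCURRENCY=$c =="
    # Same variables as the single run; only CONCURRENCY changes per iteration.
    API_KEY_HEADER=X-RateLimit-Skip API_KEY=true \
    BASE_URL=http://localhost:8000 ENDPOINT=/v1/recommend \
    TENANT_ID=demo SURFACE=home K=20 \
    REQUESTS=5000 CONCURRENCY="$c" \
    "$LOADTEST" | tee "$OUT_DIR/c${c}.log"
  done
}
```

Calling `run_sweep 10 25 50` would write one log file per level under `bench-results`. Keeping the request count fixed makes the runs comparable with the baseline command.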
Recording template
Use this table as a living record (post it as a PR comment or add it to an internal doc each time you run the benchmark):
| Date | Env | Dataset | Endpoint | K | Concurrency | Requests | RPS | p95 latency | Notes |
|---|---|---|---|---|---|---|---|---|---|
| YYYY-MM-DD | local docker | demo | /v1/recommend | 20 | | | | | |
| YYYY-MM-DD | staging/prod | real | /v1/recommend | 20 | | | | | |
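To keep rows consistent between runs, a row can be generated rather than typed by hand. The helper below is a hypothetical sketch (`benchmark_row` is not part of the repository). It takes the table's values in column order and prints one table row. The throughput and p95 values must be read from the load test output, which this guide does not specify.

```shell
# Hypothetical helper (not part of the repo): print one row for the recording table.
# Arguments, in column order: date env dataset endpoint k c n rps p95 [notes]
benchmark_row() {
  if [ "$#" -lt 9 ]; then
    echo "usage: benchmark_row DATE ENV DATASET ENDPOINT K C N RPS P95 [NOTES]" >&2
    return 1
  fi
  # The notes column is optional and left empty when omitted.
  printf '| %s | %s | %s | %s | %s | %s | %s | %s | %s | %s |\n' \
    "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9" "${10:-}"
}
```

A call such as `benchmark_row "$(date +%F)" "local docker" demo /v1/recommend 20 50 5000 "$RPS" "$P95" "baseline"` prints a row ready to paste, where `RPS` and `P95` are placeholders for the measured values.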