Workflow: Offline gate in CI¶
This page explains how to run recsys-eval as an offline quality gate in CI and how that gate fits into the RecSys suite.
Who this is for¶
- Engineers wiring recsys-eval into CI/CD as a quality gate
- Teams that want an auditable “ship / hold / rollback” decision trail
Goal¶
Fail builds when recommendation quality regresses beyond agreed thresholds, using an offline regression report.
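To make "regresses beyond agreed thresholds" concrete, the comparison can be sketched as below. This is a minimal illustration, not the logic recsys-eval itself uses: the `Threshold` type, the `failed_gates` function, the metric names, and the choice of a relative drop on higher-is-better metrics are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str               # hypothetical metric name, e.g. "ndcg@10"
    max_relative_drop: float  # 0.02 tolerates a 2% drop versus the baseline


def failed_gates(baseline, candidate, thresholds):
    """Return the sorted names of metrics that dropped more than allowed.

    Assumes higher-is-better metrics. A metric missing from either report
    raises KeyError, so it surfaces as an input problem, not a regression.
    """
    failures = []
    for t in thresholds:
        base = baseline[t.metric]
        cand = candidate[t.metric]
        # Positive drop means the candidate is worse than the baseline.
        drop = (base - cand) / base if base else 0.0
        if drop > t.max_relative_drop:
            failures.append(t.metric)
    return sorted(failures)
```

With a baseline of `{"ndcg@10": 0.50}`, a candidate of `{"ndcg@10": 0.45}`, and a 2% tolerance, the relative drop is 10%, so the gate fails on that metric. Sorting the result keeps the output identical across runs, which helps when reports are diffed.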
The workflow (recommended baseline)¶
This is the simplest reliable pattern:
- Validate inputs (schemas + joins).
- Run recsys-eval in offline mode.
- Attach the report to the build (artifact).
- Fail CI when gates fail (deterministically).
Inputs you need¶
- A dataset config (what files to read and how to join them)
- An evaluation config (metrics, slices, thresholds)
- A baseline report (committed “golden” or a pinned prior run)
Example command (CI gate)¶
recsys-eval run \
  --mode offline \
  --dataset configs/examples/dataset.jsonl.yaml \
  --config configs/eval/offline.ci.yaml \
  --output /tmp/offline_report.json \
  --baseline testdata/golden/offline.json
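If the CI job is driven from Python rather than a shell step, the same invocation can be wrapped as in the sketch below. It uses only the subcommand and flags shown in the command above. The helper names `build_command` and `run_gate` are hypothetical, and passing the exit status through unchanged is an assumption; the CI gates page documents what each exit code actually means.

```python
import subprocess


def build_command(dataset, config, output, baseline):
    """Assemble the recsys-eval invocation shown above as an argument list."""
    return [
        "recsys-eval", "run",
        "--mode", "offline",
        "--dataset", dataset,
        "--config", config,
        "--output", output,
        "--baseline", baseline,
    ]


def run_gate(cmd, runner=subprocess.run):
    """Run the gate and return its exit status unchanged.

    check=False stops subprocess from raising on a non-zero status, so the
    caller decides how to react. The runner parameter exists so the wrapper
    can be exercised without the CLI installed.
    """
    result = runner(cmd, check=False)
    return result.returncode
```

A CI entry point would typically call `sys.exit(run_gate(build_command(...)))` with the four paths from the example command, so that the build status mirrors the tool's own result.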
Practical gating guidance¶
- Use tiny “golden” datasets for behavior regression tests (fast, stable).
- Use real logs for scheduled production gates (high signal).
- Treat “invalid input” differently from “metric regression”:
  - fix logging before trusting metrics
  - keep gates deterministic and auditable
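One way to keep "invalid input" separate from "metric regression" is to check join quality before looking at any metric, as in the sketch below. Everything here is illustrative: the `GateOutcome` names, the idea of joining outcome rows to exposure rows by request ID, and the `min_join_rate` cutoff are assumptions, not the checks recsys-eval performs.

```python
from enum import Enum


class GateOutcome(Enum):
    PASS = "pass"
    INVALID_INPUT = "invalid_input"
    METRIC_REGRESSION = "metric_regression"


def join_rate(exposure_ids, outcome_ids):
    """Fraction of outcome rows whose request ID matches an exposure row."""
    outcomes = list(outcome_ids)
    if not outcomes:
        return 0.0  # no outcomes at all is treated as unjoinable input
    exposures = set(exposure_ids)
    matched = sum(1 for rid in outcomes if rid in exposures)
    return matched / len(outcomes)


def classify(rate, min_join_rate, regressed_metrics):
    """Decide the gate outcome, checking input validity before metrics."""
    # Metrics computed from badly joined logs are not trustworthy,
    # so a low join rate wins over any reported regression.
    if rate < min_join_rate:
        return GateOutcome.INVALID_INPUT
    if regressed_metrics:
        return GateOutcome.METRIC_REGRESSION
    return GateOutcome.PASS
```

Reporting the two failure kinds under different names lets the build log show whether the next step is a logging fix or a ship/hold decision, and because the classification depends only on its inputs, re-running it on the same data gives the same answer.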
Read next¶
- CI gates (details + exit codes): CI gates: using recsys-eval in automation
- Metrics reference: Metrics: what we measure and why
- Interpreting results: Interpreting results: how to go from report to decision
- Suite workflow: How-to: run evaluation and make ship decisions