CI gates: using recsys-eval in automation

This page describes how to use recsys-eval as an automated gate in CI/CD and scheduled pipelines, and how that gate fits into the RecSys suite.

Who this is for

Engineers wiring recsys-eval into CI/CD or scheduled pipelines.

What you will get

A practical gating pattern
How to use exit codes
How to store artifacts and compare runs

The pattern: validate -> run -> store report -> gate

1) Validate data (optional but recommended)
2) Run evaluation
3) Upload report artifact
4) Fail the pipeline if gates fail

Example (tiny dataset gate used in CI):

recsys-eval run \
  --mode offline \
  --dataset configs/examples/dataset.jsonl.yaml \
  --config configs/eval/offline.ci.yaml \
  --output /tmp/offline_report.json \
  --baseline testdata/golden/offline.json
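
The run, store, and gate steps can be wired together in a small wrapper. The following is a minimal sketch in Python, assuming only that the command signals failure with a non-zero exit status and writes its report to a known path. `run_gate` and the `artifact_dir` layout are hypothetical names for illustration, not part of recsys-eval.

```python
import shutil
import subprocess
from pathlib import Path


def run_gate(cmd, report_path, artifact_dir):
    """Run an evaluation command, keep its report, and return its exit code.

    cmd          -- full argument list, e.g. the recsys-eval invocation above
    report_path  -- path the command writes its report to (its --output value)
    artifact_dir -- hypothetical directory that the CI system uploads afterwards
    """
    # stdout/stderr are inherited so the tool's logs reach the CI console.
    result = subprocess.run(cmd)

    # Copy the report even when the gate fails: a failed run is the one
    # people most need to inspect.
    report = Path(report_path)
    if report.exists():
        out = Path(artifact_dir)
        out.mkdir(parents=True, exist_ok=True)
        shutil.copy2(report, out / report.name)

    # The caller decides how to fail the build from this code.
    return result.returncode
```

A CI step would call `run_gate([...], "/tmp/offline_report.json", "artifacts")` and pass the returned code to `sys.exit`, so the pipeline fails exactly when the command does.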

Exit codes

recsys-eval is designed to be automation-friendly:

configuration or schema errors should fail fast
gate failures should fail deterministically

Recommended practice:

treat "invalid input" differently from "metric regression": the first means the pipeline or its data is broken, the second means the candidate got worse
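
One way to keep the two failure kinds apart is to map exit codes to named outcomes before deciding what to do. This is a sketch only: the numeric codes below are placeholders chosen for illustration, not values documented on this page, so replace them with whatever your recsys-eval build actually returns. `Outcome` and `classify` are hypothetical names.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    GATE_FAILURE = "gate_failure"    # metrics regressed: the candidate is the problem
    INVALID_INPUT = "invalid_input"  # config or schema error: the pipeline is the problem
    UNKNOWN = "unknown"              # anything unmapped, e.g. a crash or a signal


# ASSUMPTION: placeholder exit codes, not taken from recsys-eval documentation.
EXIT_CODE_OUTCOMES = {
    0: Outcome.PASS,
    1: Outcome.GATE_FAILURE,
    2: Outcome.INVALID_INPUT,
}


def classify(exit_code):
    """Translate a process exit code into an Outcome, defaulting to UNKNOWN."""
    return EXIT_CODE_OUTCOMES.get(exit_code, Outcome.UNKNOWN)
```

With this in place, a pipeline can route `INVALID_INPUT` to whoever owns the data or config and `GATE_FAILURE` to whoever owns the change under test, while still failing the build in both cases.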

If your build supports a decision artifact:

fail if decision != ship
attach decision.json and report.json to the build
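
The "fail if decision != ship" rule can be sketched as a small check on the artifact. The schema is an assumption: this example expects decision.json to be a JSON object with a top-level string field named `decision`. Adjust the field name to match what your build emits. `decision_allows_ship` is a hypothetical helper.

```python
import json


def decision_allows_ship(decision_path):
    """Return True only when the decision artifact says "ship".

    ASSUMPTION: the file is a JSON object with a top-level "decision" field.
    Anything else (missing file, bad JSON, other value) fails closed.
    """
    try:
        with open(decision_path, encoding="utf-8") as fh:
            payload = json.load(fh)
    except (OSError, json.JSONDecodeError):
        return False  # unreadable or malformed artifact: do not ship
    if not isinstance(payload, dict):
        return False
    return payload.get("decision") == "ship"
```

Failing closed is deliberate: a missing or corrupt decision file should block the build rather than let it pass silently.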

Artifact storage

Store:

report.json
effective config (or config hash)
dataset fingerprint / window
the exact binary version (build info)

Together these make a run auditable: given a report, you can tell which config, data, and binary produced it.
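
The items above can be recorded in a single manifest stored next to the report. The sketch below makes several assumptions: it hashes the config and dataset files on disk with SHA-256 as stand-ins for the "config hash" and "dataset fingerprint", and it takes the binary version as a string supplied by the caller. For large production logs a window description may be more practical than a file hash. `write_manifest` and the manifest field names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(out_path, report_path, config_path, dataset_path, binary_version):
    """Write a JSON manifest tying a report to its inputs; return it as a dict."""
    manifest = {
        "report_sha256": sha256_file(report_path),
        "config_sha256": sha256_file(config_path),
        "dataset_sha256": sha256_file(dataset_path),
        # ASSUMPTION: the caller obtains this from the build, however it exposes it.
        "binary_version": binary_version,
    }
    # Sorted keys keep the manifest byte-stable across runs with equal inputs.
    text = json.dumps(manifest, indent=2, sort_keys=True) + "\n"
    Path(out_path).write_text(text, encoding="utf-8")
    return manifest
```

Two runs with identical manifests, apart from `report_sha256`, used the same inputs and binary, which narrows down where a difference in results came from.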

Golden tests vs production gates

Golden tests:

use tiny datasets
protect behavior and output stability

Production gates:

use real logs
protect business impact and safety

Do not confuse the two. Use both.
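
To illustrate what "output stability" means for a golden test, the sketch below compares the metrics of a fresh run against a stored golden set and lists anything that drifted. It assumes the metrics have already been extracted into flat name-to-number mappings; the real report layout is not specified on this page, and `diff_metrics` and the metric names in the usage note are hypothetical.

```python
import math


def diff_metrics(golden, current, abs_tol=1e-9):
    """Return {metric: (golden_value, current_value)} for every metric that drifted.

    A metric missing on either side is reported with None in that position.
    ASSUMPTION: both arguments are flat dicts mapping metric names to numbers.
    """
    drift = {}
    for name in sorted(set(golden) | set(current)):
        g, c = golden.get(name), current.get(name)
        if g is None or c is None:
            drift[name] = (g, c)
        elif not math.isclose(g, c, rel_tol=0.0, abs_tol=abs_tol):
            drift[name] = (g, c)
    return drift
```

A golden test would assert that `diff_metrics(golden, current)` is empty, for example with `golden = {"ndcg@10": 0.5}`. A production gate would instead apply thresholds that reflect acceptable business impact, which is why the two should stay separate.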