CLI: recsys-eval
This page is the canonical reference for the recsys-eval CLI.
Who this is for
- Engineers running recsys-eval locally or in CI
- Teams implementing evaluation gates (ship/hold/fail) based on reports
What you will get
- The canonical recsys-eval commands, flags, and exit codes
- Copy/paste examples for local runs and CI gating
Build/install
From repo root:
(cd recsys-eval && make build)
Binary:
recsys-eval/bin/recsys-eval
Commands
recsys-eval run
Runs one evaluation mode and writes a report.
Required flags:
- --dataset <path.yaml>: dataset config (YAML)
- --config <path.yaml>: eval config (YAML)
- --output <path>: output report path
Common optional flags:
- --mode <offline|experiment|ope|interleaving|aa-check>: overrides mode: in config
- --output-format <json|markdown|html>: default is json
- --baseline <path.json>: offline mode baseline report (JSON) for comparisons
- --experiment-id <id>: experiment mode override
recsys-eval validate
Validates a JSONL file against a schema.
- --schema <name-or-path>: schema name (like exposure.v1) or a path to a .json schema file
- --input <path.jsonl>: JSONL file to validate
Notes:
- If --schema does not end with .json, recsys-eval resolves it as schemas/<schema>.json. That means --schema exposure.v1 works when run from the recsys-eval/ directory.
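The resolution rule in the note above can be sketched as a small shell function. This is only an illustration of the documented behavior, not the CLI's actual implementation; `resolve_schema` is a hypothetical helper name, and the sketch assumes the lookup is relative to the current working directory, as the note implies.

```shell
# Illustrative sketch of the documented --schema resolution rule.
# resolve_schema is a hypothetical helper, not part of recsys-eval.
resolve_schema() {
  case "$1" in
    *.json) printf '%s\n' "$1" ;;               # already a path to a schema file
    *)      printf 'schemas/%s.json\n' "$1" ;;  # bare name: looked up under schemas/
  esac
}

resolve_schema exposure.v1         # prints schemas/exposure.v1.json
resolve_schema /opt/s/custom.json  # prints /opt/s/custom.json
```

Because a bare name resolves to a relative path, it is only found when the command runs from a directory that contains schemas/. From any other working directory, passing a path that ends in .json avoids the relative lookup.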
recsys-eval version
Prints the CLI version.
Exit codes
- 0: success (and ship decision in experiment mode)
- 1: command failed (invalid input/config, validation errors, runtime errors)
- 2: experiment decision is hold
- 3: experiment decision is fail
Examples
Local: validate + offline report
(cd recsys-eval && ./bin/recsys-eval validate --schema exposure.v1 --input /tmp/exposures.jsonl)
(cd recsys-eval && ./bin/recsys-eval validate --schema outcome.v1 --input /tmp/outcomes.jsonl)
recsys-eval/bin/recsys-eval run \
  --mode offline \
  --dataset /tmp/dataset.yaml \
  --config /tmp/eval.yaml \
  --output /tmp/recsys_eval_report.md \
  --output-format markdown
CI: experiment gate (ship/hold/fail)
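# Disable errexit so a non-zero decision code does not abort the script early.
set +e
recsys-eval/bin/recsys-eval run \
  --mode experiment \
  --dataset /tmp/dataset.yaml \
  --config /tmp/experiment.yaml \
  --output /tmp/recsys_eval_report.json \
  --output-format json
# Capture the exit code immediately, before any other command overwrites $?.
code="$?"
set -e
# Map the exit code to a gate decision: hold passes the job, fail and errors fail it.
case "$code" in
  0) echo "decision=ship" ;;
  2) echo "decision=hold" ; exit 0 ;;
  3) echo "decision=fail" ; exit 1 ;;
  *) echo "decision=error" ; exit 1 ;;
esac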
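The gate above toggles set +e and set -e around the command so that exit codes 2 and 3 can be inspected. An alternative, sketched below, is to capture the status with an `||` list, which does not trigger errexit. `capture_exit` is a hypothetical helper name, and `sh -c 'exit 2'` is only a stand-in for the recsys-eval invocation so the snippet runs without the binary.

```shell
set -e

# capture_exit is a hypothetical helper: it runs the given command and stores
# its exit status in the global variable `code` without tripping `set -e`,
# because a command on the left side of `||` is exempt from errexit.
capture_exit() {
  code=0
  "$@" || code=$?
}

# Stand-in for the recsys-eval run command; exits with 2 (the "hold" code).
capture_exit sh -c 'exit 2'
echo "exit code: $code"
```

In a real pipeline the stand-in would be replaced by the full recsys-eval run command, and the same case statement would then map `$code` to a decision.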
Read next
- Default evaluation pack: Default evaluation pack (recommended)
- Run eval and ship decisions: How-to: run evaluation and make ship decisions
- Data contracts (schemas + examples): Data contracts