
Workflow: Online A/B analysis in production

This page describes the workflow for analyzing online A/B experiments in production and how it fits into the RecSys suite.

Who this is for

  • Product + analytics teams running experiments on key surfaces
  • Engineers who need a repeatable “measure → decide → ship/rollback” workflow

Goal

Measure impact from live traffic and decide ship / hold / rollback using experiment analysis.

Prerequisites (must be true)

  • You can log:
      • exposures (what was shown)
      • outcomes (what the user did)
      • assignments (experiment id + variant)
  • Your join keys are stable (typically request_id).
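One way to check that the join keys are stable before running an analysis is to measure how many outcome rows can be matched back to an exposure. The sketch below is a minimal illustration, not part of recsys-eval: the function name join_rate, the dict-shaped rows, and the sample data are assumptions made for the example; only the request_id key comes from this page.

```python
from typing import Iterable, Mapping


def join_rate(
    exposures: Iterable[Mapping[str, str]],
    outcomes: Iterable[Mapping[str, str]],
    key: str = "request_id",
) -> float:
    """Return the fraction of outcome rows whose join key matches an exposure.

    Hypothetical helper: rows are plain dicts, and `key` is the join key
    (request_id is the typical choice named in the prerequisites).
    """
    # Collect every non-empty join key seen in the exposure log.
    exposure_keys = {row[key] for row in exposures if row.get(key)}
    outcome_rows = list(outcomes)
    if not outcome_rows:
        return 0.0
    # An outcome with a missing key counts as unmatched.
    matched = sum(1 for row in outcome_rows if row.get(key) in exposure_keys)
    return matched / len(outcome_rows)


# Made-up sample rows for illustration only.
exposures = [
    {"request_id": "r1", "item": "a"},
    {"request_id": "r2", "item": "b"},
    {"request_id": "r3", "item": "c"},
]
outcomes = [
    {"request_id": "r1", "event": "click"},
    {"request_id": "r3", "event": "click"},
    {"request_id": "r9", "event": "click"},  # no matching exposure
    {"event": "click"},                      # join key missing entirely
]

print(join_rate(exposures, outcomes))  # 2 of 4 outcomes match -> 0.5
```

A join rate well below 1.0 usually means the key is being dropped or regenerated somewhere between serving and event logging, and is worth fixing before any KPI is read.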

Start here if anything is unclear:

Workflow steps

  1. Pick a primary KPI and 2–4 guardrails (latency, empty-recs rate, error rate, etc.).
  2. Run recsys-eval in experiment mode for a well-defined window.
  3. Interpret results:
      • join-rate sanity
      • SRM (sample ratio mismatch) warnings
      • guardrails holding
  4. Decide ship/hold/rollback and save the report as an audit artifact.
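To make the interpretation and decision steps concrete, here is a minimal sketch of an SRM check followed by a simple decision rule. It does not describe how recsys-eval computes its warnings. The SRM test is a standard chi-square goodness-of-fit test for two variants; the names srm_p_value, Checks, and decide, the 0.95 join-rate floor, and the 0.001 SRM threshold are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


def srm_p_value(control_n: int, treatment_n: int,
                expected_control_share: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value for a two-variant split (1 dof)."""
    total = control_n + treatment_n
    if total <= 0 or not 0.0 < expected_control_share < 1.0:
        raise ValueError("need positive counts and a share strictly between 0 and 1")
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = ((control_n - expected_control) ** 2 / expected_control
            + (treatment_n - expected_treatment) ** 2 / expected_treatment)
    # With one degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


@dataclass
class Checks:
    """Hypothetical summary of one experiment report."""
    join_rate: float      # fraction of outcomes joined to exposures
    srm_p: float          # p-value from srm_p_value
    guardrails_ok: bool   # True if every guardrail stayed within bounds
    kpi_improved: bool    # True if the primary KPI moved significantly up


def decide(c: Checks, min_join_rate: float = 0.95,
           srm_alpha: float = 0.001) -> str:
    """Illustrative policy only; the thresholds are assumptions, not defaults."""
    # Untrustworthy data: do not read KPIs until logging or assignment is fixed.
    if c.join_rate < min_join_rate or c.srm_p < srm_alpha:
        return "hold"
    # Data is sound but a guardrail was breached.
    if not c.guardrails_ok:
        return "rollback"
    return "ship" if c.kpi_improved else "hold"


p = srm_p_value(5000, 5500)  # intended 50/50 split, observed imbalance
print(decide(Checks(join_rate=0.99, srm_p=p,
                    guardrails_ok=True, kpi_improved=True)))  # prints "hold"
```

The ordering is a design choice: data-quality checks run first, because a KPI or guardrail reading from a broken join or a skewed assignment cannot be trusted in either direction.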

Example command (experiment analysis)

recsys-eval run \
  --mode experiment \
  --dataset configs/examples/dataset.jsonl.yaml \
  --config configs/eval/experiment.default.yaml \
  --output /tmp/experiment_report.md \
  --output-format markdown
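
If you want the "save the report as an audit artifact" step to be repeatable, the command above can be wrapped in a small script. The following is a sketch under stated assumptions: it uses only the flags shown in the example command, and the helper names (build_command, archive_report, run_and_archive) and the archive naming scheme are hypothetical, not something recsys-eval provides.

```python
import hashlib
import shutil
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def build_command(dataset: str, config: str, output: str) -> list:
    """Build the argv for the example command; flags are copied from this page."""
    return [
        "recsys-eval", "run",
        "--mode", "experiment",
        "--dataset", dataset,
        "--config", config,
        "--output", output,
        "--output-format", "markdown",
    ]


def archive_report(report: str, archive_dir: str) -> Path:
    """Copy the report into archive_dir under a timestamp + content-hash name."""
    src = Path(report)
    dest_dir = Path(archive_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # The hash prefix makes later edits to an archived report detectable.
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = dest_dir / f"{stamp}_{digest[:12]}{src.suffix}"
    shutil.copy2(src, target)
    return target


def run_and_archive(dataset: str, config: str, output: str,
                    archive_dir: str) -> Path:
    """Run the analysis, then archive the report; raises if the command fails."""
    subprocess.run(build_command(dataset, config, output), check=True)
    return archive_report(output, archive_dir)
```

For example, run_and_archive("configs/examples/dataset.jsonl.yaml", "configs/eval/experiment.default.yaml", "/tmp/experiment_report.md", "reports/archive") would run the command shown above and keep a timestamped copy of the report; the reports/archive directory is a made-up location.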