recsys-eval docs

recsys-eval turns recommendation logs into evaluation reports so you can make clear decisions: ship / hold / rollback.

If you read only one page, start here: Concepts: how to understand recsys-eval

Choose your path

I’m new and need the big picture

Overview: recsys-eval
Concepts: Concepts: how to understand recsys-eval
Workflows:
  Offline gate in CI: Workflow: Offline gate in CI
  Online A/B in production: Workflow: Online A/B analysis in production

I’m integrating data/logs

Data contracts: Data contracts: what inputs look like
Integration guide: Integration: how to produce the inputs

I’m interpreting results

Metrics reference: Metrics: what we measure and why
Interpretation guide: Interpreting results: how to go from report to decision
Interpretation cheat sheet: Interpretation cheat sheet (recsys-eval)

I’m running this in CI or on-call

CI gates: CI gates: using recsys-eval in automation
Scaling: Scaling: large datasets and performance
Runbooks: Runbooks: operating recsys-eval
Troubleshooting: Troubleshooting: symptom -> cause -> fix

I’m using a deeper evaluation method

OPE (off-policy evaluation): Off-policy evaluation (OPE): powerful and easy to misuse
Interleaving: Interleaving: fast ranker comparison on the same traffic
Architecture: Architecture: how the code is organized and how to extend it

Read next

Suite workflow: How-to: run evaluation and make ship decisions
Evaluation modes: Evaluation modes