recsys-eval docs
recsys-eval turns recommendation logs into evaluation reports so you can make clear decisions: ship / hold / rollback.
If you read only one page, start with Concepts: how to understand recsys-eval
Choose your path
I’m new and need the big picture
- Overview: recsys-eval
- Concepts: Concepts: how to understand recsys-eval
- Workflows:
  - Offline gate in CI: Workflow: Offline gate in CI
  - Online A/B in production: Workflow: Online A/B analysis in production
I’m integrating data/logs
- Data contracts: Data contracts: what inputs look like
- Integration guide: Integration: how to produce the inputs
I’m interpreting results
- Metrics reference: Metrics: what we measure and why
- Interpretation guide: Interpreting results: how to go from report to decision
- Interpretation cheat sheet: Interpretation cheat sheet (recsys-eval)
I’m running this in CI or on-call
- CI gates: CI gates: using recsys-eval in automation
- Scaling: Scaling: large datasets and performance
- Runbooks: Runbooks: operating recsys-eval
- Troubleshooting: Troubleshooting: symptom -> cause -> fix
I’m using a more advanced evaluation method
- OPE (off-policy evaluation): Off-policy evaluation (OPE): powerful and easy to misuse
- Interleaving: Interleaving: fast ranker comparison on the same traffic
- Architecture: Architecture: how the code is organized and how to extend it
Read next
- Suite workflow: How-to: run evaluation and make ship decisions
- Evaluation modes: Evaluation modes