Evaluation

Turn recommendation changes into defensible decisions.

A recommendation rollout should not depend on hope. RecSys keeps request IDs, exposure logs, outcome joins, reports, and rollback levers in the same operating model.

Decision model

Interpret metrics only after data integrity is credible.

Validate schemas

Exposure, outcome, assignment, report, and decision schemas are checked before reports are trusted.
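As an illustration of what a schema check can look like, here is a minimal sketch in plain Python. The field names in `EXPOSURE_SCHEMA` and the `validate_record` helper are hypothetical, chosen for this example; they are not RecSys's actual schema or API.

```python
from typing import Any

# Hypothetical required fields and types for one exposure record.
EXPOSURE_SCHEMA: dict[str, type] = {
    "request_id": str,
    "user_id": str,
    "item_ids": list,
    "timestamp": float,
}


def validate_record(record: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            problems.append(
                f"wrong type for {field}: expected {expected.__name__}, got {actual}"
            )
    return problems
```

A report would only be generated once every record in the batch returns an empty problem list, or once the failure rate falls under a limit agreed in advance.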

Measure join integrity

Stable request IDs connect recommendation responses, exposure logs, and outcomes.
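One way to quantify join integrity is to compare the sets of request IDs seen at each stage. The sketch below assumes each log can be reduced to a list of request IDs; the function name and the two metric names are illustrative, not part of the product.

```python
from collections.abc import Iterable


def join_integrity(
    response_ids: Iterable[str],
    exposure_ids: Iterable[str],
    outcome_ids: Iterable[str],
) -> dict[str, float]:
    """Compute two simple join-health ratios from request IDs."""
    responses = set(response_ids)
    exposures = set(exposure_ids)
    outcomes = set(outcome_ids)
    return {
        # Share of logged exposures that trace back to a served response.
        "exposure_match_rate": (
            len(exposures & responses) / len(exposures) if exposures else 0.0
        ),
        # Share of outcomes with no matching exposure (cannot be attributed).
        "orphan_outcome_rate": (
            len(outcomes - exposures) / len(outcomes) if outcomes else 0.0
        ),
    }
```

A low match rate or a high orphan rate suggests that request IDs are being dropped or regenerated somewhere between serving and logging, which should be fixed before any KPI is read.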

Keep guardrails visible

Error rate, latency, empty recommendations, and warning rates define the operational envelope.
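A guardrail check can be as simple as comparing observed values against agreed limits. In this sketch the metric names and thresholds in `GUARDRAIL_LIMITS` are placeholder assumptions; real limits would come from the team's own service-level targets.

```python
# Hypothetical upper limits; every value here is an assumption for illustration.
GUARDRAIL_LIMITS: dict[str, float] = {
    "error_rate": 0.01,
    "p95_latency_ms": 200.0,
    "empty_recommendation_rate": 0.02,
    "warning_rate": 0.05,
}


def guardrail_breaches(
    observed: dict[str, float], limits: dict[str, float]
) -> list[str]:
    """List every guardrail that is over its limit or not reported at all."""
    breaches = []
    for name, limit in limits.items():
        value = observed.get(name)
        if value is None:
            # A missing metric is treated as a breach, not as a pass.
            breaches.append(f"{name}: not reported")
        elif value > limit:
            breaches.append(f"{name}: {value} > {limit}")
    return breaches
```

Treating an unreported metric as a breach keeps a silent logging failure from looking like a healthy system.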

Proof kit

A compact, checked path that gives evaluators confidence.

The repository includes a commercial proof-kit smoke path that validates ecommerce fixtures, writes offline reports, runs pipelines, and checks published artifacts.

Ship / hold / roll back

Make rollout choices explicit.

Ship

The primary KPI clears the pre-agreed effect size, slices are stable, guardrails hold, and rollback is ready.

Hold

Data integrity is weak, results are inconclusive, or the rollback path is not ready.

Roll back

The primary KPI regresses materially, a guardrail is breached, or the operational envelope no longer holds.
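The three outcomes above can be written down as an explicit rule, which makes the decision reviewable. The sketch below is one possible encoding, not a prescribed policy: the `Evidence` fields, the thresholds, and the order in which the checks run are all assumptions for illustration. It checks guardrails first because they are operational, and it declines to read the KPI at all while data integrity is in doubt.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    data_integrity_ok: bool   # schemas valid and joins credible
    kpi_lift: float           # observed relative change in the primary KPI
    min_effect: float         # pre-agreed lift required to ship
    regression_limit: float   # negative lift at or below which we roll back
    slices_stable: bool
    guardrails_ok: bool
    rollback_ready: bool


def decide(e: Evidence) -> str:
    """Return 'ship', 'hold', or 'rollback' under the assumed precedence."""
    if not e.guardrails_ok:
        return "rollback"     # operational envelope no longer holds
    if not e.data_integrity_ok:
        return "hold"         # metrics are not yet interpretable
    if e.kpi_lift <= e.regression_limit:
        return "rollback"     # material KPI regression
    if not e.rollback_ready:
        return "hold"         # no safe way back, so do not ship
    if e.kpi_lift >= e.min_effect and e.slices_stable:
        return "ship"
    return "hold"             # inconclusive result or unstable slices
```

For example, `decide(Evidence(True, 0.03, 0.02, -0.01, True, True, True))` returns `"ship"`, while the same evidence with `guardrails_ok=False` returns `"rollback"`.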

Next step

Need to prove the first rollout?

Use the evaluation path to define evidence, then contact us when the pilot needs commercial or procurement support.