Interleaving: fast ranker comparison on the same traffic

This page explains interleaving, a way to compare two rankers quickly on the same traffic, and how it fits into the RecSys suite.

Who this is for

Engineers comparing two rankers or weight sets.

What you will get

  • What interleaving measures
  • When it is the right tool
  • Common mistakes

What it is

Interleaving mixes two ranked lists (A and B) into one displayed list. Then it attributes user actions (often clicks) back to A or B.

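This can be more sensitive than a full A/B test when you only care about ranking quality: every request shows results from both rankers, so the comparison happens within the same user and session rather than across separate traffic groups.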
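The mixing and attribution steps can be sketched with team-draft interleaving, one common interleaving scheme. This is a minimal illustration, not necessarily the method the RecSys suite implements; the function names `team_draft_interleave` and `attribute_clicks` are hypothetical.

```python
import random


def team_draft_interleave(list_a, list_b, k, rng=None):
    """Mix two ranked lists into one list of up to k items.

    Returns (displayed, team), where team maps each displayed item
    to the ranker ("A" or "B") that contributed it.
    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    sources = {"A": list_a, "B": list_b}
    picks = {"A": 0, "B": 0}
    displayed, team = [], {}

    while len(displayed) < k:
        # The ranker with fewer picks goes next; a coin flip breaks ties.
        if picks["A"] != picks["B"]:
            side = "A" if picks["A"] < picks["B"] else "B"
        else:
            side = rng.choice(["A", "B"])

        # Take that ranker's best item that is not already displayed.
        item = next((x for x in sources[side] if x not in team), None)
        if item is None:
            # This ranker has nothing new left; fall back to the other one.
            side = "B" if side == "A" else "A"
            item = next((x for x in sources[side] if x not in team), None)
            if item is None:
                break  # both lists are exhausted

        displayed.append(item)
        team[item] = side
        picks[side] += 1

    return displayed, team


def attribute_clicks(team, clicked_items):
    """Credit each click to the ranker that contributed the clicked item.

    Returns "A", "B", or "tie" for a single request.
    """
    credit = {"A": 0, "B": 0}
    for item in clicked_items:
        side = team.get(item)
        if side is not None:  # ignore clicks on items outside the mixed list
            credit[side] += 1
    if credit["A"] > credit["B"]:
        return "A"
    if credit["B"] > credit["A"]:
        return "B"
    return "tie"
```

Passing a seeded `random.Random` as `rng` makes the mixing reproducible, which helps when replaying logged requests.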

What it is not

Interleaving is not a full product KPI decision engine. It does not account for all downstream effects. Use it to choose between rankers, then validate the winner with an A/B test.

Inputs

  • ranker_a results (per request_id)
  • ranker_b results (per request_id)
  • outcomes (clicks)

Dataset wiring example: configs/examples/dataset.interleaving.jsonl.yaml
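As a rough picture of how the three inputs line up, the sketch below joins them on request_id. The row fields (`request_id`, `items`, `item_id`) and the function `join_by_request` are assumptions made for illustration; the actual schema is whatever the dataset config above defines.

```python
from collections import defaultdict


def join_by_request(ranker_a_rows, ranker_b_rows, click_rows):
    """Group both rankings and the clicks under each request_id.

    Field names are hypothetical, not the suite's real schema.
    """
    joined = defaultdict(lambda: {"a": [], "b": [], "clicks": []})
    for row in ranker_a_rows:
        joined[row["request_id"]]["a"] = row["items"]
    for row in ranker_b_rows:
        joined[row["request_id"]]["b"] = row["items"]
    for row in click_rows:
        joined[row["request_id"]]["clicks"].append(row["item_id"])
    # Only requests ranked by both A and B can be compared.
    return {rid: rec for rid, rec in joined.items() if rec["a"] and rec["b"]}
```

Requests that only one ranker scored are dropped here, since there is nothing to compare for them.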

Output

  • A wins / B wins counts
  • win rate and tie rate
  • a significance estimate
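
One way to produce these numbers from per-request outcomes is sketched below. It assumes each request has already been labelled "A", "B", or "tie", that the win rate is computed over non-tied requests, and that significance comes from a two-sided exact sign test. All three are assumptions for illustration; the suite may define win rate or significance differently, and `summarize_outcomes` is a hypothetical name.

```python
from math import comb


def summarize_outcomes(outcomes):
    """Summarize a list of per-request outcomes ("A", "B", or "tie")."""
    a_wins = outcomes.count("A")
    b_wins = outcomes.count("B")
    ties = outcomes.count("tie")
    total = a_wins + b_wins + ties
    decided = a_wins + b_wins

    if decided == 0:
        p_value = 1.0  # no decided requests, so no evidence either way
    else:
        # Sign test: under the null hypothesis, each decided request is
        # equally likely to go to A or B. Sum the upper binomial tail from
        # the larger win count, then double it for a two-sided test.
        k = max(a_wins, b_wins)
        tail = sum(comb(decided, i) for i in range(k, decided + 1)) / 2 ** decided
        p_value = min(1.0, 2 * tail)

    return {
        "a_wins": a_wins,
        "b_wins": b_wins,
        "ties": ties,
        "win_rate_a": a_wins / decided if decided else 0.0,
        "tie_rate": ties / total if total else 0.0,
        "p_value": p_value,
    }
```

The sign test ignores ties and treats requests as independent, so repeated requests from the same user can make the p-value look stronger than it is.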

Common mistakes

  • comparing rankers trained on different candidate sets without noting it
  • treating interleaving wins as business KPI wins