
Pilot plan (2–6 weeks)

This guide shows how to run a 2–6 week pilot of the RecSys suite in a reliable, repeatable way.

Who this is for

Product owners, engineering leads, and delivery teams planning a pilot of the RecSys suite.

What you will get

  • A realistic timeline for a first pilot (from “hello world” to production readiness)
  • Clear deliverables and exit criteria per phase
  • The minimum instrumentation needed to measure impact

What “success” looks like

At the end of a pilot, you should be able to answer:

  • Can we serve recommendations reliably for our key surfaces (latency, availability, no “empty recs”)?
  • Can we explain and roll back changes (config/rules/artifacts) without drama?
  • Can we measure quality and impact from real logs (offline + online)?

If you cannot measure impact, you are not “piloting a recommender” yet—you are only integrating an endpoint.

Prerequisites (non-negotiable)

You need the ability to produce:

  • exposure logs (what was served, with ranks)
  • outcome logs (what the user did)
  • stable IDs for joins (a request_id, plus a pseudonymous user_id or session identifier; see the record sketch below)

See: Data contracts
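
To make the join concrete, here is a minimal sketch of the two record shapes in Python. Field names are illustrative assumptions, not the contracts themselves; the Data contracts page defines the real exposure.v1 and outcome.v1 schemas.

    # Illustrative shapes only; the authoritative schemas are in Data contracts.
    exposure = {
        "request_id": "req-123",   # stable join key
        "user_id": "u-456",        # pseudonymous
        "surface": "home_feed",
        "items": [                 # what was served, with ranks
            {"item_id": "sku-1", "rank": 1},
            {"item_id": "sku-2", "rank": 2},
        ],
    }

    outcome = {
        "request_id": "req-123",   # same key joins back to the exposure
        "item_id": "sku-1",
        "event": "click",
    }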

Fast path: pilot in 2–4 weeks

If you already have production-like logging and a team that can move quickly, you can compress the pilot:

  • Week 1: integrate one surface, validate joins, generate the first report.
  • Week 2: ship one controlled improvement (rules/constraints or one new signal) and practice rollback once.
  • Weeks 3–4 (optional): run an online experiment for the key surface with explicit guardrails.

Fast path prerequisites:

  • You can emit exposure.v1 and outcome.v1 (or can map to them within a week).
  • You can roll back configuration/rules quickly (no long redeploy cycles).
  • Someone owns instrumentation quality (join-rate and schema correctness).

Fast path exit criteria (minimum):

  • Logs validate and join integrity is sane (don’t ship on broken instrumentation; see the join-rate sketch after this list).
  • A report exists for baseline vs candidate and is reproducible.
  • Rollback has been tested once (config/rules and/or manifest pointer).
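
One way to sanity-check join integrity: compute the fraction of outcomes that join back to a logged exposure. A minimal sketch, assuming the illustrative record shapes above; the 95% threshold is an assumption to tune against your own traffic.

    def join_rate(exposures, outcomes):
        """Fraction of outcomes whose request_id matches a logged exposure."""
        exposed = {e["request_id"] for e in exposures}
        joined = sum(1 for o in outcomes if o["request_id"] in exposed)
        return joined / len(outcomes) if outcomes else 0.0

    exposures = [{"request_id": "req-123"}, {"request_id": "req-124"}]
    outcomes = [{"request_id": "req-123", "event": "click"}]
    rate = join_rate(exposures, outcomes)
    # Threshold is an assumption; pick one that matches your traffic.
    assert rate > 0.95, f"join rate {rate:.2%}: fix instrumentation first"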

Phase 1 (Week 1): baseline + instrumentation

Goal: ship a safe baseline and prove you can measure it.

Deliverables:

  • One surface integrated end-to-end (client → recsys-service → response rendered)
  • Exposure + outcome logging wired from production-like traffic (even if small), as sketched after this list
  • First recsys-eval report generated from real logs (joins validated)
  • Runbooks exercised once: “service not ready”, “empty recs”
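
A sketch of where exposure logging and the “empty recs” fallback sit in the serve path. Every name here (recommend, fallback_popular, log_exposure) is a hypothetical stand-in for your real client, fallback source, and log sink.

    def recommend(user_id, surface):
        return []  # stub: pretend recsys-service returned nothing

    def fallback_popular(surface):
        return ["sku-1", "sku-2"]  # stub: a cached popularity list

    def log_exposure(record):
        print(record)  # stub: your real sink (file, queue, etc.)

    def serve(request_id, user_id, surface):
        items = recommend(user_id, surface)
        if not items:
            items = fallback_popular(surface)  # never render an empty slate
        log_exposure({
            "request_id": request_id,
            "user_id": user_id,
            "surface": surface,
            "items": [{"item_id": it, "rank": r + 1}
                      for r, it in enumerate(items)],
        })
        return items

    serve("req-123", "u-456", "home_feed")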

Recommended scope:

  • Start in DB-only mode to minimize moving parts.
  • Use a deterministic baseline algorithm (popularity is fine; see the sketch below).
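
A popularity baseline can be a few lines. This sketch counts interactions per item and serves the top N; Counter.most_common keeps first-seen order on ties, so the output is reproducible for a fixed input.

    from collections import Counter

    def popularity_top_n(events, n=10):
        """Top-N items by interaction count over whatever window you feed in."""
        counts = Counter(e["item_id"] for e in events)
        return [item for item, _ in counts.most_common(n)]

    events = [{"item_id": "sku-1"}, {"item_id": "sku-2"}, {"item_id": "sku-1"}]
    print(popularity_top_n(events, n=2))  # ['sku-1', 'sku-2']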

Exit criteria:

  • recsys-eval validate succeeds for your logs
  • P95 latency and error rate are within acceptable bounds for your product (a measurement sketch follows)
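
If you do not have dashboards for these yet, a rough check can come straight from request logs. A sketch, assuming each record carries illustrative latency_ms and status fields:

    def p95(latencies_ms):
        """Nearest-rank approximation of the 95th percentile."""
        s = sorted(latencies_ms)
        return s[int(0.95 * (len(s) - 1))]

    def error_rate(records):
        return sum(1 for r in records if r["status"] >= 500) / len(records)

    records = [{"latency_ms": 40, "status": 200},
               {"latency_ms": 90, "status": 200},
               {"latency_ms": 300, "status": 503}]
    print(p95([r["latency_ms"] for r in records]), error_rate(records))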

Phase 2 (Weeks 2–3): improve relevance safely

Goal: add one higher-signal candidate source and gain iteration speed.

Typical upgrades:

  • similarity/co-visitation signals (often high ROI with modest complexity; sketched after this list)
  • basic business rules (pin/exclude, constraints) for control and trust
  • segmentation (by surface, locale, tenant, or other stable context keys)
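
To give the first item some shape: a co-visitation signal can start as pair counts over sessions. A deliberately naive sketch with no decay or session windowing:

    from collections import defaultdict
    from itertools import combinations

    def covisitation(sessions, top_k=10):
        """sessions: iterable of item-id lists, one list per session."""
        pair_counts = defaultdict(int)
        for items in sessions:
            for a, b in combinations(sorted(set(items)), 2):
                pair_counts[(a, b)] += 1
                pair_counts[(b, a)] += 1
        neighbors = defaultdict(list)
        for (a, b), count in pair_counts.items():
            neighbors[a].append((count, b))
        return {a: [b for _, b in sorted(lst, reverse=True)[:top_k]]
                for a, lst in neighbors.items()}

    sessions = [["sku-1", "sku-2"], ["sku-1", "sku-2", "sku-3"]]
    print(covisitation(sessions)["sku-1"])  # ['sku-2', 'sku-3']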

Deliverables:

  • A second algorithm/config version evaluated against baseline (offline)
  • A small “ship checklist” used before rollout (what changed, how to roll back)
  • A rollback drill completed once (config/rules and/or artifact manifest)

Exit criteria:

  • Offline evaluation shows a consistent improvement (or a clear tradeoff you accept)
  • You can roll back within minutes with a known procedure

Phase 3 (Weeks 4–6): production hardening + experimentation

Goal: move from “works” to “operationally safe”.

Typical additions:

  • artifact/manifest mode (pipelines publish versioned artifacts; the service reads a manifest pointer; see the sketch after this list)
  • A/B experiments for key surfaces with clear guardrails
  • SLOs and alerting: latency, error rate, empty-recs rate, artifact freshness
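
To illustrate the first item: in artifact/manifest mode the service never hunts for artifacts; it reads one small pointer file that pipelines republish on every build. A sketch with assumed paths and field names:

    import json

    # Illustrative layout; your manifest schema will differ.
    manifest = {
        "version": "2024-05-01T12:00:00Z",
        "artifacts": {
            "similarity": "s3://recs/artifacts/similarity/2024-05-01/",
            "popularity": "s3://recs/artifacts/popularity/2024-05-01/",
        },
    }

    # Pipeline side: publish (in production, write a temp file and rename
    # atomically so readers never see a half-written manifest).
    with open("manifest.json", "w") as f:
        json.dump(manifest, f)

    # Service side: read the pointer, load whatever it references.
    with open("manifest.json") as f:
        active = json.load(f)

    # Rollback is then repointing "artifacts" at the previous versioned
    # paths and republishing the manifest; no service redeploy needed.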

Deliverables:

  • A documented on-call playbook and escalation path
  • Production readiness checklist completed
  • One controlled experiment run (even if only to validate instrumentation)

Exit criteria:

  • On-call can triage common failures from runbooks
  • Changes are shipped with gates and are reversible