Skip to content

Glossary

A small shared vocabulary used throughout the suite docs.

Artifact

An immutable, version-addressed blob produced offline (for example: popularity, co-visitation, embeddings) and consumed by recsys-service.

Artifact/manifest mode

A deployment mode where recsys-pipelines publishes versioned artifacts and a mutable “current manifest pointer”, and recsys-service reads the current manifest and fetches referenced blobs.

See: Data modes: DB-only vs artifact/manifest

DB-only mode

A deployment mode where the service reads signals directly from Postgres tables (no offline artifact publish step).

DB-only mode is useful for early pilots because it minimizes moving parts, but it trades off offline reproducibility and versioned “ship/rollback” via manifests.

See: Data modes: DB-only vs artifact/manifest

Manifest

A small document that maps artifact types to artifact URIs for a (tenant, surface) pair. In artifact mode, the “current manifest pointer” is what you ship and roll back.

Freshness

An operational concept: “are recommendations based on recent-enough data?” In artifact/manifest mode, freshness is often defined as “the current manifest was updated within an expected window”.

Candidate

An item considered for ranking. Candidate generation prioritizes recall: “what items should we even consider?”

Ranking

Scoring and ordering candidates to produce the final top-K list. Ranking prioritizes precision: “which of these are best?”

Exposure

An event that records what items were shown (and in what order) for a single recommendation request.

Exposure log

The stream/file of exposure events written by recsys-service (typically JSONL). This is an input to recsys-eval and one of the main audit artifacts for “what was shown”.

See: Exposure logging and attribution

Outcome

An event that records what the user did after an exposure (click, conversion, etc.). Outcomes are attributed to exposures by request_id.

Outcome log

The stream of outcome events emitted by your product (clicks, conversions, etc.) and joined to exposures by request_id.

See: Exposure logging and attribution

Assignment

An event that records experiment bucket membership (experiment id + variant) used for A/B analysis.

Request ID

A correlation identifier that ties together “recommend” responses, exposure logs, and outcome events. In this suite it is the primary join key for evaluation.

Segment

A cohort/slice label (default: default) used for rule scoping and evaluation breakdowns (guest vs returning, locale, etc.).

Control plane

The admin APIs and versioned documents (tenant config + rules) that control serving behavior without redeploying code.

Ship

Promote a candidate change to “current” in a safe, reversible way (for example: update the manifest pointer, or update tenant config/rules versions).

Rollback

Revert “current” to a last-known-good version (config/rules rollback or manifest pointer rollback).

Namespace

An application-defined bucket of recommendation logic/data. In this suite, surface typically acts as the namespace for signals and rules.

Surface

Where recommendations are shown (home, PDP, cart, …). Surface names should be stable; they scope signals, rules, and evaluation slices.

Tenant

An organization boundary for configuration and data isolation.

recsys-svc

The Docker Compose container name used in this repo for the recsys-service API. In docs and architecture discussions, prefer the module name recsys-service.

Canonical events

Events stored in a normalized format that the rest of the pipeline relies on.

Window

A time range that a job processes. In v1, windows are daily UTC buckets.

Version

A deterministic identifier (SHA-256 hex) of an artifact payload excluding volatile build metadata.

Checkpoint

A small state file that records the latest successfully processed window so incremental runs can skip work already done.

Incremental run

A run mode that processes only new windows since the last checkpoint.

Backfill

Re-processing a historical range of windows.

Current manifest pointer

The mutable “what is live right now” location for a (tenant, surface) manifest.

Registry

Storage for artifact records and current manifests.

Object store

Storage for artifact blobs. In local mode, this is the filesystem.

Idempotent

Safe to run multiple times without changing the result.