How-to: Operate pipelines daily
This guide shows how to operate pipelines daily in a reliable, repeatable way.
Who this is for
- SRE / on-call running recsys-pipelines on a schedule
- Data engineers responsible for freshness and correctness
Goal
Run pipelines predictably, detect staleness early, and respond to failures using the right runbook.
Quick paths
- Schedule runs: How-to: Schedule pipelines with CronJob
- Incremental runs: How-to: Run incremental pipelines
- Debug failures: How-to: Debug a failed pipeline run
- SLOs and freshness: SLOs and freshness
- Runbooks:
  - Pipeline failed: Runbook: Pipeline failed
  - Validation failed: Runbook: Validation failed
  - Stale artifacts: Runbook: Stale artifacts
  - Limit exceeded: Runbook: Limit exceeded
Daily checklist (practical)
- Confirm the expected schedule and windowing
  - If you run nightly/daily: verify --start/--end semantics and UTC windows.
  - If you run incremental: ensure checkpoint_dir is stable across runs.
- Run and publish (or confirm the scheduler did)
  - Primary: recsys-pipelines run ... --incremental
    See: How-to: Run incremental pipelines
- Validate outputs and “current” pointers
  - Check expected files/paths: Output layout (local filesystem)
  - Confirm the manifest pointer is updated only when validation passes.
- Watch freshness/SLOs
  - Use the invariants and expected freshness windows: SLOs and freshness
- When something fails, open the closest runbook first
  - Pipeline failed: Runbook: Pipeline failed
  - Validation failed: Runbook: Validation failed
  - Stale artifacts: Runbook: Stale artifacts
  - Limit exceeded: Runbook: Limit exceeded
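For the windowing step in the checklist, here is a minimal Python sketch that computes the last complete UTC day as a window. It assumes a half-open [start, end) window aligned to UTC midnight. The actual --start/--end semantics (inclusive or exclusive bounds, date or timestamp format) are not stated in this guide, so verify them before relying on this. `previous_utc_day_window` is a hypothetical helper, not part of recsys-pipelines.

```python
from datetime import datetime, timedelta, timezone


def previous_utc_day_window(now: datetime) -> tuple[datetime, datetime]:
    """Return (start, end) for the last complete UTC day before `now`.

    Assumption: the window is half-open, [start, end), and aligned to
    UTC midnight. Check this against the real --start/--end semantics.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    # Normalize to UTC first so local offsets cannot shift the day boundary.
    now_utc = now.astimezone(timezone.utc)
    end = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    start = end - timedelta(days=1)
    return start, end


if __name__ == "__main__":
    start, end = previous_utc_day_window(datetime.now(timezone.utc))
    print(start.isoformat(), end.isoformat())
```

Rejecting naive datetimes is deliberate: a scheduler host in a non-UTC timezone would otherwise silently produce a window shifted by its local offset.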
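The rule that the manifest pointer moves only when validation passes can be sketched as follows. Everything here is hypothetical and for illustration only: the `publish_if_valid` helper, the `validate` callback, and the idea of a small JSON pointer file are assumptions. The real layout is described in Output layout (local filesystem).

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable


def publish_if_valid(
    manifest_path: Path,
    pointer_path: Path,
    validate: Callable[[Path], bool],
) -> bool:
    """Point `pointer_path` at `manifest_path` only if validation passes.

    Hypothetical sketch: the pointer is a small JSON file holding the
    manifest path. It is written to a temp file and swapped in with
    os.replace, so readers never see a half-written pointer.
    """
    if not validate(manifest_path):
        return False  # leave the existing pointer untouched

    pointer_path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file next to the pointer so the rename stays on
    # one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=pointer_path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump({"manifest": str(manifest_path)}, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, pointer_path)  # atomic on the same filesystem
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return True
```

When validation fails, the previous pointer stays in place, so consumers keep reading the last known-good artifacts while you work through the Validation failed runbook.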
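For the freshness step, here is a minimal sketch that flags an artifact as stale when it is older than an allowed window. Using file modification time as the freshness signal is an assumption, and `is_stale` is a hypothetical helper. Take the real thresholds and invariants from SLOs and freshness.

```python
import time
from datetime import timedelta
from pathlib import Path
from typing import Optional


def is_stale(path: Path, max_age: timedelta, now: Optional[float] = None) -> bool:
    """Return True if `path` is missing or older than `max_age`.

    Assumption: file modification time is a good enough proxy for when
    the artifact was last published.
    """
    if not path.exists():
        return True  # a missing artifact is treated as stale
    # `now` is injectable (epoch seconds) so the check can be tested.
    current = time.time() if now is None else now
    age_seconds = current - path.stat().st_mtime
    return age_seconds > max_age.total_seconds()
```

A check like this could run shortly after the scheduled window closes. A stale result would then point you to the Stale artifacts runbook.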
Read next
- Start here: Start here
- Config reference: Config reference
- Exit codes: Exit codes