How-to: Operate pipelines daily

This guide shows how to operate pipelines daily in a reliable, repeatable way.

Who this is for

  • SRE / on-call running recsys-pipelines on a schedule
  • Data engineers responsible for freshness and correctness

Goal

Run pipelines predictably, detect staleness early, and respond to failures using the right runbook.

Daily checklist (practical)

  1. Confirm the expected schedule and windowing
       • If you run nightly/daily: verify --start/--end semantics and UTC windows.
       • If you run incremental: ensure checkpoint_dir is stable across runs.

  2. Run and publish (or confirm the scheduler did)
       • Primary: recsys-pipelines run ... --incremental
         See: How-to: Run incremental pipelines

  3. Validate outputs and “current” pointers
       • Check expected files/paths: Output layout (local filesystem)
       • Confirm manifest pointer updated only when validation passes.

  4. Watch freshness/SLOs
       • Use the invariants and expected freshness windows: SLOs and freshness

  5. When something fails, open the closest runbook first
       • Pipeline failed: Runbook: Pipeline failed
       • Validation failed: Runbook: Validation failed
       • Stale artifacts: Runbook: Stale artifacts
       • Limit exceeded: Runbook: Limit exceeded
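
The UTC windowing check and the freshness check in the list above can be sketched in a few lines of Python. This is a minimal illustration, not part of recsys-pipelines: the helper names daily_utc_window and is_stale are hypothetical, the half-open [start, end) window is an assumption you should verify against the actual --start/--end semantics, and the 24-hour threshold in the usage lines is a placeholder for whatever your SLO page specifies.

```python
from datetime import datetime, timedelta, timezone


def daily_utc_window(now: datetime) -> tuple[datetime, datetime]:
    """Return the most recent fully elapsed UTC day as (start, end).

    Assumption: the window is half-open, [start, end), with both
    bounds at UTC midnight. Check this against your pipeline's
    actual --start/--end semantics before relying on it.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    # Normalize to UTC first so a local-time scheduler cannot shift the day.
    end = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return end - timedelta(days=1), end


def is_stale(last_published: datetime, now: datetime, max_age: timedelta) -> bool:
    """Return True when the last publish is older than the allowed age.

    Both timestamps must be timezone-aware; max_age is a hypothetical
    freshness threshold taken from your own SLO.
    """
    return now - last_published > max_age


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    start, end = daily_utc_window(now)
    print(f"window: {start.isoformat()} .. {end.isoformat()}")
    # Placeholder threshold: treat artifacts older than 24 hours as stale.
    print("stale:", is_stale(end, now, timedelta(hours=24)))
```

Normalizing to UTC before truncating to midnight matters when the scheduler host runs in a local time zone: truncating first and converting afterwards can select the wrong calendar day.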