Cookbook: integrate with an event bus (streaming)¶
This guide shows how to integrate with an event bus (streaming) in a reliable, repeatable way.
Who this is for¶
Platform engineers publishing exposure/outcome events into Kafka/Kinesis/PubSub and building downstream datasets.
What you will get¶
- A minimal architecture for correct attribution with an event bus
- A request_id strategy that survives retries and at-least-once delivery
- Verification checks that catch broken joins early
Goal¶
Publish exposure and outcome events to an event bus (Kafka/Kinesis/PubSub) so you can:
- build evaluation datasets reliably
- run recsys-pipelines on a schedule or stream
- debug and roll back with a clear audit trail
Minimal architecture¶
flowchart LR
A[App / Edge] -->|/v1/recommend| S[recsys-service]
S -->|response + request_id| A
A -->|exposure event| B[(Event bus)]
A -->|outcome event| B
B --> W[(Warehouse / object store)]
W --> E[recsys-eval]
W --> P[recsys-pipelines]
Steps¶
- Choose where request_id is generated
  - Client-generated: set X-Request-Id on the API call, reuse for outcomes.
  - Server-generated: read meta.request_id from the response, reuse for outcomes.
- Publish exposure events only after render
  - Avoid logging prefetches; they destroy attribution quality.
- Publish outcomes with the same request_id
  - A retry that creates a new request_id is the fastest way to break joins.
- Handle duplicates
  - Most buses are at-least-once. Deduplicate downstream by (request_id, item_id, event_type, ts) (or your equivalent).
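As a rough illustration of the last two steps, the sketch below generates a request_id once, reuses it for every outcome, and drops redelivered copies by the (request_id, item_id, event_type, ts) key. The helper names (new_request_id, make_outcome, dedupe) and the flat event shape are assumptions made for this example, not part of any recsys API, and the in-memory set would need a time window or TTL in a real stream processor.

```python
from __future__ import annotations

import uuid
from typing import Iterable, Iterator


def new_request_id() -> str:
    """Hypothetical helper: create the ID once per request, not once per attempt."""
    return uuid.uuid4().hex


def make_outcome(event_type: str, request_id: str, item_id: str, ts: int) -> dict:
    """Assumed flat outcome shape; adapt the field names to your own schema."""
    return {
        "event_type": event_type,
        "request_id": request_id,
        "item_id": item_id,
        "ts": ts,
    }


def dedupe(events: Iterable[dict]) -> Iterator[dict]:
    """Yield each event once, keyed by (request_id, item_id, event_type, ts)."""
    seen: set[tuple] = set()  # unbounded here; bound it by time window in production
    for event in events:
        key = (event["request_id"], event["item_id"], event["event_type"], event["ts"])
        if key in seen:
            continue
        seen.add(key)
        yield event


# Generate the ID before the first attempt and reuse it for every outcome.
request_id = new_request_id()
click = make_outcome("click", request_id, "item-1", 1700000005)
purchase = make_outcome("purchase", request_id, "item-1", 1700000060)

# Simulate at-least-once delivery: the click arrives twice.
delivered = [click, dict(click), purchase]
unique = list(dedupe(delivered))
```

Because the key includes ts, a retry only deduplicates cleanly if it resends the original timestamp rather than stamping a new one at publish time.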
Verify¶
- Validate event schemas at the edge (or immediately in your stream processor).
- Monitor join rates and missing fields:
  - outcomes missing request_id
  - outcomes missing item_id
  - exposures with empty items[]
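The checks above can be sketched in a few lines. This example assumes that an exposure carries a request_id and an items list of item ID strings, and that an outcome carries request_id and item_id; the function names and the join-rate definition (share of outcomes that match an exposed item) are illustrative choices, not a prescribed metric.

```python
from __future__ import annotations

from typing import Iterable, Optional

REQUIRED_OUTCOME_FIELDS = ("request_id", "item_id")


def validate_exposure(event: dict) -> list[str]:
    """Return a list of problems found in an exposure event (empty means valid)."""
    problems = []
    if not event.get("request_id"):
        problems.append("missing request_id")
    if not event.get("items"):
        problems.append("empty items[]")
    return problems


def validate_outcome(event: dict) -> list[str]:
    """Return a list of problems found in an outcome event (empty means valid)."""
    return [f"missing {field}" for field in REQUIRED_OUTCOME_FIELDS if not event.get(field)]


def join_rate(exposures: Iterable[dict], outcomes: Iterable[dict]) -> Optional[float]:
    """Fraction of outcomes whose (request_id, item_id) matches an exposed item."""
    exposed = {
        (e.get("request_id"), item_id)
        for e in exposures
        for item_id in (e.get("items") or [])
    }
    outcome_list = list(outcomes)
    if not outcome_list:
        return None  # nothing to join yet
    matched = sum(
        1 for o in outcome_list if (o.get("request_id"), o.get("item_id")) in exposed
    )
    return matched / len(outcome_list)


exposures = [
    {"request_id": "r1", "items": ["a", "b"]},
    {"request_id": "r2", "items": []},
]
outcomes = [
    {"request_id": "r1", "item_id": "a"},  # joins
    {"request_id": "r1", "item_id": "z"},  # item was never exposed
    {"item_id": "a"},                      # missing request_id
]

exposure_problems = validate_exposure(exposures[1])
outcome_problems = validate_outcome(outcomes[2])
rate = join_rate(exposures, outcomes)
```

Tracking the rate over time, and alerting when it drops, tends to surface a broken request_id path sooner than inspecting individual events.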
Pitfalls¶
- Out-of-order delivery
  - Outcomes may arrive before exposures. Design your joins to handle late data.
- Multi-surface reuse
  - Do not reuse request_id across different surfaces/modules.
- PII in event payloads
  - Keep user identity pseudonymous; do not include raw email/phone.
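One way to handle the out-of-order pitfall is to buffer outcomes that arrive before their exposure and release them once the exposure shows up. The class below is a minimal in-memory sketch of that idea; LateTolerantJoiner and the event shapes are hypothetical, and a real stream job would add a TTL or watermark so buffered state cannot grow without bound.

```python
from __future__ import annotations

from collections import defaultdict


class LateTolerantJoiner:
    """Joins outcomes to exposures by request_id, tolerating either arrival order."""

    def __init__(self) -> None:
        self.exposed: dict[str, set[str]] = {}  # request_id -> exposed item IDs
        self.pending: dict[str, list[dict]] = defaultdict(list)  # early outcomes

    def on_exposure(self, exposure: dict) -> list[dict]:
        """Record the exposure and release any buffered outcomes it attributes."""
        rid = exposure["request_id"]
        items = self.exposed.setdefault(rid, set())
        items.update(exposure["items"])
        waiting = self.pending.pop(rid, [])
        return [o for o in waiting if o["item_id"] in items]

    def on_outcome(self, outcome: dict) -> list[dict]:
        """Attribute the outcome now if possible; otherwise buffer it."""
        rid = outcome["request_id"]
        if rid in self.exposed:
            return [outcome] if outcome["item_id"] in self.exposed[rid] else []
        self.pending[rid].append(outcome)
        return []


joiner = LateTolerantJoiner()

# The click arrives before its exposure, so it is buffered rather than dropped.
early = joiner.on_outcome({"request_id": "r1", "item_id": "a", "event_type": "click"})

# The exposure arrives late and releases the buffered click.
joined = joiner.on_exposure({"request_id": "r1", "items": ["a", "b"]})

# An outcome that arrives after its exposure is attributed immediately.
on_time = joiner.on_outcome({"request_id": "r1", "item_id": "b", "event_type": "click"})
```

In a batch warehouse job the same effect usually comes from joining over a window wider than the expected delivery lag rather than from explicit buffering.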
Read next¶
- Exposure logging & attribution: Exposure logging and attribution
- Event join logic: Event join logic (exposures ↔ outcomes ↔ assignments)
- Troubleshoot integration failures: How-to: troubleshooting for integrators