Error handling & troubleshooting API calls
This page is the canonical reference for handling errors from, and troubleshooting, API calls.
Who this is for
Integrators and on-call engineers who need to diagnose failed requests quickly and safely.
What you will get
- The error format (application/problem+json)
- What the common HTTP status codes mean in this service
- Endpoint-specific “what to do next”
Error format (Problem Details)
Errors use RFC 7807 with content type application/problem+json.
Typical fields:
- type, title, status (always present)
- detail (human-readable explanation; should be safe to show to end users)
- code (machine-readable error category, when available)
- request_id (for correlation)
Example:
{
  "type": "about:blank",
  "title": "validation failed",
  "status": 422,
  "detail": "surface is required",
  "code": "VALIDATION_FAILED",
  "request_id": "a2e38779dfbe/nmiLt982Mq-000004"
}
Operational tip: include request_id in client logs and support tickets. It is the fastest way to find the matching server log line.
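As a minimal client-side sketch of the tip above, the following parses a problem document and builds a log line that always carries request_id. The `Problem` class and the `parse_problem` and `log_line` helpers are hypothetical names introduced for illustration; they are not part of the service or any SDK. The fallback values for missing fields are assumptions as well.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Problem:
    """Subset of the Problem Details fields listed above (hypothetical type)."""
    type: str
    title: str
    status: int
    detail: Optional[str] = None
    code: Optional[str] = None
    request_id: Optional[str] = None


def parse_problem(body: str) -> Problem:
    """Parse an application/problem+json body, ignoring unknown fields."""
    data = json.loads(body)
    return Problem(
        # RFC 7807 treats an absent "type" as "about:blank".
        type=data.get("type", "about:blank"),
        # Assumption: fall back to empty/zero if a field is unexpectedly missing.
        title=data.get("title", ""),
        status=int(data.get("status", 0)),
        detail=data.get("detail"),
        code=data.get("code"),
        request_id=data.get("request_id"),
    )


def log_line(problem: Problem) -> str:
    """Build one log line that carries request_id for correlation."""
    return (
        f"api_error status={problem.status} code={problem.code} "
        f"request_id={problem.request_id} detail={problem.detail!r}"
    )


example = """{
  "type": "about:blank",
  "title": "validation failed",
  "status": 422,
  "detail": "surface is required",
  "code": "VALIDATION_FAILED",
  "request_id": "a2e38779dfbe/nmiLt982Mq-000004"
}"""
problem = parse_problem(example)
print(log_line(problem))
```

Keeping request_id as a distinct key in the log line makes it easy to search for in client logs and to paste into a support ticket.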
Common status codes (what to do)
- 400: invalid JSON or wrong content type
  - Check that you send Content-Type: application/json and valid JSON.
- 401/403: authentication/authorization/tenant scope failure
  - Check auth headers, tenant header (X-Org-Id), and role requirements for admin endpoints.
- 404 (admin endpoints): tenant not found
  - Tenant creation is DB-only today; see Admin API + local bootstrap (recsys-service) and Known limitations and non-goals (current).
- 409 (admin PUT): optimistic concurrency conflict
  - Fetch the latest resource, take its ETag, and retry with If-Match.
- 422: validation failure (semantically invalid request)
  - Call POST /v1/recommend/validate to see the normalized request + warnings.
- 429: rate limited
  - Back off and retry. If this is unexpected, review per-tenant rate limits.
- 503: not ready / overloaded
  - Check GET /readyz and verify dependencies (DB, artifact store if enabled).
- 500: internal error
  - Use request_id to locate server logs; follow the relevant runbook.
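The list above can be turned into a small client-side lookup plus a backoff helper for 429. This is a sketch under assumptions: `next_step`, `should_retry`, and `backoff_delay` are hypothetical helper names, the base delay and cap are arbitrary illustrative values, and capped exponential backoff with full jitter is one common choice rather than something this service prescribes.

```python
import random

# First action per status code, paraphrasing the list above.
NEXT_STEPS = {
    400: "check Content-Type: application/json and JSON validity",
    401: "check auth headers and tenant scope",
    403: "check auth headers, tenant scope, and role requirements",
    404: "confirm the tenant exists",
    409: "refetch the resource and retry with If-Match",
    422: "call POST /v1/recommend/validate",
    429: "back off and retry",
    500: "look up request_id in server logs",
    503: "check GET /readyz and dependencies",
}


def next_step(status: int) -> str:
    """Return the first suggested action for a status code."""
    return NEXT_STEPS.get(status, "unlisted status; consult the API reference")


def should_retry(status: int) -> bool:
    """Only 429 is described above as 'back off and retry'.

    Assumption: other codes need a fix or a readiness check first.
    """
    return status == 429


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Capped exponential backoff with full jitter: a random value in
    [0, min(cap, base * 2**attempt)].
    """
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


for code in (422, 429):
    print(code, next_step(code), "retry" if should_retry(code) else "do not retry")
```

Jitter spreads retries from many clients over time, so a tenant that hits its rate limit does not send a synchronized burst when the window reopens.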
Endpoint notes
POST /v1/recommend (and POST /v1/similar)
Expected error responses (see OpenAPI):
400, 401, 403, 422, 429, 500, 503
What to do first:
- Call POST /v1/recommend/validate with the same payload to surface normalization and warnings.
- Confirm tenant scope (JWT claims or X-Org-Id header).
- If you see empty results, use the “empty recs” runbook: Runbook: Empty recs
POST /v1/recommend/validate
This endpoint is your fastest “is my request shape sane?” tool.
Expected error responses:
400, 401, 403, 422, 429
If you get 422, fix the request payload before calling /v1/recommend.
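A validate-first flow might look like the sketch below. The transport is injected as a callable so the example runs without a server. The `Post` signature, `recommend_with_precheck`, and the `fake_post` stub (including its response bodies and the `"home"` surface value) are all hypothetical; only the two endpoint paths and the 422 example fields come from this page.

```python
from typing import Callable, List, Tuple

# Hypothetical transport: (path, payload) -> (status, parsed JSON body).
Post = Callable[[str, dict], Tuple[int, dict]]


def recommend_with_precheck(post: Post, payload: dict) -> Tuple[int, dict]:
    """Validate first; call /v1/recommend only if validation did not fail."""
    status, body = post("/v1/recommend/validate", payload)
    if status >= 400:
        # Hand back the problem document so the caller can log request_id.
        return status, body
    return post("/v1/recommend", payload)


# Stand-in transport for illustration; it records which paths were called.
calls: List[str] = []


def fake_post(path: str, payload: dict) -> Tuple[int, dict]:
    calls.append(path)
    if "surface" not in payload:
        return 422, {
            "title": "validation failed",
            "status": 422,
            "detail": "surface is required",
            "code": "VALIDATION_FAILED",
        }
    return 200, {}  # Assumed success shape; the real body is not shown here.


status, body = recommend_with_precheck(fake_post, {})
print(status, body["detail"], calls)
```

In production you would likely skip the precheck on the hot path and call the validate endpoint only while diagnosing a failing payload.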
Admin endpoints (/v1/admin/...)
These endpoints are for operators (config, rules, cache invalidation, audit).
Common pitfalls:
- 401/403: missing operator/admin privileges (or dev auth is not enabled)
- 409 on PUT: you updated with a stale version; retry with the latest If-Match
- 404: tenant does not exist (bootstrap is DB-first today)
See: Admin API + local bootstrap (recsys-service)
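The 409 recovery loop (refetch, take the new ETag, retry with If-Match) can be sketched as follows. The `Get` and `Put` transport signatures, `put_with_if_match`, the in-memory `store`, and the stub's 200 success status are hypothetical and exist only so the example runs standalone. The path is left as the elided `/v1/admin/...` from this page.

```python
from typing import Callable, Tuple

# Hypothetical transports, injected so the sketch needs no server.
Get = Callable[[str], Tuple[dict, str]]   # path -> (resource, etag)
Put = Callable[[str, dict, str], int]     # (path, body, if_match) -> status


def put_with_if_match(get: Get, put: Put, path: str,
                      change: Callable[[dict], dict],
                      max_attempts: int = 3) -> int:
    """Apply `change` to the latest resource; on 409, refetch and retry."""
    status = 409
    for _ in range(max_attempts):
        resource, etag = get(path)          # always start from the latest state
        status = put(path, change(resource), etag)
        if status != 409:
            break
    return status


# In-memory stand-in for an admin resource with a version-based ETag.
store = {"version": 1, "doc": {"limit": 10}}
concurrent_write_pending = [True]


def fake_get(path: str) -> Tuple[dict, str]:
    return dict(store["doc"]), f'"{store["version"]}"'


def fake_put(path: str, body: dict, if_match: str) -> int:
    if concurrent_write_pending[0]:
        # Simulate another writer updating the resource just before our PUT.
        concurrent_write_pending[0] = False
        store["version"] += 1
    if if_match != f'"{store["version"]}"':
        return 409
    store["doc"] = body
    store["version"] += 1
    return 200


result = put_with_if_match(fake_get, fake_put, "/v1/admin/...",
                           lambda doc: {**doc, "limit": 20})
print(result, store)
```

Reapplying `change` to the refetched resource, rather than resending the original body, avoids overwriting the other writer's update.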
Health endpoints (GET /healthz, GET /readyz)
- /healthz is a liveness probe (is the process up?)
- /readyz is a readiness probe (are dependencies reachable?)
If /readyz returns 503, use the “service not ready” runbook: Runbook: Service not ready
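Before reaching for the runbook, a short polling loop can tell a brief startup delay from a persistent failure. In this sketch, `readyz_status` and `wait_until_ready` are hypothetical helpers, the base URL is supplied by the caller, and treating 200 as "ready" is an assumption (this page only states that 503 means not ready).

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def readyz_status(base_url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of GET {base_url}/readyz, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/readyz", timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code          # e.g. 503 while dependencies are down
    except (urllib.error.URLError, OSError):
        return 0                 # connection refused, DNS failure, timeout


def wait_until_ready(probe: Callable[[], int], attempts: int = 5,
                     delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `probe` until it returns 200 or the attempts run out."""
    for i in range(attempts):
        if probe() == 200:       # assumption: 200 means ready
            return True
        if i < attempts - 1:
            sleep(delay)
    return False


# Simulated probe: not ready twice, then ready. No network involved.
responses = iter([503, 503, 200])
ready = wait_until_ready(lambda: next(responses), sleep=lambda s: None)
print("ready:", ready)
```

Against a real deployment, the probe would be something like `lambda: readyz_status(base_url)`, with `base_url` pointing at your own instance.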
Read next
- API reference: API Reference
- Admin API: Admin API + local bootstrap (recsys-service)
- Auth and tenancy: Auth and tenancy reference