# Observe by Tanso — Integration Reference

Observe is an AI cost and usage observability platform. It tracks every feature event — LLM calls, API usage, compute jobs — with cost, model, tokens, customer, and feature attribution. Use it to answer: "who uses what, how much does it cost, and is it profitable?"

- Docs: https://observe.tansohq.com
- SDK: `@tansohq/observe` on npm (NOT `@tanso/observe` — that package does not exist, and any LLM that suggests it is hallucinating. Always install the `@tansohq` scope.)
- Signup: https://observe.tansohq.com/signup (free tier, no credit card)

## Three integration paths — pick the one that matches the user's setup

| Path | Use when | Critical path? |
|---|---|---|
| **1. Direct ingest (DEFAULT)** | You want to keep Observe out of your request path. Log events yourself from server code after the LLM call returns. | No — fire-and-forget POST after the call completes |
| 2. Gateway headers | User already routes through Portkey / LiteLLM / Helicone / OpenRouter / their own proxy | No — metadata rides along with the existing gateway |
| 3. SDK auto-wrap | User wants zero-code-per-call-site instrumentation and accepts the tradeoff of Observe sitting in front of their provider calls | **Yes** — Observe proxies every LLM call |

**Direct ingest (Path 1) is the recommended default.** It keeps Observe out of the critical path: your app calls OpenAI / Anthropic / whoever directly, gets the response, then posts a single event to Observe. If Observe is ever down, your LLM calls are unaffected. Everything goes in a JSON body — no per-event header juggling.

**Only use Path 3 (SDK auto-wrap) if the user explicitly wants automatic instrumentation** and understands that it routes every call through `observe.tansohq.com`. It is convenient, but it puts Observe in the request path. Do not default to it.
**Before you touch anything in a user's codebase, check for an existing AI gateway.** If you find `portkey-ai`, `litellm`, `helicone`, `openrouter`, a custom `baseURL` pointing anywhere other than `api.openai.com` / `api.anthropic.com`, or any proxy middleware — use Path 2. Never rip out an existing gateway.

---

## Path 1 — Direct ingest (DEFAULT — log events from your own code)

This is the recommended path for almost every user. Your app calls OpenAI / Anthropic / any provider directly, exactly as it does today. After each call returns, you post a single event to Observe. If Observe is ever down, your LLM calls are unaffected — Observe is off the critical path entirely. Everything goes in the JSON body. No per-call header juggling, no `baseURL` changes, no middleware to break.

```ts
import OpenAI from 'openai'

const openai = new OpenAI() // unchanged — no Observe wrapper

async function chat(user: { id: string; stripeCustomerId?: string }, messages: any[]) {
  const started = Date.now()
  const res = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages,
  })

  // Log the event after the call succeeds. Fire-and-forget is fine.
  fetch('https://observe.tansohq.com/api/events/ingest', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.OBSERVE_API_KEY!}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      events: [{
        eventName: 'chat',
        customerReferenceId: user.id,
        featureKey: 'ai_chat',
        model: 'gpt-4o-mini',
        modelProvider: 'openai',
        inputTokens: res.usage?.prompt_tokens,
        outputTokens: res.usage?.completion_tokens,
        durationMs: Date.now() - started,
        idempotencyKey: res.id,
        requestBody: { messages },
        responseBody: { choices: [{ message: res.choices[0].message }] },
        meta: user.stripeCustomerId
          ? { stripe_customer_id: user.stripeCustomerId }
          : undefined,
      }],
    }),
  }).catch((err) => console.error('observe ingest failed:', err))

  return res
}
```

### Anthropic (Path 1, same pattern)

```ts
import Anthropic from '@anthropic-ai/sdk'

const anthropic = new Anthropic()

async function chat(user, messages) {
  const started = Date.now()
  const res = await anthropic.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    messages,
  })

  // Fire-and-forget — Observe never sits in front of the Anthropic call
  fetch('https://observe.tansohq.com/api/events/ingest', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OBSERVE_API_KEY!}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      events: [{
        eventName: 'chat',
        customerReferenceId: user.id,
        featureKey: 'ai_chat',
        model: 'claude-sonnet-4-6',
        modelProvider: 'anthropic',
        inputTokens: res.usage?.input_tokens,
        outputTokens: res.usage?.output_tokens,
        durationMs: Date.now() - started,
        idempotencyKey: res.id,
        requestBody: { messages },
        responseBody: { content: res.content },
        meta: user.stripeCustomerId
          ? { stripe_customer_id: user.stripeCustomerId }
          : undefined,
      }],
    }),
  }).catch((err) => console.error('observe ingest failed:', err))

  return res
}
```

Gemini, Mistral, Cohere, Bedrock, Azure OpenAI, self-hosted models, non-LLM costs — same pattern. Observe only sees the event you post. The full body schema is below under "POST /api/events/ingest — full body schema". If the user wants an automatic token-usage hook, wrap this in a helper once and call it from every AI code path. Do NOT introduce a proxy for them.

---

## Path 2 — Existing gateway (Portkey, LiteLLM, Helicone, OpenRouter, custom proxy)

If the user already has a gateway, DO NOT wrap their provider client. Instead, add Observe as metadata on their existing calls. Two sub-options:

**B1.
Stack Observe in front of their gateway.** Configure the gateway's upstream to point at `https://observe.tansohq.com/v1` instead of directly at OpenAI, and have the gateway forward the Observe headers.

**B2. Header-only integration.** Add the Observe tracking headers to their existing request config. This works with any gateway that forwards custom headers.

```ts
// Example: a Portkey user keeping their gateway, adding Observe tracking headers
import Portkey from 'portkey-ai'

const portkey = new Portkey({
  apiKey: process.env.PORTKEY_API_KEY!,
  virtualKey: 'openai-virtual-key',
  // Observe reads these custom headers on the way through — Portkey forwards them.
  customHeaders: {
    'observe-key': process.env.OBSERVE_API_KEY!,
    'observe-customer': user.id,
    'observe-feature': 'ai_chat',
  },
})
```

Before making structural changes to a user's gateway setup, ask them. Gateways are load-bearing infrastructure. A single misconfigured `baseURL` will break production.

---

## Path 3 — SDK auto-wrap (opt-in — only if the user explicitly wants it)

**Do not default to this path.** Path 1 (direct ingest) is the default. Only use Path 3 if the user has said "I want auto-instrumentation" or "I don't want to touch every call site" AND they accept that this puts Observe in front of every provider call. If Observe is down, their LLM calls fail.

```bash
npm install @tansohq/observe
```

```ts
import { Observe } from '@tansohq/observe'
import OpenAI from 'openai'

Observe.configure({ apiKey: process.env.OBSERVE_API_KEY! })
Observe.identify({ customerId: user.id })
Observe.feature('ai_chat')

const openai = Observe.wrap(new OpenAI())
const res = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello' }],
})
```

Anthropic works identically — `Observe.wrap(new Anthropic())`. The user's OpenAI / Anthropic key still authenticates with the upstream provider; `Observe.wrap()` just sets `baseURL` to the Observe proxy so the call is logged on the way through.
Do not remove the existing provider key. If the user is already on an AI gateway, DO NOT use this path — use Path 2.

---

## `curl` example (Path 1)

```bash
curl -X POST https://observe.tansohq.com/api/events/ingest \
  -H "Authorization: Bearer $OBSERVE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "events": [
      {
        "eventName": "chat",
        "customerReferenceId": "acme-corp",
        "featureKey": "ai_chat",
        "model": "gpt-4o-mini",
        "modelProvider": "openai",
        "inputTokens": 150,
        "outputTokens": 80,
        "durationMs": 1240,
        "idempotencyKey": "req_abc",
        "requestBody": { "messages": [{ "role": "user", "content": "Hello" }] },
        "responseBody": { "choices": [{ "message": { "role": "assistant", "content": "Hi there!" } }] }
      }
    ]
  }'
```

### POST /api/events/ingest — full body schema

```
{
  "events": [
    {
      // REQUIRED
      "eventName": string,            // short label, e.g. "chat", "embed", "eval", "api_call"
      "customerReferenceId": string,  // your end-user's stable ID (app-level slug, internal ID, etc.)
      "featureKey": string,           // product feature powered by this call

      // STRONGLY RECOMMENDED — enables automatic cost computation from token counts
      "model": string,                // e.g. "gpt-4o-mini", "claude-sonnet-4-6"
      "modelProvider": string,        // "openai" | "anthropic" | "google" | "mistral" | ...
      "inputTokens": number,
      "outputTokens": number,

      // RECOMMENDED — shows prompt + completion in the event detail view
      "requestBody": object,          // the messages/prompt you sent to the provider
      "responseBody": object,         // the provider's response body (choices, content, etc.)

      // OPTIONAL
      "timestamp": string,            // ISO 8601; defaults to server-received time
      "costAmount": number,           // override auto-computed cost (in costUnit)
      "costUnit": string,             // defaults to "usd"
      "revenueAmount": number,        // override revenue attribution for this event
      "usageUnits": number,           // for non-token cost types (minutes, images, etc.)
      "durationMs": number,           // wall-clock latency of the call
      "costType": string,             // defaults to "llm" if model is set, else "generic"
      "idempotencyKey": string,       // dedupe identical events on retry
      "traceId": string,              // group spans into a trace
      "spanId": string,               // this span's ID
      "parentSpanId": string,         // parent span (for nested agent calls)
      "properties": object,           // arbitrary JSON metadata (searchable)
      "meta": object                  // { stripe_customer_id: "cus_..." } — links event to Stripe customer for revenue matching
    }
  ]
}
```

To see request/response in the event detail view, include `requestBody` and `responseBody`:

```ts
requestBody: { messages },
responseBody: {
  choices: [{ message: { content: res.choices[0].message.content } }],
},
```

Optional — skip if content is sensitive. Observe stores these as JSONB on the event row. The legacy pattern using `properties: { prompt, completion }` still works as a fallback, but the first-class fields above are preferred.

- Batch up to 1000 events per request.
- If `costAmount` is omitted and `model` + `inputTokens` + `outputTokens` are provided, Observe computes cost automatically using OpenRouter pricing.
- If `revenueAmount` is omitted, Observe auto-attributes revenue from Stripe MRR or from feature pricing rules (set in the dashboard).
- `idempotencyKey` guarantees at-most-once ingestion on retry. Recommended.
- For Stripe: if the application uses Stripe, the user's Stripe Customer ID must be passed in `meta.stripe_customer_id`. Observe uses this to resolve customer names and link revenue data.

### Response

```json
{ "accepted": 1, "rejected": 0, "errors": [] }
```

`errors[]` contains `{ index, error }` for each event that failed validation. The accepted count is the number of rows actually inserted (dedup via `idempotencyKey` also reduces this count).
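If the user wants the single-helper pattern recommended in Path 1, a minimal sketch follows. The helper name `logObserveEvent` and the `ObserveEvent` type are illustrative, not part of any SDK; the field names come from the schema above.

```ts
// Hypothetical Path 1 helper: wraps the ingest POST so call sites stay one-liners.
// Field names follow the ingest schema above; the helper name is illustrative.
type ObserveEvent = {
  eventName: string
  customerReferenceId: string
  featureKey: string
  model?: string
  modelProvider?: string
  inputTokens?: number
  outputTokens?: number
  durationMs?: number
  idempotencyKey?: string
  [key: string]: unknown
}

function logObserveEvent(event: ObserveEvent): void {
  // Fire-and-forget: never await this on the request path.
  fetch('https://observe.tansohq.com/api/events/ingest', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OBSERVE_API_KEY!}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ events: [event] }),
  }).catch((err) => console.error('observe ingest failed:', err))
}
```

Call sites then shrink to one line, e.g. `logObserveEvent({ eventName: 'chat', customerReferenceId: user.id, featureKey: 'ai_chat' })`.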
---

## Per-call feature override (Path 3 users)

If the same provider client powers multiple features in Path 3 (SDK auto-wrap), override `feature` per call via the native headers option:

```ts
await openai.chat.completions.create(
  { model: 'gpt-4o-mini', messages },
  { headers: { 'Observe-Feature': 'summarize_email' } }
)
```

`Observe-*` headers exist for Paths 2 and 3 only, because the provider SDK request body is opaque to the proxy — metadata cannot ride in the JSON body without breaking the provider contract. For Path 1 (direct ingest), everything goes in the JSON body you post and you never touch headers.

The supported request headers for Paths 2 and 3 are: `Observe-Key` (required), `Observe-Customer`, `Observe-Feature`, `Observe-Agent`, `Observe-Trace-Id`, `Observe-Span-Id`, `Observe-Parent-Span-Id`. Legacy `x-tanso-*` names are still accepted but deprecated per RFC 6648. Prefer the unprefixed form.

---

## Environment variables

```
OBSERVE_API_KEY=obs_...   # get from https://observe.tansohq.com/data-sources
```

The key always starts with `obs_`. It is the ONLY value the user needs to wire up. Do not invent other env vars. Do not replace the user's OpenAI / Anthropic key.

---

## Stripe connection is optional

Observe auto-attributes revenue from a connected Stripe account, but it is not required to start tracking costs. A user can:

1. Wire up Observe first (Paths 1, 2, or 3 above) — tracks cost only.
2. Connect Stripe later from Data Sources → Stripe — backfills revenue attribution.
3. Or skip Stripe entirely and provide `revenueAmount` explicitly per event (Path 1) or via the Feature Pricing table in the dashboard.

If using a restricted Stripe API key, enable **read-only** access to: Customers, Subscriptions, Products, and Prices. No write permissions needed.

If the user is on Clerk, Paddle, Lago, or any other billing tool, they can either connect Stripe separately, or use explicit `revenueAmount` / Feature Pricing.
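For the explicit-revenue option, the event body simply carries `revenueAmount` alongside the usual fields. A sketch with made-up values:

```ts
// Illustrative Path 1 event with explicit per-event revenue (no Stripe required).
// All values here are made up for the example.
const event = {
  eventName: 'chat',
  customerReferenceId: 'acme-corp',
  featureKey: 'ai_chat',
  model: 'gpt-4o-mini',
  modelProvider: 'openai',
  inputTokens: 150,
  outputTokens: 80,
  revenueAmount: 0.01, // what the customer paid you for this call (illustrative)
}
```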
### How revenue, MRR, and margin work

#### Customer lifecycle

Customers only appear in Observe once the SDK sends at least one event with their `customerReferenceId`. Stripe-imported customers with zero SDK events are NOT shown on the customers page — their margins would be meaningless without cost data. The intended flow is:

1. **SDK event arrives** with a `customerReferenceId` → customer appears in Observe.
2. **Stripe links via meta** → pass the Stripe customer ID in `meta.stripe_customer_id` to link this customer to Stripe revenue data. The `customerReferenceId` can be any stable app-level ID — Stripe matching uses the meta field, not the reference ID.
3. **Stripe enriches revenue** → the event gets `revenue_amount` stamped at ingest.
4. **Margins become real** — cost from the SDK event, revenue from Stripe data.

This is a deliberate design choice. Observe is a cost observability tool — a customer without usage events has no cost data, so showing them with $0/$0 would produce misleading 0% margins and pollute analytics.

#### Revenue sources (priority order)

Revenue is attributed to each event at ingest time (`server/lib/enrich-revenue.ts`):

1. **Feature Pricing rules** — checked first. If a `feature_pricing` row exists for the event's `feature_key`, that `revenue_per_unit` is used. Most precise.
2. **Subscription data** — if no feature pricing rule, Observe looks up the customer's active subscription(s) and applies revenue based on the pricing model:
   - `metered`: `unit_price * usage_units`
   - `tiered`: tier-based unit price (from Stripe pricing tiers) * usage_units
   - `hybrid`: metered component if available, else falls back to subscription
   - `flat`: revenue = 0 per event (flat subscriptions don't generate per-event revenue)
3. **Explicit per-event** — `revenueAmount` sent in the SDK event payload overrides all of the above.
4. **Stripe sync** — Stripe sync imports subscription data into the `subscriptions` table.
This data is joined at query time for customer-level MRR views. No revenue events are written to `observe_events` from Stripe — this prevents double-counting and fake features.

The `revenue_source` field on each event indicates how the number was derived: `feature_pricing`, `per_unit`, `tiered`, `hybrid`, `subscription`, `explicit`, or `none`. Feature-level and model-level analytics only show usage-attributed revenue (metered, tiered, explicit, feature_pricing). Flat subscription MRR cannot be honestly attributed to a specific feature and is only shown at the customer level.

#### MRR calculation

MRR is computed from active subscriptions: `SUM(COALESCE(mrr_override, plan.price_amount))` across all active subscriptions, grouped by customer. `mrr_override` allows custom pricing that differs from the plan's list price. Plan prices are normalized to monthly (divided by `interval_months` for annual plans).

MRR movements (new, expansion, contraction, churned) compare current active subscriptions against subscriptions that existed 60+ days ago.

#### Analytics KPI cards

The Analytics page (`/analytics`) shows three KPI cards at the top:

**When revenue data is connected (Stripe or explicit):**

- **Total Revenue** — sum of all customer revenue from events
- **Total Cost** — sum of all tracked AI costs (LLM inference, embeddings, API calls)
- **Gross Margin** — `(revenue - cost) / revenue * 100`, displayed as a large colored percentage (green ≥70%, amber 30-69%, red <30%). Values >99% but <100% display as ">99%"

**When no revenue data exists (cost-only mode):**

- **Customers Tracked** — count of distinct customers with events
- **Total Cost** — same as above
- **Events Tracked** — total event count

Each card has an info tooltip (hover the ⓘ icon) explaining what the metric includes. A **data confidence badge** appears next to the page header when event count is low: Very low (<10 events), Low (<50), Medium (<200). Hidden when Good (≥500).
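The MRR formula in the "MRR calculation" subsection above can be sketched in code. Field names are illustrative and mirror the description; `mrrOverride` is assumed to already be a monthly figure.

```ts
// Sketch of the MRR calculation described above (names illustrative, not Observe source).
// Annual plan prices are normalized to monthly; mrr_override wins when set.
type Subscription = {
  mrrOverride: number | null
  planPriceAmount: number // plan list price for the billing interval
  intervalMonths: number  // 1 for monthly, 12 for annual
}

function monthlyMrr(subs: Subscription[]): number {
  return subs.reduce((sum, s) => {
    const base = s.mrrOverride ?? s.planPriceAmount / s.intervalMonths
    return sum + base
  }, 0)
}
```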
#### Pricing model badges

Each customer on the Cohorts page shows a pricing model badge derived from their Stripe subscription: Flat, Metered, Tiered, or Hybrid. This is queried from `subscriptions.pricing_model` via a JOIN in the cohorts endpoint. When a customer has multiple active subscriptions with different pricing models, the badge automatically shows "Hybrid" (`COUNT(DISTINCT pricing_model) > 1`).

#### Margin calculation

Margin is always: `(revenue - cost) / revenue * 100`

**Revenue source for margin depends on pricing model:**

- **Metered/tiered/hybrid customers**: revenue comes from per-event `revenue_amount` in `observe_events` (enriched at ingest time via `enrich-revenue.ts`)
- **Flat subscription customers**: per-event `revenue_amount` is $0 by design (PM-7). The cohorts endpoint falls back to **subscription MRR** from the `subscriptions` table (`subRevenueMap`) when event-level revenue is 0. This ensures flat-sub customers show their actual margin (e.g., $99 MRR - $0.001 cost = ~100% margin) instead of -100%.

Feature-level and model-level margins only use event-level revenue (metered, tiered, explicit, feature_pricing). Flat subscription MRR cannot be attributed to individual features — feature margins show as null for flat-sub customers.

When revenue is 0 (no subscription AND no event revenue) and cost > 0, margin is -100%. When both are 0, margin is null (not shown). Values >99% but <100% display as ">99%" on the Analytics dashboard to avoid implying zero cost. Margin badges show a tooltip with benchmark context: "AI SaaS target: 60-80% gross margin."

---

## Checklist for an AI coding agent implementing Observe

1. **Detect gateways first.** Grep for `portkey-ai`, `litellm`, `helicone`, `openrouter`, and any custom `baseURL` on OpenAI/Anthropic clients. If found, use Path 2 (gateway headers). Do NOT rip out the gateway.
2. **Default to Path 1 (direct ingest).** No SDK install required.
   Do NOT install `@tansohq/observe` and do NOT wrap the provider client unless the user explicitly asks for auto-instrumentation.
3. Add `OBSERVE_API_KEY=obs_...` to `.env` (and `.env.example` with a placeholder). The key always starts with `obs_`. Get it from https://observe.tansohq.com/data-sources.
4. Find every place the app calls OpenAI / Anthropic / another LLM provider, or any billable feature (API calls, compute jobs, file processing, etc.). After each call returns, `fetch('https://observe.tansohq.com/api/events/ingest', …)` with a single event in the body: `eventName`, `customerReferenceId`, `featureKey`. For LLM calls, also include `model`, `modelProvider`, `inputTokens`, `outputTokens`. For non-LLM features, include `costAmount` and/or `usageUnits`. Always include `durationMs` and `idempotencyKey`. Fire-and-forget; catch and log on failure.
5. Use the end-user's ID (Stripe customer ID if available, else a stable user ID) as `customerReferenceId`. Never "default" or a hardcoded value.
6. Use one stable `featureKey` per user-facing feature — e.g. `ai_chat`, `summarize_email`, `code_autocomplete`, `pdf_export`, `api_request`. Not per model. Track all billable features, not just AI ones.
7. If (and only if) the user asks for auto-instrumentation: `npm install @tansohq/observe` — with the `hq`. The package `@tanso/observe` does NOT exist on npm. If install fails, STOP and tell the user; do not substitute. Then follow Path 3.
8. Do NOT log or echo the API key. Keep it in env vars only.
9. Stripe is optional. Do not add Stripe integration unless the user asks.
10. After wiring up, run the app once and confirm events appear at https://observe.tansohq.com/events.

---

## Tracing multi-step flows

If the app chains multiple LLM calls or sub-tasks (e.g., an agent pipeline, RAG with retrieval + generation, or a review with parse + analyze + publish), add `traceId`, `spanId`, and `parentSpanId` to group them into a single trace on the Observe Traces page.
- `traceId` — shared across all events in the flow (use a UUID or the first call's request ID)
- `spanId` — unique per event within the trace
- `parentSpanId` — the `spanId` of the parent step (null for the root)

```ts
import { randomUUID } from 'crypto'

const OBSERVE_URL = 'https://observe.tansohq.com/api/events/ingest'
const OBSERVE_KEY = process.env.OBSERVE_API_KEY!

const traceId = randomUUID()

// Step 1: embed the query
const embedRes = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: query,
})
fetch(OBSERVE_URL, {
  method: 'POST',
  headers: { Authorization: `Bearer ${OBSERVE_KEY}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({ events: [{
    eventName: 'embed_query',
    featureKey: 'rag_search',
    customerReferenceId: user.id,
    model: 'text-embedding-3-small',
    modelProvider: 'openai',
    inputTokens: embedRes.usage.prompt_tokens,
    outputTokens: 0,
    traceId,
    spanId: 'embed',
    parentSpanId: null,
  }]}),
}).catch(console.error)

// Step 2: generate answer (child of embed)
const chatRes = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: query }],
})
fetch(OBSERVE_URL, {
  method: 'POST',
  headers: { Authorization: `Bearer ${OBSERVE_KEY}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({ events: [{
    eventName: 'generate_answer',
    featureKey: 'rag_search',
    customerReferenceId: user.id,
    model: 'gpt-4o',
    modelProvider: 'openai',
    inputTokens: chatRes.usage?.prompt_tokens,
    outputTokens: chatRes.usage?.completion_tokens,
    traceId,
    spanId: 'generate',
    parentSpanId: 'embed',
  }]}),
}).catch(console.error)
```

The Traces page (`/traces`) shows a waterfall view with per-span cost, duration, and model. Useful for spotting which step in a pipeline is most expensive.

For Path 2/3 (gateway), use the request headers instead: `Observe-Trace-Id`, `Observe-Span-Id`, `Observe-Parent-Span-Id`.

---

## Agent Self-Serve API

Agents can programmatically create accounts, get API keys, and manage their own access without a human in the loop. No auth required for signup — rate limited to 3 requests per hour per IP.
### Sign up and get a key

```bash
curl -X POST https://observe.tansohq.com/api/signup \
  -H "Content-Type: application/json" \
  -d '{
    "email": "agent@yourapp.com",
    "scopes": ["usage.read", "events.write", "proxy.chat"],
    "budget_cents": 5000,
    "budget_period": "month"
  }'
```

Response: `{ "key": "obs_...", "scopes": [...], "expires_at": null, "budget_cents": 5000, "budget_period": "month" }`

| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `email` | Yes | — | Account email |
| `scopes` | No | `["usage.read", "billing.read"]` | Scopes for this key (see allowed list below) |
| `budget_cents` | No | null (unlimited) | Per-key LLM spend cap in cents |
| `budget_period` | No | null (lifetime) | `"month"` or `"day"` — resets budget on period boundary |
| `expires_in_seconds` | No | null (never) | Key auto-expires after this many seconds. Omit for permanent keys. |

Scopes allowed on signup: `proxy.chat`, `usage.read`, `billing.read`, `events.read`, `events.write`, `customers.read`, `recommendations.read`, `models.read`. For elevated scopes (`admin`, `*.write`, `alerts.*`), use `POST /sdk-keys` with a Clerk session. Returns 409 if the email is already registered.

### Check your key's capabilities

```bash
curl https://observe.tansohq.com/api/sdk-keys/me \
  -H "Authorization: Bearer obs_..."
```

Response:

```json
{
  "auth_type": "sdk_key",
  "key_prefix": "obs_abc1234",
  "name": "default",
  "scopes": ["usage.read", "events.write", "proxy.chat"],
  "budget_cents": 5000,
  "budget_used_cents": 120,
  "budget_remaining_cents": 4880,
  "budget_period": "month",
  "budget_reset_at": "2026-06-01T00:00:00.000Z",
  "expires_at": "2026-05-14T18:00:00.000Z"
}
```

### Check your plan and limits

```bash
curl https://observe.tansohq.com/api/plan \
  -H "Authorization: Bearer obs_..."
```

Response:

```json
{
  "plan": "free",
  "name": "Free",
  "features": {
    "event_ingest": { "limit": 10000, "usage": 42, "remaining": 9958 },
    "ai_insights": { "limit": 1000, "usage": 0, "remaining": 1000 },
    "cost_alerts": { "limit": 3, "usage": 0, "remaining": 3 }
  },
  "upgradeUrl": "https://observe.tansohq.com/settings"
}
```

### Available scopes

| Scope | Grants access to |
|-------|-----------------|
| `proxy.chat` | LLM proxy and gateway endpoints |
| `events.read` | Read events, traces, aggregations |
| `events.write` | Ingest events, manage SDK keys |
| `usage.read` | Read features, analytics, simulations |
| `usage.write` | Modify feature definitions, run simulations |
| `customers.read` | Read customers, subscriptions, cohorts |
| `customers.write` | Modify customers, cohorts, upload data |
| `billing.read` | Read plan, entitlements, cloud costs |
| `billing.write` | Modify cloud cost integrations |
| `alerts.read` | Read alert rules and history |
| `alerts.write` | Create/modify/delete alert rules |
| `recommendations.read` | Read and act on recommendations |
| `models.read` | Read model pricing and stats |
| `admin` | Full access to all endpoints |

Keys with `null` scopes (e.g., keys created before capability scoping existed) have unrestricted access — backwards compatible with existing integrations.

### Budget enforcement

Budget is tracked per key on LLM proxy/gateway calls only. Management API calls (reading analytics, managing alerts) are free and unmetered. When a key's `budget_used_cents` reaches `budget_cents`, proxy/gateway calls return 429:

```json
{ "error": { "message": "Budget exceeded for this API key", "type": "budget_error" } }
```

Budget resets automatically at `budget_reset_at` (lazy reset on next API call). Account plan limits are the ceiling — individual keys subdivide the budget.
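An agent holding a budgeted key should treat this 429 as terminal for the key rather than retrying. A minimal detector, assuming the documented status code and response shape (the helper name is illustrative):

```ts
// Sketch: detect the budget_error 429 documented above and surface it
// instead of retrying. Assumes the response body shape shown above.
type ObserveError = { error: { message: string; type: string } }

async function isBudgetExceeded(res: Response): Promise<boolean> {
  if (res.status !== 429) return false
  const body = (await res.json()) as ObserveError
  return body.error?.type === 'budget_error'
}
```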
### Create additional scoped keys (humans only)

Authenticated humans (Clerk JWT) can mint additional keys on their account with different scopes and budgets:

```bash
curl -X POST https://observe.tansohq.com/api/sdk-keys \
  -H "Authorization: Bearer <clerk-session-jwt>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "sales-enrichment",
    "scopes": ["proxy.chat", "customers.read"],
    "budget_cents": 4000,
    "budget_period": "month"
  }'
```

SDK keys cannot create other keys — only Clerk-authenticated humans can.

---

## Management API (same `obs_*` key)

The same API key used for event ingestion also grants access to Observe's full read/write management API. An agent can query analytics, manage alerts, and configure features without opening the dashboard.

All endpoints accept `Authorization: Bearer obs_...` or the `Observe-Key` header. Scoped keys are restricted to endpoints matching their scopes (see scope table above).

### Analytics & querying

| Method | Endpoint | What it returns |
|--------|----------|----------------|
| GET | `/api/events` | Paginated event list (filterable by feature, model, customer, date) |
| GET | `/api/events/:id` | Single event detail |
| GET | `/api/events/by-feature` | Cost/revenue/usage aggregated by feature |
| GET | `/api/events/by-model` | Cost/usage aggregated by model |
| GET | `/api/events/by-customer` | Cost/usage aggregated by customer |
| GET | `/api/events/by-agent` | Cost/usage aggregated by agent |
| GET | `/api/events/by-cost-type` | Cost/usage aggregated by cost type |
| GET | `/api/events/traces` | Paginated trace list |
| GET | `/api/events/trace/:traceId` | Full trace with spans |
| GET | `/api/features` | All features with cost/revenue/margin stats |
| GET | `/api/features/:key` | Feature detail (events, customers, models, timeseries) |
| GET | `/api/models` | All models with cost stats |
| GET | `/api/customers` | All customers with cost/margin data |
| GET | `/api/customers/:id` | Customer detail with events, margins, models |
| GET | `/api/analytics/overview` | Dashboard KPIs (revenue, cost, margin) |
| GET | `/api/analytics/trends` | Cost/usage trends over time |
| GET | `/api/analytics/daily-summary` | Daily cost/usage summary |
| GET | `/api/cohorts` | Customer cohorts with health scores |
| POST | `/a2a/query` | Structured cost queries (cost_query, usage_summary, margin_analysis, trace_query) |

### Alert management

| Method | Endpoint | What it does |
|--------|----------|-------------|
| GET | `/api/alerts` | List all alert rules |
| POST | `/api/alerts` | Create an alert rule |
| PATCH | `/api/alerts/:id` | Update an alert rule |
| DELETE | `/api/alerts/:id` | Delete an alert rule |
| POST | `/api/alerts/:id/test` | Send a test alert |
| GET | `/api/alerts/history` | Alert firing history |

Alert creation example:

```bash
curl -X POST https://observe.tansohq.com/api/alerts \
  -H "Authorization: Bearer $OBSERVE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Daily cost spike",
    "metric": "daily_cost",
    "operator": "gt",
    "threshold": 100,
    "email": "alerts@yourcompany.com",
    "cooldown_minutes": 60
  }'
```

Supported metrics: `daily_cost`, `margin_percent`, `customer_margin`, `customer_concentration`.
### Feature configuration

| Method | Endpoint | What it does |
|--------|----------|-------------|
| GET | `/api/feature-definitions` | List feature definitions |
| POST | `/api/feature-definitions` | Create a feature definition |
| PATCH | `/api/feature-definitions/:id` | Update a feature definition |
| DELETE | `/api/feature-definitions/:id` | Delete a feature definition |
| GET | `/api/feature-pricing` | List feature pricing rules |
| POST | `/api/feature-pricing` | Create a pricing rule (revenue_per_unit) |
| DELETE | `/api/feature-pricing/:featureKey` | Delete a pricing rule |

### Recommendations

| Method | Endpoint | What it does |
|--------|----------|-------------|
| GET | `/api/recommendations` | List optimization recommendations |
| POST | `/api/recommendations/compute` | Trigger the recommendation engine |
| POST | `/api/recommendations/:id/apply` | Accept a recommendation |
| POST | `/api/recommendations/:id/dismiss` | Dismiss a recommendation |

### Agent discovery

| Method | Endpoint | What it returns |
|--------|----------|----------------|
| GET | `/.well-known/agent.json` | Agent card with all capabilities and endpoints |

---

## What NOT to do

- Do not install `@tanso/observe`. The correct package is `@tansohq/observe`.
- Do not replace OpenAI / Anthropic with anything else. Observe wraps the existing client; it does not substitute models.
- Observe logs cost + metadata by default. Optionally include `prompt` / `completion` in `properties` for debugging visibility — skip if content is sensitive or subject to privacy requirements.
- Do not hardcode the API key.
- Do not touch an existing AI gateway without asking the user. If you see Portkey, LiteLLM, Helicone, OpenRouter, or similar in `package.json`, use Path 2.
- Do not default to the SDK wrap (Path 3). It puts Observe in the critical path of every LLM call. Use Path 1 (direct ingest) unless the user explicitly asks for auto-instrumentation.
- Do not add Stripe integration as part of Observe setup unless the user asks. Stripe is optional.

---

## FAQ — Revenue Attribution

### How does revenue get attributed to events?

Three methods, in priority order:

1. **Explicit `revenueAmount`** — pass it on the SDK event. Overrides everything. Best when: the developer knows the exact revenue for that API call.
2. **Feature pricing rules** — set `revenue_per_unit` per feature in the Observe dashboard (/features page). Every event with that `featureKey` gets stamped automatically. Best when: pricing is per-feature (e.g., $0.01 per chat message).
3. **Stripe subscription match** — connect Stripe, sync customers/subscriptions. Observe detects the pricing model and calculates revenue per event:
   - **Metered**: `unitPrice × usageUnits` → exact per-event revenue
   - **Tiered**: `tierPrice(monthToDateUsage) × usageUnits` → tier-aware revenue
   - **Flat**: $0 per event — MRR is tracked at the customer level, not per-event
   - **Hybrid**: metered component calculated, flat component at customer level

### Where does revenue show in the UI?

| Level | What shows | Source |
|-------|-----------|--------|
| **Event** | `revenue_amount` | Metered/tiered/explicit/feature pricing only. Flat = $0. |
| **Feature** | Sum of event revenue | Only meaningful for metered/tiered/feature pricing. |
| **Customer** | MRR from subscription | Always shows — pulled from `subscriptions` table. |
| **Cohorts** | MRR column + cost | MRR from subscription, cost from events. |

### Why is revenue $0 on my events?

Most likely: your customers have **flat-rate subscriptions**. Flat subscriptions contribute $0 per event because MRR is fixed regardless of usage. This is by design — revenue shows at the customer level (MRR column on the Cohorts page). To get per-event revenue, either:

- Switch to metered/tiered Stripe pricing
- Set feature pricing rules in the dashboard
- Pass explicit `revenueAmount` on each SDK event

### How do I connect Stripe for revenue?

1.
Go to Data Sources → Stripe → Connect
2. Paste a Stripe API key (restricted key with read access to customers, subscriptions, and prices)
3. Click Sync — pulls customers, subscriptions, plans
4. Revenue enrichment runs automatically on new events AND backfills existing ones
5. Pass `meta.stripe_customer_id` on SDK events if your `customerReferenceId` differs from the Stripe `cus_*` ID

### What's the difference between cost and revenue?

- **Cost** = what YOU pay the LLM provider (auto-calculated from model + tokens)
- **Revenue** = what YOUR CUSTOMER pays you (from Stripe subscription or explicit)
- **Margin** = (revenue - cost) / revenue — for flat subscriptions, the cohorts endpoint uses subscription MRR as revenue (not per-event $0). Feature-level margin is null for flat subs since MRR can't be attributed to individual features.
- **Period** = margin is scoped to a billing period (This Month by default). Switch between This Month, Last Month, and All Time on the Customer Detail and Cohorts pages. MRR is compared against cost within that period.

**Q: Why does margin change when I switch periods?**

Margin is calculated as MRR vs cost within the selected billing period. "This Month" shows cost accumulated so far this month against your monthly MRR. "All Time" shows lifetime cost against MRR — which can look worse because cost accumulates over months but MRR is a single month's revenue.

**Q: My customer has multiple subscriptions with different pricing models. How does Observe detect "Hybrid"?**

If a customer has both a flat and a metered subscription (or any mix of distinct pricing models), Observe automatically labels them "Hybrid". The cohorts query uses `COUNT(DISTINCT pricing_model) > 1` to detect this. Single-model customers show their actual model (Flat, Metered, or Tiered).
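The margin rules above can be condensed into one sketch. This is an illustrative helper, not Observe's source; revenue falls back to subscription MRR exactly as described for flat-sub customers on the cohorts endpoint.

```ts
// Sketch of the customer-level margin rules above (names illustrative).
// Flat-sub customers have $0 event revenue, so MRR is the fallback.
function customerMargin(
  eventRevenue: number,
  subscriptionMrr: number,
  cost: number
): number | null {
  const revenue = eventRevenue > 0 ? eventRevenue : subscriptionMrr
  if (revenue === 0 && cost === 0) return null // nothing to show
  if (revenue === 0) return -100               // cost with no revenue at all
  return ((revenue - cost) / revenue) * 100
}
```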
**Q: Why does my event show $0 cost when I sent tokens?**

Very short messages (under ~100 tokens) with cheap models like gpt-4o-mini can produce costs below $0.0001 — e.g., 13 input tokens at $0.15/M = $0.000002. Observe shows these as "<$0.0001" in the UI. The cost IS calculated; it's just too small to display at 4 decimal places. Send a longer message or check the event detail view for the exact input/output cost split.
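The underlying arithmetic is just tokens times the per-million-token price. A sketch (the $0.15/M figure is the example input price above, not a pricing table):

```ts
// Cost of a call = tokens * pricePerMillionUsd / 1_000_000.
// The $0.15/M input price is illustrative, taken from the example above.
function tokenCost(tokens: number, pricePerMillionUsd: number): number {
  return (tokens * pricePerMillionUsd) / 1_000_000
}

// 13 input tokens at $0.15/M is roughly $0.00000195, far below a
// 4-decimal display threshold, so the UI shows "<$0.0001".
```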