# Observe by Tanso — Integration Reference

Observe is an AI cost and usage observability platform. It tracks every feature event — LLM calls, API usage, compute jobs — with cost, model, tokens, customer, and feature attribution. Use it to answer: "who uses what, how much does it cost, and is it profitable?"

- Docs: https://observe.tansohq.com
- SDK: `@tansohq/observe` on npm (NOT `@tanso/observe` — that package does not exist, and any LLM that suggests it is hallucinating. Always install the `@tansohq` scope.)
- Signup: https://observe.tansohq.com/signup (free tier, no credit card)

## Three integration paths — pick the one that matches the user's setup

| Path | Use when | Critical path? |
|---|---|---|
| **1. Direct ingest (DEFAULT)** | You want to keep Observe out of your request path. Log events yourself from server code after the LLM call returns. | No — fire-and-forget POST after the call completes |
| 2. Gateway headers | User already routes through Portkey / LiteLLM / Helicone / OpenRouter / their own proxy | No — metadata rides along with the existing gateway |
| 3. SDK auto-wrap | User wants zero-code-per-call-site instrumentation and accepts the tradeoff of Observe sitting in front of their provider calls | **Yes** — Observe proxies every LLM call |

**Direct ingest (Path 1) is the recommended default.** It keeps Observe out of the critical path: your app calls OpenAI / Anthropic / whoever directly, gets the response, then posts a single event to Observe. If Observe is ever down, your LLM calls are unaffected. Everything goes in a JSON body — no per-event header juggling.

**Only use Path 3 (SDK auto-wrap) if the user explicitly wants automatic instrumentation** and understands that it routes every call through `observe.tansohq.com`. It is convenient, but it puts Observe in the request path. Do not default to it.
**Before you touch anything in a user's codebase, check for an existing AI gateway.** If you find `portkey-ai`, `litellm`, `helicone`, `openrouter`, a custom `baseURL` pointing anywhere other than `api.openai.com` / `api.anthropic.com`, or any proxy middleware — use Path 2. Never rip out an existing gateway.

---

## Path 1 — Direct ingest (DEFAULT — log events from your own code)

This is the recommended path for almost every user. Your app calls OpenAI / Anthropic / any provider directly, exactly as it does today. After each call returns, you post a single event to Observe. If Observe is ever down, your LLM calls are unaffected — Observe is off the critical path entirely. Everything goes in the JSON body. No per-call header juggling, no `baseURL` changes, no middleware to break.

```ts
import OpenAI from 'openai'

const openai = new OpenAI() // unchanged — no Observe wrapper

async function chat(user: { id: string; stripeCustomerId?: string }, messages: any[]) {
  const started = Date.now()
  const res = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages,
  })

  // Log the event after the call succeeds. Fire-and-forget is fine.
  fetch('https://observe.tansohq.com/api/events/ingest', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.OBSERVE_API_KEY!}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      events: [{
        eventName: 'chat',
        customerReferenceId: user.id,
        featureKey: 'ai_chat',
        model: 'gpt-4o-mini',
        modelProvider: 'openai',
        inputTokens: res.usage?.prompt_tokens,
        outputTokens: res.usage?.completion_tokens,
        durationMs: Date.now() - started,
        idempotencyKey: res.id,
        requestBody: { messages },
        responseBody: { choices: [{ message: res.choices[0].message }] },
        meta: user.stripeCustomerId
          ? { stripe_customer_id: user.stripeCustomerId }
          : undefined,
      }],
    }),
  }).catch((err) => console.error('observe ingest failed:', err))

  return res
}
```

### Anthropic (Path 1, same pattern)

```ts
import Anthropic from '@anthropic-ai/sdk'

const anthropic = new Anthropic()

async function chat(user, messages) {
  const started = Date.now()
  const res = await anthropic.messages.create({
    model: 'claude-sonnet-4-6',
    max_tokens: 1024,
    messages,
  })

  // Fire-and-forget — Observe never sits in front of the Anthropic call
  fetch('https://observe.tansohq.com/api/events/ingest', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OBSERVE_API_KEY!}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      events: [{
        eventName: 'chat',
        customerReferenceId: user.id,
        featureKey: 'ai_chat',
        model: 'claude-sonnet-4-6',
        modelProvider: 'anthropic',
        inputTokens: res.usage?.input_tokens,
        outputTokens: res.usage?.output_tokens,
        durationMs: Date.now() - started,
        idempotencyKey: res.id,
        requestBody: { messages },
        responseBody: { content: res.content },
        meta: user.stripeCustomerId
          ? { stripe_customer_id: user.stripeCustomerId }
          : undefined,
      }],
    }),
  }).catch((err) => console.error('observe ingest failed:', err))

  return res
}
```

Gemini, Mistral, Cohere, Bedrock, Azure OpenAI, self-hosted models, non-LLM costs — same pattern. Observe only sees the event you post. The full body schema is below under "POST /api/events/ingest — full body schema". If the user wants an automatic token-usage hook, wrap this in a helper once and call it from every AI code path. Do NOT introduce a proxy for them.

---

## Path 2 — Existing gateway (Portkey, LiteLLM, Helicone, OpenRouter, custom proxy)

If the user already has a gateway, DO NOT wrap their provider client. Instead, add Observe as metadata on their existing calls. Two sub-options:

**B1.
Stack Observe in front of their gateway.** Configure the gateway's upstream to point at `https://observe.tansohq.com/v1` instead of directly at OpenAI, and have the gateway forward the Observe headers.

**B2. Header-only integration.** Add the Observe tracking headers to their existing request config. This works with any gateway that forwards custom headers.

```ts
// Example: a Portkey user keeping their gateway, adding Observe tracking headers
import Portkey from 'portkey-ai'

const portkey = new Portkey({
  apiKey: process.env.PORTKEY_API_KEY!,
  virtualKey: 'openai-virtual-key',
  // Observe reads these custom headers on the way through — Portkey forwards them.
  customHeaders: {
    'observe-key': process.env.OBSERVE_API_KEY!,
    'observe-customer': user.id,
    'observe-feature': 'ai_chat',
  },
})
```

Before making structural changes to a user's gateway setup, ask them. Gateways are load-bearing infrastructure. A single misconfigured `baseURL` will break production.

---

## Path 3 — SDK auto-wrap (opt-in — only if the user explicitly wants it)

**Do not default to this path.** Path 1 (direct ingest) is the default. Only use Path 3 if the user has said "I want auto-instrumentation" or "I don't want to touch every call site" AND they accept that this puts Observe in front of every provider call. If Observe is down, their LLM calls fail.

```bash
npm install @tansohq/observe
```

```ts
import { Observe } from '@tansohq/observe'
import OpenAI from 'openai'

Observe.configure({ apiKey: process.env.OBSERVE_API_KEY! })
Observe.identify({ customerId: user.id })
Observe.feature('ai_chat')

const openai = Observe.wrap(new OpenAI())
const res = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello' }],
})
```

Anthropic works identically — `Observe.wrap(new Anthropic())`. The user's OpenAI / Anthropic key still authenticates with the upstream provider; `Observe.wrap()` just sets `baseURL` to the Observe proxy so the call is logged on the way through.
Do not remove the existing provider key. If the user is already on an AI gateway, DO NOT use this path — use Path 2.

---

## `curl` example (Path 1)

```bash
curl -X POST https://observe.tansohq.com/api/events/ingest \
  -H "Authorization: Bearer $OBSERVE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "events": [
      {
        "eventName": "chat",
        "customerReferenceId": "acme-corp",
        "featureKey": "ai_chat",
        "model": "gpt-4o-mini",
        "modelProvider": "openai",
        "inputTokens": 150,
        "outputTokens": 80,
        "durationMs": 1240,
        "idempotencyKey": "req_abc",
        "requestBody": { "messages": [{ "role": "user", "content": "Hello" }] },
        "responseBody": { "choices": [{ "message": { "role": "assistant", "content": "Hi there!" } }] }
      }
    ]
  }'
```

### POST /api/events/ingest — full body schema

```
{
  "events": [
    {
      // REQUIRED
      "eventName": string,            // short label, e.g. "chat", "embed", "eval", "api_call"
      "customerReferenceId": string,  // your end-user's stable ID (app-level slug, internal ID, etc.)
      "featureKey": string,           // product feature powered by this call

      // STRONGLY RECOMMENDED — enables automatic cost computation from token counts
      "model": string,                // e.g. "gpt-4o-mini", "claude-sonnet-4-6"
      "modelProvider": string,        // "openai" | "anthropic" | "google" | "mistral" | ...
      "inputTokens": number,
      "outputTokens": number,

      // RECOMMENDED — shows prompt + completion in the event detail view
      "requestBody": object,          // the messages/prompt you sent to the provider
      "responseBody": object,         // the provider's response body (choices, content, etc.)

      // OPTIONAL
      "timestamp": string,            // ISO 8601; defaults to server-received time
      "costAmount": number,           // override auto-computed cost (in costUnit)
      "costUnit": string,             // defaults to "usd"
      "revenueAmount": number,        // override revenue attribution for this event
      "usageUnits": number,           // for non-token cost types (minutes, images, etc.)
      "durationMs": number,           // wall-clock latency of the call
      "costType": string,             // defaults to "llm" if model is set, else "generic"
      "idempotencyKey": string,       // dedupe identical events on retry
      "traceId": string,              // group spans into a trace
      "spanId": string,               // this span's ID
      "parentSpanId": string,         // parent span (for nested agent calls)
      "properties": object,           // arbitrary JSON metadata (searchable)
      "meta": object                  // { stripe_customer_id: "cus_..." } — links event to Stripe customer for revenue matching
    }
  ]
}
```

To see request/response in the event detail view, include `requestBody` and `responseBody`:

```ts
requestBody: { messages },
responseBody: {
  choices: [{ message: { content: res.choices[0].message.content } }],
},
```

Optional — skip if content is sensitive. Observe stores these as JSONB on the event row. The legacy pattern using `properties: { prompt, completion }` still works as a fallback, but the first-class fields above are preferred.

- Batch up to 1000 events per request.
- If `costAmount` is omitted and `model` + `inputTokens` + `outputTokens` are provided, Observe computes cost automatically using OpenRouter pricing.
- If `revenueAmount` is omitted, Observe auto-attributes revenue from Stripe MRR or from feature pricing rules (set in the dashboard).
- `idempotencyKey` guarantees at-most-once ingestion on retry. Recommended.
- For Stripe: if the application uses Stripe, the user's Stripe Customer ID must be passed in `meta.stripe_customer_id`. Observe uses this to resolve customer names and link revenue data.

### Response

```json
{ "accepted": 1, "rejected": 0, "errors": [] }
```

`errors[]` contains `{ index, error }` for each event that failed validation. The accepted count is the number of rows actually inserted (dedup via `idempotencyKey` also reduces this count).
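If the user wants the single-helper pattern recommended in Path 1, a minimal sketch follows. The helper name `logObserveEvent` and the `ObserveEvent` type are illustrative, not part of any SDK; the field names come from the schema above.

```ts
// Hypothetical Path 1 helper: wraps the ingest POST so call sites stay one-liners.
// Field names follow the ingest schema above; the helper name is illustrative.
type ObserveEvent = {
  eventName: string
  customerReferenceId: string
  featureKey: string
  model?: string
  modelProvider?: string
  inputTokens?: number
  outputTokens?: number
  durationMs?: number
  idempotencyKey?: string
  [key: string]: unknown
}

function logObserveEvent(event: ObserveEvent): void {
  // Fire-and-forget: never await this on the request path.
  fetch('https://observe.tansohq.com/api/events/ingest', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OBSERVE_API_KEY!}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ events: [event] }),
  }).catch((err) => console.error('observe ingest failed:', err))
}
```

Call sites then shrink to one line, e.g. `logObserveEvent({ eventName: 'chat', customerReferenceId: user.id, featureKey: 'ai_chat' })`.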
---

## Per-call feature override (Path 3 users)

If the same provider client powers multiple features in Path 3 (SDK auto-wrap), override `feature` per call via the native headers option:

```ts
await openai.chat.completions.create(
  { model: 'gpt-4o-mini', messages },
  { headers: { 'Observe-Feature': 'summarize_email' } }
)
```

`Observe-*` headers exist for Paths 2 and 3 only, because the provider SDK request body is opaque to the proxy — metadata cannot ride in the JSON body without breaking the provider contract. For Path 1 (direct ingest), everything goes in the JSON body you post and you never touch headers.

The supported request headers for Paths 2 and 3 are: `Observe-Key` (required), `Observe-Customer`, `Observe-Feature`, `Observe-Agent`, `Observe-Trace-Id`, `Observe-Span-Id`, `Observe-Parent-Span-Id`. Legacy `x-tanso-*` names are still accepted but deprecated per RFC 6648. Prefer the unprefixed form.

---

## Environment variables

```
OBSERVE_API_KEY=obs_...   # get from https://observe.tansohq.com/data-sources
```

The key always starts with `obs_`. It is the ONLY value the user needs to wire up. Do not invent other env vars. Do not replace the user's OpenAI / Anthropic key.

---

## Stripe connection is optional

Observe auto-attributes revenue from a connected Stripe account, but it is not required to start tracking costs. A user can:

1. Wire up Observe first (Paths 1, 2, or 3 above) — tracks cost only.
2. Connect Stripe later from Data Sources → Stripe — backfills revenue attribution.
3. Or skip Stripe entirely and provide `revenueAmount` explicitly per event (Path 1) or via the Feature Pricing table in the dashboard.

If using a restricted Stripe API key, enable **read-only** access to: Customers, Subscriptions, Products, and Prices. No write permissions needed.

If the user is on Clerk, Paddle, Lago, or any other billing tool, they can either connect Stripe separately, or use explicit `revenueAmount` / Feature Pricing.
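For the explicit-revenue option, the event body simply carries `revenueAmount` alongside the usual fields. A sketch with made-up values:

```ts
// Illustrative Path 1 event with explicit per-event revenue (no Stripe required).
// All values here are made up for the example.
const event = {
  eventName: 'chat',
  customerReferenceId: 'acme-corp',
  featureKey: 'ai_chat',
  model: 'gpt-4o-mini',
  modelProvider: 'openai',
  inputTokens: 150,
  outputTokens: 80,
  revenueAmount: 0.01, // what the customer paid you for this call (illustrative)
}
```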
### How revenue, MRR, and margin work

#### Customer lifecycle

Customers only appear in Observe once the SDK sends at least one event with their `customerReferenceId`. Stripe-imported customers with zero SDK events are NOT shown on the customers page — their margins would be meaningless without cost data. The intended flow is:

1. **SDK event arrives** with a `customerReferenceId` → customer appears in Observe.
2. **Stripe links via meta** → pass the Stripe customer ID in `meta.stripe_customer_id` to link this customer to Stripe revenue data. The `customerReferenceId` can be any stable app-level ID — Stripe matching uses the meta field, not the reference ID.
3. **Stripe enriches revenue** → the event gets `revenue_amount` stamped at ingest.
4. **Margins become real** — cost from the SDK event, revenue from Stripe data.

This is a deliberate design choice. Observe is a cost observability tool — a customer without usage events has no cost data, so showing them with $0/$0 would produce misleading 0% margins and pollute analytics.

#### Revenue sources (priority order)

Revenue is attributed to each event at ingest time (`server/lib/enrich-revenue.ts`):

1. **Feature Pricing rules** — checked first. If a `feature_pricing` row exists for the event's `feature_key`, that `revenue_per_unit` is used. Most precise.
2. **Subscription data** — if no feature pricing rule, Observe looks up the customer's active subscription(s) and applies revenue based on the pricing model:
   - `metered`: `unit_price * usage_units`
   - `tiered`: tier-based unit price (from Stripe pricing tiers) * usage_units
   - `hybrid`: metered component if available, else falls back to subscription
   - `flat`: revenue = 0 per event (flat subscriptions don't generate per-event revenue)
3. **Explicit per-event** — `revenueAmount` sent in the SDK event payload overrides all of the above.
4. **Stripe sync** — Stripe sync imports subscription data into the `subscriptions` table.
This data is joined at query time for customer-level MRR views. No revenue events are written to `observe_events` from Stripe — this prevents double-counting and fake features.

The `revenue_source` field on each event indicates how the number was derived: `feature_pricing`, `per_unit`, `tiered`, `hybrid`, `subscription`, `explicit`, or `none`. Feature-level and model-level analytics only show usage-attributed revenue (metered, tiered, explicit, feature_pricing). Flat subscription MRR cannot be honestly attributed to a specific feature and is only shown at the customer level.

#### MRR calculation

MRR is computed from active subscriptions: `SUM(COALESCE(mrr_override, plan.price_amount))` across all active subscriptions, grouped by customer. `mrr_override` allows custom pricing that differs from the plan's list price. Plan prices are normalized to monthly (divided by `interval_months` for annual plans).

MRR movements (new, expansion, contraction, churned) compare current active subscriptions against subscriptions that existed 60+ days ago.

#### Analytics KPI cards

The Analytics page (`/analytics`) shows three KPI cards at the top:

**When revenue data is connected (Stripe or explicit):**

- **Total Revenue** — sum of all customer revenue from events
- **Total Cost** — sum of all tracked AI costs (LLM inference, embeddings, API calls)
- **Gross Margin** — `(revenue - cost) / revenue * 100`, displayed as a large colored percentage (green ≥70%, amber 30-69%, red <30%). Values >99% but <100% display as ">99%"

**When no revenue data exists (cost-only mode):**

- **Customers Tracked** — count of distinct customers with events
- **Total Cost** — same as above
- **Events Tracked** — total event count

Each card has an info tooltip (hover the ⓘ icon) explaining what the metric includes. A **data confidence badge** appears next to the page header when event count is low: Very low (<10 events), Low (<50), Medium (<200). Hidden when Good (≥500).
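The MRR formula in the "MRR calculation" subsection above can be sketched in code. Field names are illustrative and mirror the description; `mrrOverride` is assumed to already be a monthly figure.

```ts
// Sketch of the MRR calculation described above (names illustrative, not Observe source).
// Annual plan prices are normalized to monthly; mrr_override wins when set.
type Subscription = {
  mrrOverride: number | null
  planPriceAmount: number // plan list price for the billing interval
  intervalMonths: number  // 1 for monthly, 12 for annual
}

function monthlyMrr(subs: Subscription[]): number {
  return subs.reduce((sum, s) => {
    const base = s.mrrOverride ?? s.planPriceAmount / s.intervalMonths
    return sum + base
  }, 0)
}
```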
#### Pricing model badges

Each customer on the Cohorts page shows a pricing model badge derived from their Stripe subscription: Flat, Metered, Tiered, or Hybrid. This is queried from `subscriptions.pricing_model` via a JOIN in the cohorts endpoint. When a customer has multiple active subscriptions with different pricing models, the badge automatically shows "Hybrid" (`COUNT(DISTINCT pricing_model) > 1`).

#### Margin calculation

Margin is always: `(revenue - cost) / revenue * 100`

**Revenue source for margin depends on pricing model:**

- **Metered/tiered/hybrid customers**: revenue comes from per-event `revenue_amount` in `observe_events` (enriched at ingest time via `enrich-revenue.ts`)
- **Flat subscription customers**: per-event `revenue_amount` is $0 by design (PM-7). The cohorts endpoint falls back to **subscription MRR** from the `subscriptions` table (`subRevenueMap`) when event-level revenue is 0. This ensures flat-sub customers show their actual margin (e.g., $99 MRR - $0.001 cost = ~100% margin) instead of -100%.

Feature-level and model-level margins only use event-level revenue (metered, tiered, explicit, feature_pricing). Flat subscription MRR cannot be attributed to individual features — feature margins show as null for flat-sub customers.

When revenue is 0 (no subscription AND no event revenue) and cost > 0, margin is -100%. When both are 0, margin is null (not shown). Values >99% but <100% display as ">99%" on the Analytics dashboard to avoid implying zero cost. Margin badges show a tooltip with benchmark context: "AI SaaS target: 60-80% gross margin."

---

## Checklist for an AI coding agent implementing Observe

1. **Detect gateways first.** Grep for `portkey-ai`, `litellm`, `helicone`, `openrouter`, and any custom `baseURL` on OpenAI/Anthropic clients. If found, use Path 2 (gateway headers). Do NOT rip out the gateway.
2. **Default to Path 1 (direct ingest).** No SDK install required.
   Do NOT install `@tansohq/observe` and do NOT wrap the provider client unless the user explicitly asks for auto-instrumentation.
3. Add `OBSERVE_API_KEY=obs_...` to `.env` (and `.env.example` with a placeholder). The key always starts with `obs_`. Get it from https://observe.tansohq.com/data-sources.
4. Find every place the app calls OpenAI / Anthropic / another LLM provider, or any billable feature (API calls, compute jobs, file processing, etc.). After each call returns, `fetch('https://observe.tansohq.com/api/events/ingest', …)` with a single event in the body: `eventName`, `customerReferenceId`, `featureKey`. For LLM calls, also include `model`, `modelProvider`, `inputTokens`, `outputTokens`. For non-LLM features, include `costAmount` and/or `usageUnits`. Always include `durationMs` and `idempotencyKey`. Fire-and-forget; catch and log on failure.
5. Use the end-user's ID (Stripe customer ID if available, else a stable user ID) as `customerReferenceId`. Never "default" or a hardcoded value.
6. Use one stable `featureKey` per user-facing feature — e.g. `ai_chat`, `summarize_email`, `code_autocomplete`, `pdf_export`, `api_request`. Not per model. Track all billable features, not just AI ones.
7. If (and only if) the user asks for auto-instrumentation: `npm install @tansohq/observe` — with the `hq`. The package `@tanso/observe` does NOT exist on npm. If install fails, STOP and tell the user; do not substitute. Then follow Path 3.
8. Do NOT log or echo the API key. Keep it in env vars only.
9. Stripe is optional. Do not add Stripe integration unless the user asks.
10. After wiring up, run the app once and confirm events appear at https://observe.tansohq.com/events.

---

## Tracing multi-step flows

If the app chains multiple LLM calls or sub-tasks (e.g., an agent pipeline, RAG with retrieval + generation, or a review with parse + analyze + publish), add `traceId`, `spanId`, and `parentSpanId` to group them into a single trace on the Observe Traces page.
- `traceId` — shared across all events in the flow (use a UUID or the first call's request ID)
- `spanId` — unique per event within the trace
- `parentSpanId` — the `spanId` of the parent step (null for the root)

```ts
import { randomUUID } from 'crypto'

const OBSERVE_URL = 'https://observe.tansohq.com/api/events/ingest'
const OBSERVE_KEY = process.env.OBSERVE_API_KEY!

const traceId = randomUUID()

// Step 1: embed the query
const embedRes = await openai.embeddings.create({
  model: 'text-embedding-3-small',
  input: query,
})
fetch(OBSERVE_URL, {
  method: 'POST',
  headers: { Authorization: `Bearer ${OBSERVE_KEY}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({ events: [{
    eventName: 'embed_query',
    featureKey: 'rag_search',
    customerReferenceId: user.id,
    model: 'text-embedding-3-small',
    modelProvider: 'openai',
    inputTokens: embedRes.usage.prompt_tokens,
    outputTokens: 0,
    traceId,
    spanId: 'embed',
    parentSpanId: null,
  }]}),
}).catch(console.error)

// Step 2: generate answer (child of embed)
const chatRes = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: query }],
})
fetch(OBSERVE_URL, {
  method: 'POST',
  headers: { Authorization: `Bearer ${OBSERVE_KEY}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({ events: [{
    eventName: 'generate_answer',
    featureKey: 'rag_search',
    customerReferenceId: user.id,
    model: 'gpt-4o',
    modelProvider: 'openai',
    inputTokens: chatRes.usage?.prompt_tokens,
    outputTokens: chatRes.usage?.completion_tokens,
    traceId,
    spanId: 'generate',
    parentSpanId: 'embed',
  }]}),
}).catch(console.error)
```

The Traces page (`/traces`) shows a waterfall view with per-span cost, duration, and model. Useful for spotting which step in a pipeline is most expensive.

For Path 2/3 (gateway), use the request headers instead: `Observe-Trace-Id`, `Observe-Span-Id`, `Observe-Parent-Span-Id`.

---

## Agent Self-Serve API

Agents can programmatically create accounts, get API keys, and manage their own access without a human in the loop. No auth required for signup — rate limited to 3 requests per hour per IP.
### Sign up and get a key

```bash
curl -X POST https://observe.tansohq.com/api/signup \
  -H "Content-Type: application/json" \
  -d '{
    "email": "agent@yourapp.com",
    "scopes": ["usage.read", "events.write", "proxy.chat"],
    "budget_cents": 5000,
    "budget_period": "month"
  }'
```

Response: `{ "key": "obs_...", "scopes": [...], "expires_at": null, "budget_cents": 5000, "budget_period": "month" }`

| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| `email` | Yes | — | Account email |
| `scopes` | No | `["usage.read", "billing.read"]` | Scopes for this key (see allowed list below) |
| `budget_cents` | No | null (unlimited) | Per-key LLM spend cap in cents |
| `budget_period` | No | null (lifetime) | `"month"` or `"day"` — resets budget on period boundary |
| `expires_in_seconds` | No | null (never) | Key auto-expires after this many seconds. Omit for permanent keys. |

Scopes allowed on signup: `proxy.chat`, `usage.read`, `billing.read`, `events.read`, `events.write`, `customers.read`, `recommendations.read`, `models.read`. For elevated scopes (`admin`, `*.write`, `alerts.*`), use `POST /sdk-keys` with a Clerk session. Returns 409 if the email is already registered.

### Check your key's capabilities

```bash
curl https://observe.tansohq.com/api/sdk-keys/me \
  -H "Authorization: Bearer obs_..."
```

Response:

```json
{
  "auth_type": "sdk_key",
  "key_prefix": "obs_abc1234",
  "name": "default",
  "scopes": ["usage.read", "events.write", "proxy.chat"],
  "budget_cents": 5000,
  "budget_used_cents": 120,
  "budget_remaining_cents": 4880,
  "budget_period": "month",
  "budget_reset_at": "2026-06-01T00:00:00.000Z",
  "expires_at": "2026-05-14T18:00:00.000Z"
}
```

### Check your plan and limits

```bash
curl https://observe.tansohq.com/api/plan \
  -H "Authorization: Bearer obs_..."
```

Response:

```json
{
  "plan": "free",
  "name": "Free",
  "features": {
    "event_ingest": { "limit": 10000, "usage": 42, "remaining": 9958 },
    "ai_insights": { "limit": 1000, "usage": 0, "remaining": 1000 },
    "cost_alerts": { "limit": 3, "usage": 0, "remaining": 3 }
  },
  "upgradeUrl": "https://observe.tansohq.com/settings"
}
```

### Available scopes

| Scope | Grants access to |
|-------|-----------------|
| `proxy.chat` | LLM proxy and gateway endpoints |
| `events.read` | Read events, traces, aggregations |
| `events.write` | Ingest events, manage SDK keys |
| `usage.read` | Read features, analytics, simulations |
| `usage.write` | Modify feature definitions, run simulations |
| `customers.read` | Read customers, subscriptions, cohorts |
| `customers.write` | Modify customers, cohorts, upload data |
| `billing.read` | Read plan, entitlements, cloud costs |
| `billing.write` | Modify cloud cost integrations |
| `alerts.read` | Read alert rules and history |
| `alerts.write` | Create/modify/delete alert rules |
| `recommendations.read` | Read and act on recommendations |
| `models.read` | Read model pricing and stats |
| `admin` | Full access to all endpoints |

Keys with `null` scopes (e.g., keys created before capability scoping existed) have unrestricted access — backwards compatible with existing integrations.

### Budget enforcement

Budget is tracked per key on LLM proxy/gateway calls only. Management API calls (reading analytics, managing alerts) are free and unmetered. When a key's `budget_used_cents` reaches `budget_cents`, proxy/gateway calls return 429:

```json
{ "error": { "message": "Budget exceeded for this API key", "type": "budget_error" } }
```

Budget resets automatically at `budget_reset_at` (lazy reset on next API call). Account plan limits are the ceiling — individual keys subdivide the budget.
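An agent holding a budgeted key should treat this 429 as terminal for the key rather than retrying. A minimal detector, assuming the documented status code and response shape (the helper name is illustrative):

```ts
// Sketch: detect the budget_error 429 documented above and surface it
// instead of retrying. Assumes the response body shape shown above.
type ObserveError = { error: { message: string; type: string } }

async function isBudgetExceeded(res: Response): Promise<boolean> {
  if (res.status !== 429) return false
  const body = (await res.json()) as ObserveError
  return body.error?.type === 'budget_error'
}
```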
### Create additional scoped keys (humans only)

Authenticated humans (Clerk JWT) can mint additional keys on their account with different scopes and budgets:

```bash
curl -X POST https://observe.tansohq.com/api/sdk-keys \
  -H "Authorization: Bearer <clerk-session-jwt>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "sales-enrichment",
    "scopes": ["proxy.chat", "customers.read"],
    "budget_cents": 4000,
    "budget_period": "month"
  }'
```

SDK keys cannot create other keys — only Clerk-authenticated humans can.

---

## Management API (same `obs_*` key)

The same API key used for event ingestion also grants access to Observe's full read/write management API. An agent can query analytics, manage alerts, and configure features without opening the dashboard.

All endpoints accept `Authorization: Bearer obs_...` or the `Observe-Key` header. Scoped keys are restricted to endpoints matching their scopes (see scope table above).

### Analytics & querying

| Method | Endpoint | What it returns |
|--------|----------|----------------|
| GET | `/api/events` | Paginated event list (filterable by feature, model, customer, date) |
| GET | `/api/events/:id` | Single event detail |
| GET | `/api/events/by-feature` | Cost/revenue/usage aggregated by feature |
| GET | `/api/events/by-model` | Cost/usage aggregated by model |
| GET | `/api/events/by-customer` | Cost/usage aggregated by customer |
| GET | `/api/events/by-agent` | Cost/usage aggregated by agent |
| GET | `/api/events/by-cost-type` | Cost/usage aggregated by cost type |
| GET | `/api/events/traces` | Paginated trace list |
| GET | `/api/events/trace/:traceId` | Full trace with spans |
| GET | `/api/features` | All features with cost/revenue/margin stats |
| GET | `/api/features/:key` | Feature detail (events, customers, models, timeseries) |
| GET | `/api/models` | All models with cost stats |
| GET | `/api/customers` | All customers with cost/margin data |
| GET | `/api/customers/:id` | Customer detail with events, margins, models |
| GET | `/api/analytics/overview` | Dashboard KPIs (revenue, cost, margin) |
| GET | `/api/analytics/trends` | Cost/usage trends over time |
| GET | `/api/analytics/daily-summary` | Daily cost/usage summary |
| GET | `/api/cohorts` | Customer cohorts with health scores |
| POST | `/a2a/query` | Structured cost queries (cost_query, usage_summary, margin_analysis, trace_query) |

### Alert management

| Method | Endpoint | What it does |
|--------|----------|-------------|
| GET | `/api/alerts` | List all alert rules |
| POST | `/api/alerts` | Create an alert rule |
| PATCH | `/api/alerts/:id` | Update an alert rule |
| DELETE | `/api/alerts/:id` | Delete an alert rule |
| POST | `/api/alerts/:id/test` | Send a test alert |
| GET | `/api/alerts/history` | Alert firing history |

Alert creation example:

```bash
curl -X POST https://observe.tansohq.com/api/alerts \
  -H "Authorization: Bearer $OBSERVE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Daily cost spike",
    "metric": "daily_cost",
    "operator": "gt",
    "threshold": 100,
    "email": "alerts@yourcompany.com",
    "cooldown_minutes": 60
  }'
```

Supported metrics: `daily_cost`, `margin_percent`, `customer_margin`, `customer_concentration`.
### Feature configuration

| Method | Endpoint | What it does |
|--------|----------|-------------|
| GET | `/api/feature-definitions` | List feature definitions |
| POST | `/api/feature-definitions` | Create a feature definition |
| PATCH | `/api/feature-definitions/:id` | Update a feature definition |
| DELETE | `/api/feature-definitions/:id` | Delete a feature definition |
| GET | `/api/feature-pricing` | List feature pricing rules |
| POST | `/api/feature-pricing` | Create a pricing rule (revenue_per_unit) |
| DELETE | `/api/feature-pricing/:featureKey` | Delete a pricing rule |

### Recommendations

| Method | Endpoint | What it does |
|--------|----------|-------------|
| GET | `/api/recommendations` | List optimization recommendations |
| POST | `/api/recommendations/compute` | Trigger the recommendation engine |
| POST | `/api/recommendations/:id/apply` | Accept a recommendation |
| POST | `/api/recommendations/:id/dismiss` | Dismiss a recommendation |

### Agent discovery

| Method | Endpoint | What it returns |
|--------|----------|----------------|
| GET | `/.well-known/agent.json` | Agent card with all capabilities and endpoints |

---

## What NOT to do

- Do not install `@tanso/observe`. The correct package is `@tansohq/observe`.
- Do not replace OpenAI / Anthropic with anything else. Observe wraps the existing client; it does not substitute models.
- Observe logs cost + metadata by default. Optionally include `prompt` / `completion` in `properties` for debugging visibility — skip if content is sensitive or subject to privacy requirements.
- Do not hardcode the API key.
- Do not touch an existing AI gateway without asking the user. If you see Portkey, LiteLLM, Helicone, OpenRouter, or similar in `package.json`, use Path 2.
- Do not default to the SDK wrap (Path 3). It puts Observe in the critical path of every LLM call. Use Path 1 (direct ingest) unless the user explicitly asks for auto-instrumentation.
- Do not add Stripe integration as part of Observe setup unless the user asks. Stripe is optional.

---

## FAQ — Revenue Attribution

### How does revenue get attributed to events?

Three methods, in priority order:

1. **Explicit `revenueAmount`** — pass it on the SDK event. Overrides everything. Best when: the developer knows the exact revenue for that API call.
2. **Feature pricing rules** — set `revenue_per_unit` per feature in the Observe dashboard (/features page). Every event with that `featureKey` gets stamped automatically. Best when: pricing is per-feature (e.g., $0.01 per chat message).
3. **Stripe subscription match** — connect Stripe, sync customers/subscriptions. Observe detects the pricing model and calculates revenue per event:
   - **Metered**: `unitPrice × usageUnits` → exact per-event revenue
   - **Tiered**: `tierPrice(monthToDateUsage) × usageUnits` → tier-aware revenue
   - **Flat**: $0 per event — MRR is tracked at the customer level, not per-event
   - **Hybrid**: metered component calculated, flat component at customer level

### Where does revenue show in the UI?

| Level | What shows | Source |
|-------|-----------|--------|
| **Event** | `revenue_amount` | Metered/tiered/explicit/feature pricing only. Flat = $0. |
| **Feature** | Sum of event revenue | Only meaningful for metered/tiered/feature pricing. |
| **Customer** | MRR from subscription | Always shows — pulled from `subscriptions` table. |
| **Cohorts** | MRR column + cost | MRR from subscription, cost from events. |

### Why is revenue $0 on my events?

Most likely: your customers have **flat-rate subscriptions**. Flat subscriptions contribute $0 per event because MRR is fixed regardless of usage. This is by design — revenue shows at the customer level (MRR column on the Cohorts page). To get per-event revenue, either:

- Switch to metered/tiered Stripe pricing
- Set feature pricing rules in the dashboard
- Pass explicit `revenueAmount` on each SDK event

### How do I connect Stripe for revenue?

1.
Go to Data Sources → Stripe → Connect
2. Paste a Stripe API key (restricted key with read access to customers, subscriptions, and prices)
3. Click Sync — pulls customers, subscriptions, plans
4. Revenue enrichment runs automatically on new events AND backfills existing ones
5. Pass `meta.stripe_customer_id` on SDK events if your `customerReferenceId` differs from the Stripe `cus_*` ID

### What's the difference between cost and revenue?

- **Cost** = what YOU pay the LLM provider (auto-calculated from model + tokens)
- **Revenue** = what YOUR CUSTOMER pays you (from Stripe subscription or explicit)
- **Margin** = (revenue - cost) / revenue — for flat subscriptions, the cohorts endpoint uses subscription MRR as revenue (not per-event $0). Feature-level margin is null for flat subs since MRR can't be attributed to individual features.
- **Period** = margin is scoped to a billing period (This Month by default). Switch between This Month, Last Month, and All Time on the Customer Detail and Cohorts pages. MRR is compared against cost within that period.

**Q: Why does margin change when I switch periods?**

Margin is calculated as MRR vs cost within the selected billing period. "This Month" shows cost accumulated so far this month against your monthly MRR. "All Time" shows lifetime cost against MRR — which can look worse because cost accumulates over months but MRR is a single month's revenue.

**Q: My customer has multiple subscriptions with different pricing models. How does Observe detect "Hybrid"?**

If a customer has both a flat and a metered subscription (or any mix of distinct pricing models), Observe automatically labels them "Hybrid". The cohorts query uses `COUNT(DISTINCT pricing_model) > 1` to detect this. Single-model customers show their actual model (Flat, Metered, or Tiered).
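The margin rules above can be condensed into one sketch. This is an illustrative helper, not Observe's source; revenue falls back to subscription MRR exactly as described for flat-sub customers on the cohorts endpoint.

```ts
// Sketch of the customer-level margin rules above (names illustrative).
// Flat-sub customers have $0 event revenue, so MRR is the fallback.
function customerMargin(
  eventRevenue: number,
  subscriptionMrr: number,
  cost: number
): number | null {
  const revenue = eventRevenue > 0 ? eventRevenue : subscriptionMrr
  if (revenue === 0 && cost === 0) return null // nothing to show
  if (revenue === 0) return -100               // cost with no revenue at all
  return ((revenue - cost) / revenue) * 100
}
```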
**Q: Why does my event show $0 cost when I sent tokens?**

Very short messages (under ~100 tokens) with cheap models like gpt-4o-mini can produce costs below $0.0001 — e.g., 13 input tokens at $0.15/M = $0.000002. Observe shows these as "<$0.0001" in the UI. The cost IS calculated; it's just too small to display at 4 decimal places. Send a longer message or check the event detail view for the exact input/output cost split.
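The underlying arithmetic is just tokens times the per-million-token price. A sketch (the $0.15/M figure is the example input price above, not a pricing table):

```ts
// Cost of a call = tokens * pricePerMillionUsd / 1_000_000.
// The $0.15/M input price is illustrative, taken from the example above.
function tokenCost(tokens: number, pricePerMillionUsd: number): number {
  return (tokens * pricePerMillionUsd) / 1_000_000
}

// 13 input tokens at $0.15/M is roughly $0.00000195, far below a
// 4-decimal display threshold, so the UI shows "<$0.0001".
```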