
# Observability Overview

> How Forii handles monitoring, usage tracking, and debugging

<img src="https://mintcdn.com/forii-docs/cNsW6DAG7qkXCpMH/docs/images/dashboards/dashboard-observability.svg?fit=max&auto=format&n=cNsW6DAG7qkXCpMH&q=85&s=1e78a04a00795c9071244c727e5d32f8" alt="Observability Dashboard" width="1200" height="620" data-path="docs/images/dashboards/dashboard-observability.svg" />

Every Forii response includes token counts. Every error follows the OpenAI format. Rate limit headers tell you when to back off.

## Available now

### Token counts in response

```json theme={null}
{
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 156,
    "total_tokens": 174,
    "prompt_tokens_details": {
      "cached_tokens": 0
    }
  }
}
```

The `usage` object appears in every response. The `cached_tokens` field is already part of it so that prompt caching discounts can be reported once prompt caching ships.
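To track consumption on the client side, one option is to accumulate the `usage` object across responses. The following is a minimal sketch: the helper name `add_usage` and the running-total dictionary are illustrative and not part of any Forii SDK; only the field names come from the example above.

```python
from typing import Any, Dict


def add_usage(totals: Dict[str, int], response: Dict[str, Any]) -> Dict[str, int]:
    """Add one response's token counts to a running total (hypothetical helper)."""
    usage = response.get("usage", {})
    details = usage.get("prompt_tokens_details") or {}
    for key in ("prompt_tokens", "completion_tokens", "total_tokens"):
        totals[key] = totals.get(key, 0) + int(usage.get(key, 0))
    # cached_tokens is nested under prompt_tokens_details in the response body.
    totals["cached_tokens"] = totals.get("cached_tokens", 0) + int(
        details.get("cached_tokens", 0)
    )
    return totals


# The sample mirrors the usage object shown above.
sample = {
    "usage": {
        "prompt_tokens": 18,
        "completion_tokens": 156,
        "total_tokens": 174,
        "prompt_tokens_details": {"cached_tokens": 0},
    }
}

totals = add_usage({}, sample)
totals = add_usage(totals, sample)
print(totals)
```

Missing fields are treated as zero, so the helper also tolerates responses that omit `prompt_tokens_details`.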

### Error codes

| Code | Meaning         | Retry?        |
| ---- | --------------- | ------------- |
| 400  | Bad request     | No            |
| 401  | Invalid API key | No            |
| 402  | Quota exceeded  | No            |
| 404  | Model not found | No            |
| 429  | Rate limit      | Yes (backoff) |
| 500  | Internal error  | Yes           |
| 503  | Unavailable     | Yes           |

Full details: [Errors & Rate Limits](/docs/api-reference/errors)
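The retry column above can be encoded directly in client code. This is a sketch under assumptions: the table only says that 429 needs backoff, so the exponential backoff with jitter, the `base` and `cap` values, and the function names are illustrative choices, not documented Forii behavior.

```python
import random

# Status codes the table marks as retryable.
RETRYABLE_STATUS = {429, 500, 503}


def should_retry(status: int) -> bool:
    """Return True only for the codes the table lists with 'Yes'."""
    return status in RETRYABLE_STATUS


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter; attempt counts from 0.

    base and cap are assumed values, tune them for your workload.
    """
    upper = min(cap, base * (2 ** attempt))
    return random.uniform(0.0, upper)


for status in (400, 401, 402, 404, 429, 500, 503):
    print(status, should_retry(status))
```

Codes 400, 401, 402, and 404 indicate a problem with the request or account, so repeating the same call would not help.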

### Rate limit headers

```
X-Ratelimit-Limit-Requests: 600
X-Ratelimit-Remaining-Requests: 543
X-Ratelimit-Reset: 1705312200
```
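A client can read these headers to decide how long to wait before sending more requests. The sketch below assumes that `X-Ratelimit-Reset` is a Unix timestamp in seconds, which the example value suggests but this page does not state; the function names are hypothetical.

```python
import time
from typing import Mapping, Optional


def _lowered(headers: Mapping[str, str]) -> dict:
    # HTTP header names are case-insensitive, so normalize before lookup.
    return {name.lower(): value for name, value in headers.items()}


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Seconds until the window resets (assumes a Unix-seconds reset value)."""
    current = time.time() if now is None else now
    reset_at = float(_lowered(headers)["x-ratelimit-reset"])
    return max(0.0, reset_at - current)


def remaining_fraction(headers: Mapping[str, str]) -> float:
    """Share of the request budget still available in this window."""
    h = _lowered(headers)
    return int(h["x-ratelimit-remaining-requests"]) / int(h["x-ratelimit-limit-requests"])


headers = {
    "X-Ratelimit-Limit-Requests": "600",
    "X-Ratelimit-Remaining-Requests": "543",
    "X-Ratelimit-Reset": "1705312200",
}
print(remaining_fraction(headers))
print(seconds_until_reset(headers, now=1705312140))
```

Passing `now` explicitly keeps the calculation testable; in production code you would omit it and let the helper use the current time.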

### Dashboard

The [control panel](/docs/webapp/dashboard) provides:

* **API keys** — create and delete keys
* **Usage** — token limit and token count by model, plus the next reset time
* **Recent requests** — last few API requests and responses

## Coming soon

### Response headers

| Header                      | Description                                      |
| --------------------------- | ------------------------------------------------ |
| `x-forii-prompt-tokens`     | Prompt token count without parsing the body      |
| `x-forii-completion-tokens` | Completion token count without parsing the body  |
| `x-forii-cached-tokens`     | Cache hit visibility (when prompt caching ships) |
| `x-forii-ttft-ms`           | Time to first token                              |
| `x-forii-total-ms`          | Total request time                               |
| `x-forii-model`             | Actual model served (resolves aliases)           |
| `x-forii-request-id`        | Trace individual requests                        |
| `x-forii-region`            | Which data center served the request             |

<Info>
  Once available, `x-forii-region` will let you verify that your requests are served from India and not routed to US servers. Indian jurisdiction applies to all request data, with no US CLOUD Act exposure.
</Info>
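Once these headers ship, a client could collect them into one record for logging. The sketch below is speculative: the headers are not available yet, their value formats are not documented here, and the sample values are placeholders made up for illustration. The helper name `forii_headers` is hypothetical.

```python
from typing import Dict, Mapping

PREFIX = "x-forii-"


def forii_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Collect x-forii-* headers, keyed by the name after the prefix."""
    collected: Dict[str, str] = {}
    for name, value in headers.items():
        lowered = name.lower()  # header names are case-insensitive
        if lowered.startswith(PREFIX):
            collected[lowered[len(PREFIX):]] = value
    return collected


# Placeholder values only; real formats may differ when the headers ship.
sample = {
    "Content-Type": "application/json",
    "X-Forii-Ttft-Ms": "120",
    "X-Forii-Total-Ms": "950",
    "X-Forii-Request-Id": "example-request-id",
}

meta = forii_headers(sample)
print(meta)
```

Keeping the values as strings avoids guessing at types before the formats are published.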

### CLI observability

```bash theme={null}
forii chat --verbose                   # TTFT, total time, tokens, cost, model, region
forii chat --save-response resp.json   # Dump full response with headers
forii models list                      # Available models, context windows, pricing
forii usage                            # Token usage this period, cost estimate
forii usage --by-model                 # Break down per model
forii usage --by-key                   # Break down per API key

### Request annotations

```bash theme={null}
curl https://api.forii.in/inference/v1/chat/completions \
  -H "Authorization: Bearer $FORII_API_KEY" \
  -H "Content-Type: application/json" \
  -H "x-forii-annotations: team=search,project=ranker,environment=prod" \
  -d '{"model":"forii/deepseek-v3","messages":[...]}'
```

Annotations will let you attribute costs to teams, projects, and environments.
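If annotations come from application config, the header value can be built from a dictionary. This sketch assumes the comma-separated `key=value` format shown in the curl example; escaping rules are not documented here, so the hypothetical helper `build_annotations` rejects keys and values that contain `,` or `=` instead of guessing.

```python
from typing import Dict


def build_annotations(tags: Dict[str, str]) -> str:
    """Serialize tags as 'k1=v1,k2=v2' (format assumed from the curl example)."""
    for key, value in tags.items():
        if any(ch in key or ch in value for ch in ",="):
            raise ValueError(f"unsupported character in annotation {key!r}={value!r}")
    return ",".join(f"{key}={value}" for key, value in tags.items())


header_value = build_annotations(
    {"team": "search", "project": "ranker", "environment": "prod"}
)
print(header_value)
```

The resulting string would be sent as the `x-forii-annotations` header alongside the usual `Authorization` header.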

## Planned

### Advanced dashboard

* Latency percentiles (p50/p90/p99) by model
* Error rate trends
* Cache hit rate visualization
* Rate limit utilization
* Region breakdown
* Annotations filtering
* CSV/JSON export

### Prometheus metrics endpoint

```yaml theme={null}
global:
  scrape_interval: 60s
scrape_configs:
  - job_name: 'forii'
    metrics_path: '/v1/accounts/{account_id}/metrics'
    authorization:
      type: Bearer
      credentials: YOUR_FORII_API_KEY
    static_configs:
      - targets: ['api.forii.in']
    scheme: https
```

### External integrations

| Integration          | How                                 |
| -------------------- | ----------------------------------- |
| Prometheus           | Scrape metrics endpoint             |
| Grafana Cloud        | Direct ingestion                    |
| Datadog              | Agent Prometheus receiver           |
| OpenTelemetry        | Prometheus receiver → OTel exporter |
| LangSmith / Langfuse | Callback/tracing SDK integration    |

## Related

* [Usage API](/docs/observability/usage-api) — Programmatic access to usage data
* [Errors & Rate Limits](/docs/api-reference/errors) — Error codes and headers
* [Dashboard](/docs/webapp/dashboard) — Usage and requests in the UI