Structured Outputs
JSON schema and JSON object modes for guaranteed structured responses
Function Calling
Connect LLMs to external tools, APIs, and databases
Reasoning Models
Control chain-of-thought depth for DeepSeek-R1 and Qwen3
Context & Metadata
Graceful truncation for long prompts, cost attribution metadata
Basic usage
- Python
- JavaScript
- cURL
Response
Parameters
Core parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
model | string | Yes | — | Model ID, e.g. forii/deepseek-v3 |
messages | array | Yes | — | Conversation messages with role and content |
temperature | float | No | 0.7 | Sampling randomness (0 = deterministic, 2 = creative) |
max_tokens | integer | No | 2048 | Maximum tokens in the completion |
top_p | float | No | 1 | Nucleus sampling threshold |
stream | boolean | No | false | Stream tokens as they arrive |
stop | string|array | No | — | Up to 4 stop sequences |
n | integer | No | 1 | Number of completions |
frequency_penalty | float | No | — | -2 to 2 |
presence_penalty | float | No | — | -2 to 2 |
seed | integer | No | — | Deterministic sampling |
logprobs | boolean | No | — | Return log probabilities |
top_logprobs | integer | No | — | 0–5 top log probs per position |
user | string | No | — | End-user identifier |
Structured output & tool parameters
| Parameter | Type | Description |
|---|---|---|
response_format | object | {"type": "json_object"} or {"type": "json_schema", "json_schema": {...}} |
tools | array | Function/tool definitions |
tool_choice | string|object | auto, none, required, or {"type": "function", "name": "..."} |
parallel_tool_calls | boolean | Enable parallel function calls |
Forii extensions
| Parameter | Type | Description |
|---|---|---|
reasoning_effort | string | none | low | medium | high — see Reasoning Models |
context_length_exceeded_behavior | string | "truncate" (default) or "error" — see Context & Metadata |
repetition_penalty | float | 0–2, applies to both prompt and output |
top_k | integer | Top-K sampling |
metadata | object | Key-value string metadata for cost attribution |
Streaming
Stream tokens as they arrive using Server-Sent Events (SSE).- Python
- JavaScript