Responses API Reference | MatterAI Documentation

Authentication

All API requests require authentication using a Bearer token. You can obtain your API key from the MatterAI Console.

Authorization: Bearer MATTERAI_API_KEY

Keep your API key secure and never expose it in client-side code.
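A minimal Python sketch of attaching the Bearer token on the server side, using only the standard library. It assumes the key lives in an environment variable named MATTERAI_API_KEY (a naming choice made here, not something the API requires). `load_api_key` and `build_request` are hypothetical helpers, and the endpoint URL is the one used in the curl examples on this page. The request is only prepared, not sent.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl examples in this reference.
API_URL = "https://api2.matterai.so/v1/responses"


def load_api_key(env=os.environ) -> str:
    """Read the key from the environment (variable name is an assumption)."""
    key = env.get("MATTERAI_API_KEY", "")
    if not key:
        raise RuntimeError("MATTERAI_API_KEY is not set")
    return key


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated POST request."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Sending the prepared request would then be a call such as `urllib.request.urlopen(build_request(payload, load_api_key()))`, run from backend code so the key never reaches a browser.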

Request

model

string

required

The model used for the response. Available models: "axon-2-5-pro", "axon-2-5-mini".

input

string or array

required

Text or array of input items to the model, used to generate a response. Accepts a plain string (equivalent to a "user" message) or an array of input items.

Input item fields (EasyInputMessage):

role

string

required

The role of the message. One of "user", "assistant", "system", or "developer".

content

string or array

required

Text content or an array of content blocks.

Content block fields:

type

string

required

The type of content. Use "input_text" for text content, "input_image" for image URLs.

text

string

The text content (for input_text blocks).

image_url

string

The URL of the image (for input_image blocks).
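To make the nesting of input items and content blocks concrete, here is a sketch of a request body that mixes a plain-string message with a message built from content blocks. The helper name `user_message_with_image` and the image URL are placeholders introduced for illustration. Whether a given model accepts image input is not stated in this reference, so treat that part as an assumption.

```python
import json


def user_message_with_image(text: str, image_url: str) -> dict:
    """Build a "user" input item whose content is an array of content blocks."""
    return {
        "role": "user",
        "content": [
            {"type": "input_text", "text": text},
            {"type": "input_image", "image_url": image_url},
        ],
    }


payload = {
    "model": "axon-2-5-pro",
    "input": [
        # content may also be a plain string
        {"role": "developer", "content": "Answer in one sentence."},
        # placeholder URL, for illustration only
        user_message_with_image("What is in this picture?", "https://example.com/cat.png"),
    ],
}

print(json.dumps(payload, indent=2))
```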

instructions

string

A system (or developer) message inserted into the model’s context. When used with previous_response_id, instructions from a previous response are not carried over to the next response. Equivalent to the "system" role in chat completions.

max_output_tokens

integer

default: 512

An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens.

stream

boolean

default: false

Whether to stream the response as it’s generated using server-sent events.

reasoning

object

Configuration for reasoning capabilities.

Reasoning object fields:

effort

string

default: "medium"

The level of reasoning effort. Options: "none", "low", "medium", "high".

summary

string

The level of reasoning summary. Options: "auto", "concise", "detailed".

temperature

number

default: 0.1

Controls randomness in the output. Higher values make output more random, lower values make it more focused and deterministic. Range: 0.0 to 2.0.

top_p

number

default: 1

Controls diversity via nucleus sampling. Range: 0.0 to 1.0.

text

object

Configuration options for a text response from the model.

Text config object fields:

format

object

An object specifying the format that the model must output.

Format object fields:

type

string

default: "text"

The type of response format. Options: "text", "json_schema", "json_object".

name

string

The name of the response format (for json_schema).

schema

object

The JSON Schema for the response format (for json_schema).

strict

boolean

Whether to enable strict schema adherence (for json_schema).

verbosity

string

Constrains the verbosity of the model’s response. Options: "low", "medium", "high".
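The text config fields above combine into a structured-output request roughly as follows. This is a sketch under assumptions: `json_schema_format` and `parse_structured` are hypothetical helpers, the schema is an invented example, and the reference does not say which JSON Schema keywords strict mode requires.

```python
import json


def json_schema_format(name: str, schema: dict, strict: bool = True) -> dict:
    """Build the `text` object for a json_schema response format."""
    return {
        "format": {
            "type": "json_schema",
            "name": name,
            "schema": schema,
            "strict": strict,
        }
    }


# Invented example schema: an object with two required string fields.
location_schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}, "country": {"type": "string"}},
    "required": ["city", "country"],
    "additionalProperties": False,
}

payload = {
    "model": "axon-2-5-mini",
    "input": "Extract the city and country from: 'I live in Lyon, France.'",
    "text": json_schema_format("location", location_schema),
}


def parse_structured(output_text: str) -> dict:
    """Decode the model's text output, which should be a JSON document."""
    return json.loads(output_text)
```

With a json_schema format the returned text is expected to be JSON, so decoding it with `json.loads` is the natural next step. Validating the decoded object against the schema is left to the caller.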

store

boolean

default: true

Whether to store the generated model response for later retrieval via API.

metadata

object

Set of up to 16 key-value pairs that can be attached to the response. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
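The ranges and limits listed for the request parameters can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not part of the API: `validate_request` is a hypothetical helper, and it covers only the constraints stated in this section (required fields, temperature and top_p ranges, reasoning effort options, metadata limits).

```python
EFFORT_OPTIONS = {"none", "low", "medium", "high"}


def validate_request(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for field in ("model", "input"):
        if field not in payload:
            problems.append(f"{field} is required")
    # Documented ranges: temperature 0.0 to 2.0, top_p 0.0 to 1.0.
    for field, upper in (("temperature", 2.0), ("top_p", 1.0)):
        value = payload.get(field)
        if value is not None and not 0.0 <= value <= upper:
            problems.append(f"{field} must be between 0.0 and {upper}")
    effort = payload.get("reasoning", {}).get("effort")
    if effort is not None and effort not in EFFORT_OPTIONS:
        problems.append("reasoning.effort must be one of none, low, medium, high")
    # Documented metadata limits: 16 pairs, 64-character keys, 512-character values.
    metadata = payload.get("metadata", {})
    if len(metadata) > 16:
        problems.append("metadata allows at most 16 key-value pairs")
    for key, value in metadata.items():
        if len(key) > 64:
            problems.append(f"metadata key {key[:20]!r} exceeds 64 characters")
        if not isinstance(value, str) or len(value) > 512:
            problems.append(f"metadata value for {key[:20]!r} must be a string of at most 512 characters")
    return problems
```

The server remains the authority on what is valid. A check like this only catches obvious mistakes earlier and avoids a round trip that would end in a 400.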

Response

id

string

Unique identifier for this response.

object

string

The object type, which is always "response".

status

string

The status of the response generation. One of "completed", "failed", "in_progress", "cancelled", or "incomplete".

created_at

integer

Unix timestamp (in seconds) of when this response was created.

model

string

The model used to generate the response. Available models: "axon-2-5-pro", "axon-2-5-mini".

output

array

An array of content items generated by the model.

Output item fields (Message):

id

string

The unique ID of the output message.

type

string

The type of the output item. Always "message".

role

string

The role of the output message. Always "assistant".

status

string

The status of the message. One of "in_progress", "completed", "incomplete".

content

array

The content of the output message.

Content block fields:

type

string

The type of content. Values: "output_text", "refusal".

text

string

The text output from the model (for output_text blocks).

annotations

array

Annotations of the text output, such as citations.

output_text

string

SDK-only convenience property containing the aggregated text output from all output_text items in the output array.
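Because output_text is described as SDK-only, code that reads the raw JSON has to aggregate the text itself. A minimal sketch follows. `collect_output_text` is a hypothetical helper, and joining the pieces without a separator is an assumption about how an SDK would aggregate them.

```python
def collect_output_text(response: dict) -> str:
    """Concatenate the text of every output_text block in the output array."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue
        for block in item.get("content", []):
            # "refusal" blocks are skipped; only output_text carries generated text.
            if block.get("type") == "output_text":
                parts.append(block.get("text", ""))
    return "".join(parts)


sample = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {"type": "output_text", "text": "Hello ", "annotations": []},
                {"type": "output_text", "text": "world!", "annotations": []},
            ],
        }
    ]
}

print(collect_output_text(sample))
```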

usage

object

Usage statistics for the response request.

Usage object fields:

input_tokens

integer

Number of tokens in the input.

output_tokens

integer

Number of tokens in the generated output.

total_tokens

integer

Total number of tokens used (input + output).

output_tokens_details

object

A detailed breakdown of the output tokens.

Output tokens details fields:

reasoning_tokens

integer

Number of tokens used for reasoning.

input_tokens_details

object

A detailed breakdown of the input tokens.

Input tokens details fields:

cached_tokens

integer

Number of tokens retrieved from cache.

error

object

An error object returned when the model fails to generate a response.

Error object fields:

code

string

The error code. Possible values: "server_error", "rate_limit_exceeded", "invalid_prompt", etc.

message

string

A human-readable description of the error.

Example Request

curl --location 'https://api2.matterai.so/v1/responses' \
--header 'content-type: application/json' \
--header 'Authorization: Bearer MATTERAI_API_KEY' \
--data '{
  "model": "axon-2-5-pro",
  "input": "Tell me a short story about a curious robot."
}'

Example Response

{
  "id": "resp_abc123def456",
  "object": "response",
  "created_at": 1741476542,
  "status": "completed",
  "model": "axon-2-5-pro",
  "output": [
    {
      "id": "msg_abc123def456",
      "type": "message",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "In a gleaming city of tomorrow, a small robot named Bolt was built to sort packages.",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 27,
    "output_tokens": 94,
    "total_tokens": 121,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "input_tokens_details": {
      "cached_tokens": 0
    }
  }
}

Example: Multi-turn Conversation

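To continue a conversation, pass the id of the previous response as previous_response_id. Instructions are not carried over from the previous response, so send them again if they still apply: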

curl --location 'https://api2.matterai.so/v1/responses' \
--header 'content-type: application/json' \
--header 'Authorization: Bearer MATTERAI_API_KEY' \
--data '{
  "model": "axon-2-5-pro",
  "input": "What happened next?",
  "previous_response_id": "resp_abc123def456"
}'
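The same follow-up request can be assembled programmatically from a parsed response. In this sketch `follow_up` is a hypothetical helper. It reuses the model of the previous response, and it takes instructions as an explicit argument because they are not carried over between responses.

```python
from typing import Optional


def follow_up(previous_response: dict, user_input: str,
              instructions: Optional[str] = None) -> dict:
    """Build the request body for the next turn of a conversation."""
    payload = {
        "model": previous_response["model"],
        "input": user_input,
        "previous_response_id": previous_response["id"],
    }
    if instructions is not None:
        # Not inherited from the previous response, so it must be resent.
        payload["instructions"] = instructions
    return payload


first = {"id": "resp_abc123def456", "model": "axon-2-5-pro", "status": "completed"}
next_payload = follow_up(first, "What happened next?")
print(next_payload)
```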

Example: With Reasoning

curl --location 'https://api2.matterai.so/v1/responses' \
--header 'content-type: application/json' \
--header 'Authorization: Bearer MATTERAI_API_KEY' \
--data '{
  "model": "axon-2-5-pro",
  "instructions": "You are a helpful assistant that explains complex topics simply.",
  "input": "Explain quantum entanglement in one paragraph.",
  "reasoning": {
    "effort": "medium"
  }
}'

Streaming

When stream is set to true, the API returns a stream of Server-Sent Events (SSE). The streaming events use the OpenAI Responses API format:

data: {"type":"response.created","response":{"id":"resp_abc123","object":"response","created_at":1741476542,"status":"in_progress","model":"axon-2-5-pro","output":[],"usage":null}}

data: {"type":"response.in_progress","response":{"id":"resp_abc123","object":"response","created_at":1741476542,"status":"in_progress","model":"axon-2-5-pro","output":[],"usage":null}}

data: {"type":"response.output_item.added","output_index":0,"item":{"id":"msg_abc123","type":"message","status":"in_progress","role":"assistant","content":[]}}

data: {"type":"response.content_part.added","item_id":"msg_abc123","output_index":0,"content_index":0,"part":{"type":"output_text","text":"","annotations":[]}}

data: {"type":"response.output_text.delta","item_id":"msg_abc123","output_index":0,"content_index":0,"delta":"Hello"}

data: {"type":"response.output_text.done","item_id":"msg_abc123","output_index":0,"content_index":0,"text":"Hello world!"}

data: {"type":"response.output_item.done","output_index":0,"item":{"id":"msg_abc123","type":"message","status":"completed","role":"assistant","content":[{"type":"output_text","text":"Hello world!","annotations":[]}]}}

data: {"type":"response.completed","response":{"id":"resp_abc123","object":"response","created_at":1741476542,"status":"completed","model":"axon-2-5-pro","output":[...],"usage":{"input_tokens":10,"output_tokens":12,"total_tokens":22}}}
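A client typically reads the stream line by line, decodes each data: payload as JSON, and appends the delta of every response.output_text.delta event. The sketch below works on any iterable of text lines, so it can be fed from an HTTP response body or from a list in a test. `iter_events` and `accumulate_text` are hypothetical helpers, and the `[DONE]` guard is a defensive assumption, since this reference does not mention such a sentinel.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, or other SSE fields
        data = line[len("data:"):].strip()
        if not data or data == "[DONE]":  # defensive; sentinel not documented here
            continue
        yield json.loads(data)


def accumulate_text(lines: Iterable[str]) -> str:
    """Join the text deltas until the response.completed event arrives."""
    chunks = []
    for event in iter_events(lines):
        if event.get("type") == "response.output_text.delta":
            chunks.append(event["delta"])
        elif event.get("type") == "response.completed":
            break
    return "".join(chunks)


stream = [
    'data: {"type":"response.created","response":{"id":"resp_abc123"}}',
    "",
    'data: {"type":"response.output_text.delta","delta":"Hello"}',
    "",
    'data: {"type":"response.output_text.delta","delta":" world!"}',
    "",
    'data: {"type":"response.completed","response":{"id":"resp_abc123"}}',
]

print(accumulate_text(stream))
```

Accumulating deltas gives the text as it arrives. The response.output_text.done event carries the full text for the content part and can be used as a consistency check.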

Migrating from Chat Completions

The Responses API provides a cleaner interface for text generation. Key differences:

| Chat Completions | Responses |
| --- | --- |
| `POST /v1/chat/completions` | `POST /v1/responses` |
| `messages` array | `input` (string or array) |
| `system` message role | `instructions` string parameter |
| `choices[0].message.content` | `output[].content[].text` or `output_text` |
| `max_tokens` | `max_output_tokens` |
| `finish_reason` | `status` field on response |
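The mapping in the table can be applied mechanically to an existing Chat Completions request body. The sketch below is one way to do it, under the assumption that message contents are plain strings. `chat_to_responses` is a hypothetical helper, and joining multiple system messages with blank lines is a choice made here.

```python
def chat_to_responses(chat_payload: dict) -> dict:
    """Translate a Chat Completions request body into a Responses request body."""
    system_parts = []
    input_items = []
    for message in chat_payload.get("messages", []):
        if message["role"] == "system":
            # system messages become the `instructions` string
            system_parts.append(message["content"])
        else:
            input_items.append({"role": message["role"], "content": message["content"]})

    payload = {"model": chat_payload["model"], "input": input_items}
    if system_parts:
        payload["instructions"] = "\n\n".join(system_parts)
    if "max_tokens" in chat_payload:
        payload["max_output_tokens"] = chat_payload["max_tokens"]
    # Parameters that keep the same name in both request bodies.
    for key in ("temperature", "top_p", "stream"):
        if key in chat_payload:
            payload[key] = chat_payload[key]
    return payload


chat = {
    "model": "axon-2-5-pro",
    "messages": [
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Name one prime number."},
    ],
    "max_tokens": 64,
}

print(chat_to_responses(chat))
```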

Error Responses

The API returns standard HTTP status codes to indicate success or failure:

400 Bad Request: Invalid request parameters or malformed JSON.

401 Unauthorized: Invalid or missing API key.

429 Rate Limited: Too many requests. Please slow down.

500 Internal Server Error: Server error. Please try again later.

Example error response:

{
  "error": {
    "message": "Invalid API key provided",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
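A client can combine the HTTP status code with the error body to decide whether a retry makes sense. The sketch below treats 429 and 500 as retryable and everything else as a caller error. That split, the helper names `parse_error` and `backoff_delays`, and the exponential backoff schedule are assumptions made for illustration, not behavior documented by the API.

```python
import json

RETRYABLE_STATUS = {429, 500}  # assumption: rate limits and server errors may succeed later


def parse_error(status_code: int, body: str) -> dict:
    """Extract code and message from an error body, tolerating malformed JSON."""
    try:
        error = json.loads(body).get("error", {})
    except (json.JSONDecodeError, AttributeError):
        error = {}
    if not isinstance(error, dict):
        error = {}
    return {
        "status": status_code,
        "code": error.get("code"),
        "message": error.get("message", body),
        "retryable": status_code in RETRYABLE_STATUS,
    }


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential delays in seconds: base, 2*base, 4*base, ... capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


body = '{"error": {"message": "Invalid API key provided", "type": "invalid_request_error", "code": "invalid_api_key"}}'
print(parse_error(401, body))
print(backoff_delays(4))
```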
