Prerequisites

  • A ready text or tool model on the Models page
  • An Axionic API key (a default key is created with your account; find it in Settings > API Keys)
The Optimization workspace can generate request snippets for the active model automatically. In the snippet dialog, use Custom values when you want the generated code to use a different model identifier or API key than the one currently active in the app. The model value should match a hosted model name or ID shown in Spectra.
Speech recognition and OCR training runs also appear in Models after completion, but this page covers only models served through the current text-generation APIs.

Quick Test with curl

curl -X POST https://api.axioniclabs.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-api-key: ax_your_key_here" \
  -d '{
    "model": "your-model-name",
    "messages": [
      {"role": "user", "content": "Hello, what can you do?"}
    ],
    "max_tokens": 256
  }'

Using the Python SDK

Install the SDK:
pip install mechanex
Then set your key and model and call generate:
import mechanex as mx

mx.set_key("ax_your_key_here")
mx.set_model("your-model-name")

response = mx.generation.generate(
    prompt="Explain how tool calling works.",
    max_tokens=256,
)
print(response["output"])

Using the OpenAI Python Client

from openai import OpenAI

client = OpenAI(
    api_key="ax_your_key_here",
    base_url="https://api.axioniclabs.ai/v1",
)

response = client.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=256,
)
print(response.choices[0].message.content)

Structured Outputs

If you are coming from OpenAI’s Structured Outputs guide, the main difference is that Axionic does not currently expose OpenAI’s response_format field on /v1/chat/completions. Use one of these Axionic-native paths instead:
  • pass an inline policy or saved policy_id on /v1/chat/completions or /v1/completions when you want to stay inside an OpenAI SDK client
  • call /sampling/generate with method: "guided-generation" when you want direct JSON-schema, regex, or grammar constraints
Guided Generation is model/runtime-dependent. If the target runtime does not support that decoding strategy, Axionic can fall back to a standard decoding path. Test your schema on the exact model you plan to deploy.
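Because a fallback to standard decoding may return text that was never constrained, one way to test a schema against a specific model is to check each response locally. The sketch below is a minimal, hand-rolled check that only covers flat object schemas like the ones on this page (type, required, additionalProperties). The name conforms_to_flat_schema is hypothetical and not part of any Axionic SDK; for nested schemas a full JSON Schema validator would be a better fit.

```python
import json

# Map JSON Schema type names to Python types (flat schemas only).
_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}

# Same schema as the examples on this page.
SCHEMA = {
    "type": "object",
    "properties": {
        "topic": {"type": "string"},
        "confidence": {"type": "number"},
    },
    "required": ["topic", "confidence"],
    "additionalProperties": False,
}


def conforms_to_flat_schema(text, schema):
    """Return (ok, reason) for a raw model response checked against a flat object schema."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"not valid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return False, "root is not a JSON object"
    properties = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in data:
            return False, f"missing required key: {key}"
    if schema.get("additionalProperties") is False:
        extra = sorted(set(data) - set(properties))
        if extra:
            return False, f"unexpected keys: {extra}"
    for key, value in data.items():
        expected = _JSON_TYPES.get(properties.get(key, {}).get("type"))
        if expected is None:
            continue  # key has no declared type we know how to check
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        if isinstance(value, bool) and expected is not bool:
            return False, f"wrong type for {key}"
        if not isinstance(value, expected):
            return False, f"wrong type for {key}"
    return True, "ok"
```

Running a batch of representative prompts through the target model and counting how many responses pass this check gives a rough signal of whether the constraint is actually being applied on that runtime.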

OpenAI client with an inline policy

Pass the policy through extra_body:
import json
from openai import OpenAI

client = OpenAI(
    api_key="ax_your_key_here",
    base_url="https://api.axioniclabs.ai/v1",
)

response = client.chat.completions.create(
    model="your-model-name",
    messages=[
        {
            "role": "system",
            "content": "Extract the message into a typed JSON object.",
        },
        {
            "role": "user",
            "content": "Relayer outage on Osmosis, confidence 0.82",
        },
    ],
    extra_body={
        "policy": {
            "sampling": {
                "method": "guided-generation",
                "temperature": 0.2,
                "top_p": 0.9,
            },
            "constraints": {
                "json_mode": True,
                "json_schema": {
                    "type": "object",
                    "properties": {
                        "topic": {"type": "string"},
                        "confidence": {"type": "number"},
                    },
                    "required": ["topic", "confidence"],
                    "additionalProperties": False,
                },
            },
            "verifiers": {
                "enabled": ["syntax", "json_schema"],
                "repair_on_failure": True,
            },
        }
    },
)

data = json.loads(response.choices[0].message.content)
print(data["topic"], data["confidence"])
The response still comes back as text in response.choices[0].message.content, so parse it yourself with json.loads(...). See Reusable inference policies for the full policy shape.

Direct guided-generation request

This is the same constraint path used by Spectra’s Optimization page when you choose Guided Generation:
curl -X POST https://api.axioniclabs.ai/sampling/generate \
  -H "Content-Type: application/json" \
  -H "x-api-key: ax_your_key_here" \
  -d '{
    "prompt": "Extract the primary topic and confidence from: relayer outage on Osmosis, confidence 0.82",
    "method": "guided-generation",
    "max_new_tokens": 120,
    "params": {
      "temperature": 0.2
    },
    "json_schema": {
      "type": "object",
      "properties": {
        "topic": {"type": "string"},
        "confidence": {"type": "number"}
      },
      "required": ["topic", "confidence"],
      "additionalProperties": false
    }
  }'
For non-JSON constraints, replace json_schema with either regex_pattern or grammar. Use regex_pattern when you need hard matching. Use grammar only for lightweight format guidance; it is not documented as full OpenAI-style grammar-constrained decoding.
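As a rough illustration of the regex path, the sketch below builds the same request body as the curl example with json_schema swapped for regex_pattern, then checks a returned string locally. The pattern, the prompt, and the helper names build_regex_request and matches_pattern are illustrative assumptions; Python's regex dialect may also differ from the one the server uses, so treat the local check as a sanity test only.

```python
import json
import re

# Illustrative pattern: an uppercase severity label, a colon, and a two-decimal confidence.
PATTERN = r"(LOW|MEDIUM|HIGH):(0\.\d{2}|1\.00)"


def build_regex_request(prompt, pattern, max_new_tokens=40):
    """Build a /sampling/generate body that uses regex_pattern in place of json_schema."""
    return {
        "prompt": prompt,
        "method": "guided-generation",
        "max_new_tokens": max_new_tokens,
        "params": {"temperature": 0.2},
        "regex_pattern": pattern,
    }


def matches_pattern(text, pattern):
    """Check a returned string locally, ignoring surrounding whitespace."""
    return re.fullmatch(pattern, text.strip()) is not None


body = build_regex_request(
    "Rate the severity and confidence of: relayer outage on Osmosis",
    PATTERN,
)
# This JSON can be sent as the -d payload of the curl request above.
print(json.dumps(body, indent=2))
```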

Schema guidance and current limitations

  • keep the root schema as a JSON object; the current runtime validation path expects object-shaped JSON responses
  • add additionalProperties: false when you want to reject extra keys instead of only validating required ones
  • unlike OpenAI’s response_format flow, optional fields are allowed here as long as you leave them out of required
  • OpenAI-style fields such as response_format on requests, plus message.parsed and message.refusal on responses, are not part of Axionic’s chat surface today
  • if you need acceptance flags, scores, or trace data instead of just the generated text, use the policy APIs rather than the OpenAI-compatible chat response
OpenAI’s Structured Outputs guide recommends object-rooted schemas, clear field descriptions, and explicit required lists. Those are good practices here as well, but the transport is different: use Axionic policy or /sampling/generate fields rather than OpenAI’s response_format.
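The guidance above can be turned into a small pre-flight check that runs before a schema is sent. This is only a sketch: schema_warnings is a hypothetical helper, and the example schema adds an optional source field (declared in properties but left out of required) purely for illustration.

```python
def schema_warnings(schema):
    """Return a list of warnings for a schema, based on the guidance on this page."""
    warnings = []
    if schema.get("type") != "object":
        warnings.append("root schema should have type 'object'")
    properties = schema.get("properties", {})
    undeclared = [k for k in schema.get("required", []) if k not in properties]
    if undeclared:
        warnings.append(f"required keys not declared in properties: {undeclared}")
    if schema.get("additionalProperties") is not False:
        warnings.append("extra keys are not rejected without additionalProperties: false")
    return warnings


# "source" is optional: it is declared in properties but not listed in required.
EXAMPLE_SCHEMA = {
    "type": "object",
    "properties": {
        "topic": {"type": "string", "description": "Primary topic of the message"},
        "confidence": {"type": "number", "description": "Confidence between 0 and 1"},
        "source": {"type": "string", "description": "Where the message came from"},
    },
    "required": ["topic", "confidence"],
    "additionalProperties": False,
}

for warning in schema_warnings(EXAMPLE_SCHEMA):
    print(warning)
```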

Applying Steering Vectors

Pass steering parameters via extra_body. This snippet reuses the client created in the OpenAI Python Client section:
response = client.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Describe a sunset."}],
    extra_body={
        "steering_vector_id": "sv_abc123",
        "steering_strength": 1.0,
    },
)
Or with curl:
curl -X POST https://api.axioniclabs.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-api-key: ax_your_key_here" \
  -d '{
    "model": "your-model-name",
    "messages": [{"role": "user", "content": "Describe a sunset."}],
    "max_tokens": 256,
    "steering_vector_id": "sv_abc123",
    "steering_strength": 1.0
  }'

SAE behavior requests

SAE-monitored generation uses /sae/generate. Pass the same model identifier used by OpenAI-compatible requests, plus any behavior names staged for the run:
curl -N -X POST https://api.axioniclabs.ai/sae/generate \
  -H "Content-Type: application/json" \
  -H "x-api-key: ax_your_key_here" \
  -d '{
    "model": "your-model-name",
    "prompt": "Explain how this policy handles refunds.",
    "max_new_tokens": 120,
    "behavior_names": ["Honesty"]
  }'
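For callers who prefer Python over curl, the same request can be sketched with the standard library. The -N flag in the curl command disables output buffering, which suggests a streamed response, but the stream format is not described on this page, so the sketch only yields raw lines and does not try to parse them. The helper names build_sae_request and stream_lines are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.axioniclabs.ai/sae/generate"


def build_sae_request(api_key, model, prompt, behavior_names, max_new_tokens=120):
    """Build a POST request mirroring the curl example above."""
    body = {
        "model": model,
        "prompt": prompt,
        "max_new_tokens": max_new_tokens,
        "behavior_names": list(behavior_names),
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def stream_lines(request, timeout=60):
    """Yield decoded lines as they arrive; no particular line format is assumed."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        for raw in response:
            yield raw.decode("utf-8", errors="replace").rstrip("\r\n")


# Usage (performs a network call, so it is left commented out here):
# request = build_sae_request(
#     "ax_your_key_here",
#     "your-model-name",
#     "Explain how this policy handles refunds.",
#     ["Honesty"],
# )
# for line in stream_lines(request):
#     print(line)
```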