# Using Your Model

> Send inference requests to ready text and tool models via the OpenAI-compatible API, the Mechanex Python SDK, or curl, optionally with steering vectors.

## Prerequisites

* A ready text or tool model on the Models page
* An Axionic API key. A default key is created with your account; find it under **Settings > API Keys**.

The Optimization workspace can generate request snippets for the active model automatically. In the snippet dialog, use **Custom values** when you want the generated code to use a different model identifier or API key than the one currently active in the app. The model value should match a hosted model name or ID shown in Spectra.

<Note>
  Speech recognition and OCR training runs appear in Models after completion, but this page covers only models served through the current text-generation APIs.
</Note>

## Quick Test with curl

```bash theme={null}
curl -X POST https://api.axioniclabs.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-api-key: ax_your_key_here" \
  -d '{
    "model": "your-model-name",
    "messages": [
      {"role": "user", "content": "Hello, what can you do?"}
    ],
    "max_tokens": 256
  }'
```
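
For scripts that cannot shell out to curl, the same request can be sketched with Python's standard library. This is a minimal sketch rather than an official client: the `AXIONIC_API_KEY` environment variable and the `build_chat_request` and `send` helper names are assumptions made for illustration, while the URL, header, and body fields mirror the curl command above.

```python
import json
import os
import urllib.request

API_URL = "https://api.axioniclabs.ai/v1/chat/completions"


def build_chat_request(model, user_message, api_key, max_tokens=256):
    """Build (but do not send) the same POST request as the curl command."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


# AXIONIC_API_KEY is a hypothetical variable name; use whatever your setup defines.
api_key = os.environ.get("AXIONIC_API_KEY", "ax_your_key_here")
request = build_chat_request("your-model-name", "Hello, what can you do?", api_key)
# reply = send(request)  # uncomment to make the network call
```

Keeping the key in the environment avoids committing it alongside the script.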

## Using the Python SDK

```bash theme={null}
pip install mechanex
```

```python theme={null}
import mechanex as mx

mx.set_key("ax_your_key_here")
mx.set_model("your-model-name")

response = mx.generation.generate(
    prompt="Explain how tool calling works.",
    max_tokens=256,
)
print(response["output"])
```

## Using the OpenAI Python Client

```python theme={null}
from openai import OpenAI

client = OpenAI(
    api_key="ax_your_key_here",
    base_url="https://api.axioniclabs.ai/v1",
)

response = client.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```

## Structured Outputs

If you are coming from OpenAI's Structured Outputs guide, the main difference is that Axionic does not currently expose OpenAI's `response_format` field on `/v1/chat/completions`.

Use one of these Axionic-native paths instead:

* pass an inline `policy` or saved `policy_id` on `/v1/chat/completions` or `/v1/completions` when you want to stay inside an OpenAI SDK client
* call `/sampling/generate` with `method: "guided-generation"` when you want direct JSON-schema, regex, or grammar constraints

<Note>
  Guided Generation is model/runtime-dependent. If the target runtime does not support that decoding strategy, Axionic can fall back to a standard decoding path. Test your schema on the exact model you plan to deploy.
</Note>

### OpenAI client with an inline policy

Pass the policy through `extra_body`:

```python theme={null}
import json
from openai import OpenAI

client = OpenAI(
    api_key="ax_your_key_here",
    base_url="https://api.axioniclabs.ai/v1",
)

response = client.chat.completions.create(
    model="your-model-name",
    messages=[
        {
            "role": "system",
            "content": "Extract the message into a typed JSON object.",
        },
        {
            "role": "user",
            "content": "Relayer outage on Osmosis, confidence 0.82",
        },
    ],
    extra_body={
        "policy": {
            "sampling": {
                "method": "guided-generation",
                "temperature": 0.2,
                "top_p": 0.9,
            },
            "constraints": {
                "json_mode": True,
                "json_schema": {
                    "type": "object",
                    "properties": {
                        "topic": {"type": "string"},
                        "confidence": {"type": "number"},
                    },
                    "required": ["topic", "confidence"],
                    "additionalProperties": False,
                },
            },
            "verifiers": {
                "enabled": ["syntax", "json_schema"],
                "repair_on_failure": True,
            },
        }
    },
)

data = json.loads(response.choices[0].message.content)
print(data["topic"], data["confidence"])
```

The response still comes back as text in `response.choices[0].message.content`, so parse it yourself with `json.loads(...)`.
See [Reusable inference policies](/products/mechanex/policies) for the full policy shape.
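
Because Guided Generation can fall back to a standard decoding path on some runtimes, it may be worth guarding the `json.loads` call instead of assuming the content is always valid JSON. A minimal sketch, assuming a hypothetical helper named `parse_json_object` that is not part of any Axionic SDK:

```python
import json


def parse_json_object(content):
    """Parse model output and insist on a JSON object at the root."""
    try:
        data = json.loads(content)
    except (TypeError, json.JSONDecodeError) as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError(f"expected a JSON object, got {type(data).__name__}")
    return data


# Stand-in for response.choices[0].message.content
data = parse_json_object('{"topic": "relayer outage", "confidence": 0.82}')
print(data["topic"], data["confidence"])
```

Raising a single exception type lets the caller decide whether to retry, log, or surface the failure.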

### Direct guided-generation request

This is the same constraint path used by Spectra's **Optimization** page when you choose **Guided Generation**:

```bash theme={null}
curl -X POST https://api.axioniclabs.ai/sampling/generate \
  -H "Content-Type: application/json" \
  -H "x-api-key: ax_your_key_here" \
  -d '{
    "prompt": "Extract the primary topic and confidence from: relayer outage on Osmosis, confidence 0.82",
    "method": "guided-generation",
    "max_new_tokens": 120,
    "params": {
      "temperature": 0.2
    },
    "json_schema": {
      "type": "object",
      "properties": {
        "topic": {"type": "string"},
        "confidence": {"type": "number"}
      },
      "required": ["topic", "confidence"],
      "additionalProperties": false
    }
  }'
```

For non-JSON constraints, replace `json_schema` with either `regex_pattern` or `grammar`.
Use `regex_pattern` when you need hard matching. Use `grammar` only for lightweight format guidance; it is not documented as full OpenAI-style grammar-constrained decoding.
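
As a sketch of the `regex_pattern` variant, the snippet below builds a request body for `/sampling/generate` and re-checks a candidate output locally. It assumes `regex_pattern` sits at the top level of the body in place of `json_schema`, as the text above implies. The `build_regex_payload` and `matches` helpers and the ticket-ID pattern are illustrative, not part of the API, and the server's regex dialect may differ from Python's `re` module.

```python
import json
import re

TICKET_PATTERN = r"[A-Z]{2,5}-[0-9]{1,6}"  # illustrative pattern, e.g. "OPS-1234"


def build_regex_payload(prompt, pattern, max_new_tokens=32):
    """Build a guided-generation body constrained by a regular expression."""
    re.compile(pattern)  # raises re.error early if the pattern is malformed
    return {
        "prompt": prompt,
        "method": "guided-generation",
        "max_new_tokens": max_new_tokens,
        "params": {"temperature": 0.2},
        "regex_pattern": pattern,
    }


def matches(output, pattern=TICKET_PATTERN):
    """Client-side check that the whole output satisfies the pattern."""
    return re.fullmatch(pattern, output.strip()) is not None


payload = build_regex_payload(
    "Return the ticket ID mentioned in: see OPS-1234", TICKET_PATTERN
)
print(json.dumps(payload, indent=2))
```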

### Schema guidance and current limitations

* keep the root schema as a JSON object; the current runtime validation path expects object-shaped JSON responses
* add `additionalProperties: false` when you want to reject extra keys instead of only validating required ones
* unlike OpenAI's `response_format` flow, optional fields are allowed here as long as you leave them out of `required`
* OpenAI-style fields such as `response_format` on requests, plus `message.parsed` and `message.refusal` on responses, are not part of Axionic's chat surface today
* if you need acceptance flags, scores, or trace data instead of just the generated text, use the policy APIs rather than the OpenAI-compatible chat response
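
The points about `required` and `additionalProperties` can be illustrated with a schema that has one optional field. This is a sketch only: the `notes` field and the `check_keys` helper are made up for illustration, and the helper checks key names, not value types.

```python
SCHEMA = {
    "type": "object",  # keep the root an object
    "properties": {
        "topic": {"type": "string"},
        "confidence": {"type": "number"},
        "notes": {"type": "string"},  # optional: declared but not in "required"
    },
    "required": ["topic", "confidence"],
    "additionalProperties": False,  # reject keys not listed in "properties"
}


def check_keys(data, schema):
    """Return a list of key-level problems; an empty list means the keys are acceptable."""
    problems = []
    for key in schema.get("required", []):
        if key not in data:
            problems.append(f"missing required key: {key}")
    if schema.get("additionalProperties") is False:
        for key in data:
            if key not in schema.get("properties", {}):
                problems.append(f"unexpected key: {key}")
    return problems


print(check_keys({"topic": "outage", "confidence": 0.82}, SCHEMA))
print(check_keys({"topic": "outage", "confidence": 0.82, "notes": "ok"}, SCHEMA))
print(check_keys({"topic": "outage", "source": "chat"}, SCHEMA))
```

The first two calls return an empty list, since `notes` is optional. The third reports the missing `confidence` key and the unexpected `source` key.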

<Note>
  OpenAI's Structured Outputs guide recommends object-rooted schemas, clear field descriptions, and explicit `required` lists. Those are good practices here as well, but the transport is different: use Axionic `policy` or `/sampling/generate` fields rather than OpenAI's `response_format`.
</Note>

## Applying Steering Vectors

Pass steering parameters via `extra_body`:

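```python theme={null}
from openai import OpenAI

client = OpenAI(
    api_key="ax_your_key_here",
    base_url="https://api.axioniclabs.ai/v1",
)

response = client.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Describe a sunset."}],
    max_tokens=256,
    # extra_body fields are merged into the JSON request body
    extra_body={
        "steering_vector_id": "sv_abc123",
        "steering_strength": 1.0,
    },
)
print(response.choices[0].message.content)
```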

Or with curl:

```bash theme={null}
curl -X POST https://api.axioniclabs.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-api-key: ax_your_key_here" \
  -d '{
    "model": "your-model-name",
    "messages": [{"role": "user", "content": "Describe a sunset."}],
    "max_tokens": 256,
    "steering_vector_id": "sv_abc123",
    "steering_strength": 1.0
  }'
```
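
When tuning a vector, it can help to run the same prompt at several strengths and compare the outputs. The following is a minimal sketch that only builds the request bodies. The `steering_bodies` helper and the strength values are hypothetical, and this page does not specify a valid range for `steering_strength`.

```python
import json


def steering_bodies(model, prompt, vector_id, strengths, max_tokens=256):
    """Build one chat-completions body per steering strength."""
    return [
        {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,
            "steering_vector_id": vector_id,
            "steering_strength": float(strength),
        }
        for strength in strengths
    ]


bodies = steering_bodies(
    "your-model-name", "Describe a sunset.", "sv_abc123", [0.5, 1.0, 1.5]
)
for body in bodies:
    print(json.dumps(body))
```

Each body can be posted to `/v1/chat/completions` as in the curl command, or its two steering fields can be passed through `extra_body` with the OpenAI client.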

## SAE Behavior Requests

SAE-monitored generation uses `/sae/generate`. Pass the same `model` identifier used by OpenAI-compatible requests, plus any behavior names staged for the run:

```bash theme={null}
curl -N -X POST https://api.axioniclabs.ai/sae/generate \
  -H "Content-Type: application/json" \
  -H "x-api-key: ax_your_key_here" \
  -d '{
    "model": "your-model-name",
    "prompt": "Explain how this policy handles refunds.",
    "max_new_tokens": 120,
    "behavior_names": ["Honesty"]
  }'
```
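
The curl command passes `-N`, which disables curl's output buffering and suggests the response may be streamed. Assuming that, a standard-library sketch for reading the response line by line could look like the following. The helper names are hypothetical, and the format of each line is not specified on this page, so lines are yielded as raw text.

```python
import json
import urllib.request

SAE_URL = "https://api.axioniclabs.ai/sae/generate"


def build_sae_request(model, prompt, behavior_names, api_key, max_new_tokens=120):
    """Build (but do not send) the POST request shown in the curl command."""
    body = {
        "model": model,
        "prompt": prompt,
        "max_new_tokens": max_new_tokens,
        "behavior_names": list(behavior_names),
    }
    return urllib.request.Request(
        SAE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def iter_lines(request, timeout=60):
    """Yield non-empty response lines as they arrive (assumes a streamed text response)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        for raw in resp:
            line = raw.decode("utf-8").strip()
            if line:
                yield line


request = build_sae_request(
    "your-model-name",
    "Explain how this policy handles refunds.",
    ["Honesty"],
    "ax_your_key_here",
)
# for line in iter_lines(request):  # uncomment to make the network call
#     print(line)
```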

<Cards>
  <Card title="Steering Vectors" icon="sparkles" href="/products/spectra/vectors">
    Activation steering for behavior control
  </Card>

  <Card title="Observability" icon="activity" href="/products/spectra/observe">
    Behavior detection and monitoring
  </Card>

  <Card title="SDK Reference" icon="book-open" href="/products/mechanex/generation">
    Full generation API reference
  </Card>
</Cards>
