
# Training

> Train custom models in Spectra through distillation or fine-tuning for text generation, speech recognition, and OCR, using supervised fine-tuning (SFT) and, for tool calling, optional GRPO.

Spectra's Training page is split into two paths:

* **Distillation** transfers behavior from a teacher model into a smaller student.
* **Fine-tuning** adapts a selected base model to your own labeled examples.

## Training choices

| Path                             | Use it for                                                             | Data                                               |
| -------------------------------- | ---------------------------------------------------------------------- | -------------------------------------------------- |
| Distillation - Tool Calling      | Teaching a smaller model to call tools correctly                       | Tool schemas, seed prompts, generated trajectories |
| Distillation - Text Generation   | Turning unlabeled inputs into supervised examples with a teacher model | Input-only JSONL or CSV plus named system prompts  |
| Fine-tuning - Text Generation    | Adapting a model to existing input-output examples                     | Pairs JSONL or CSV, with optional augmentation     |
| Fine-tuning - Speech Recognition | Training a Whisper transcription model                                 | `.wav` audio files and a manifest JSONL            |
| Fine-tuning - OCR                | Training a vision-language OCR model                                   | Image ZIP and a manifest JSONL                     |

## Training modes

* **SFT only**: supervised fine-tuning.
* **SFT + RL**: SFT followed by reinforcement learning with GRPO (Group Relative Policy Optimization).

GRPO is available only for the tool-calling distillation path. Text generation, speech recognition, and OCR training use SFT only.

## Distillation workflows

### Tool calling

Use this when you want a smaller student model to learn tool calls and structured behavior from a teacher model.

The training pipeline does not treat natural-language tool descriptions as free-form prompts. Before trajectory generation, Spectra normalizes JSON schemas, OpenAPI imports, and natural-language tool descriptions into canonical typed tool schemas, then trains the model to emit structured tool calls that satisfy those schemas.
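To make the idea of a typed tool schema concrete, here is a minimal Python sketch of checking a structured tool call against a schema. The schema layout (a `name` plus JSON-Schema-style `parameters`), the `get_weather` tool, and the `validate_tool_call` helper are all assumptions made for illustration; Spectra's canonical schema format is not documented on this page.

```python
# Hypothetical typed tool schema in a JSON-Schema-like layout.
# This is an illustrative assumption, not Spectra's canonical format.
GET_WEATHER = {
    "name": "get_weather",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "days": {"type": "integer"},
        },
        "required": ["city"],
    },
}

# Map JSON Schema primitive type names to Python types.
_PY_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate_tool_call(call: dict, schema: dict) -> list:
    """Return a list of problems; an empty list means the call fits the schema."""
    errors = []
    if call.get("name") != schema["name"]:
        errors.append(f"unexpected tool name: {call.get('name')!r}")
    args = call.get("arguments", {})
    params = schema["parameters"]
    for key in params.get("required", []):
        if key not in args:
            errors.append(f"missing required argument: {key}")
    for key, value in args.items():
        spec = params["properties"].get(key)
        if spec is None:
            errors.append(f"unknown argument: {key}")
            continue
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        is_stray_bool = isinstance(value, bool) and spec["type"] != "boolean"
        if is_stray_bool or not isinstance(value, _PY_TYPES[spec["type"]]):
            errors.append(f"argument {key} should be {spec['type']}")
    return errors
```

A call such as `{"name": "get_weather", "arguments": {"city": "Oslo", "days": 3}}` passes this check, while a call that omits `city` or sends `days` as a string is reported.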

<Steps>
  <Step title="Choose teacher and student models">
    Select the teacher model that generates traces and the student model that will be trained.
  </Step>

  <Step title="Define tools">
    Add tools from JSON schema, OpenAPI imports, or natural-language descriptions. Review the active tool list before submitting the run.
  </Step>

  <Step title="Set objectives and generation parameters">
    Configure seed prompts, trajectories, and training objectives that describe how the model should behave.
  </Step>

  <Step title="Tune SFT and optional GRPO settings">
    Adjust epochs, learning rates, batch sizes, and checkpoint cadence based on the run size and available credits.
  </Step>
</Steps>

### Text generation

Use this when you have inputs but want the teacher model to produce the supervised outputs before the student is trained. Upload input-only data, add named system prompts in the app, choose the teacher and student, then start the run.

## Fine-tuning workflows

### Text generation

Use this when you already have a curated text dataset and do not need teacher-generated tool trajectories.

You can:

* upload JSONL or CSV data
* include system prompts that frame each task (per row in your dataset, or named in the app for input-only mode)
* train directly from examples instead of generating tool-calling trajectories
* optionally augment a small pairs dataset before training

### Speech recognition

Use this when you want to fine-tune a Whisper transcription model. The current workflow uses Whisper Large v3, accepts `.wav` uploads, and requires a JSONL manifest where each row points to an uploaded audio file.

Each manifest row uses:

* **`audio`**: uploaded `.wav` filename
* **`text`**: target transcript
* **`language`**: optional ISO language code

Spectra checks the manifest against the uploaded files, rejects archive bundles for audio uploads, flags oversized clips, and can detect or override per-row language values before training starts.
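Before uploading, you can run a similar check locally. The sketch below is a rough approximation under stated assumptions: the `check_speech_manifest` helper is hypothetical, it treats `text` as required (the page only marks `language` as optional), and it does not reproduce Spectra's clip-size or language-detection checks.

```python
import json


def check_speech_manifest(manifest_lines, uploaded_names):
    """Compare manifest rows with uploaded .wav filenames and return a list of problems.

    manifest_lines: iterable of JSONL strings, one manifest row per line.
    uploaded_names: set of filenames you plan to upload.
    """
    problems = []
    for number, line in enumerate(manifest_lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        row = json.loads(line)
        audio = row.get("audio", "")
        if not audio.lower().endswith(".wav"):
            problems.append(f"row {number}: audio must be a .wav filename")
        elif audio not in uploaded_names:
            problems.append(f"row {number}: {audio} was not uploaded")
        # Assumption: an empty or missing transcript is treated as an error.
        if not row.get("text"):
            problems.append(f"row {number}: text is required")
    return problems
```

A valid row looks like `{"audio": "call_001.wav", "text": "Thanks for calling.", "language": "en"}`, where `call_001.wav` is an example filename.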

### OCR

Use this when you want to fine-tune an OCR model on paired images and text. The current workflow uses DeepSeek-OCR, accepts an image ZIP, and requires a JSONL manifest where each row points to an image in that ZIP.

Each manifest row uses:

* **`image`**: image filename inside the ZIP
* **`text`**: target OCR text
* **`prompt`**: optional task prompt

Spectra accepts `jpg`, `jpeg`, `png`, `webp`, `bmp`, `tif`, and `tiff` images, checks the ZIP against the manifest, and blocks runs with missing images, unreferenced files, non-image entries, or an empty manifest.
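The blocking conditions above can be approximated locally before you upload. This is a sketch, not Spectra's validator: the `check_ocr_bundle` helper and the demo filenames are hypothetical, and it assumes manifest `image` values match entry names in the ZIP exactly, including any folder prefix.

```python
import io
import json
import zipfile
from pathlib import PurePosixPath

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp", ".tif", ".tiff"}


def check_ocr_bundle(zip_file, manifest_lines):
    """Return a list of problems found when comparing a ZIP with its manifest rows."""
    problems = []
    with zipfile.ZipFile(zip_file) as archive:
        # Skip directory entries; keep file names as stored in the ZIP.
        entries = {name for name in archive.namelist() if not name.endswith("/")}
    rows = [json.loads(line) for line in manifest_lines if line.strip()]
    if not rows:
        problems.append("manifest is empty")
    referenced = {row.get("image", "") for row in rows}
    for name in sorted(referenced - entries):
        problems.append(f"missing image: {name}")
    for name in sorted(entries - referenced):
        problems.append(f"unreferenced file: {name}")
    for name in sorted(entries):
        if PurePosixPath(name).suffix.lower() not in IMAGE_EXTENSIONS:
            problems.append(f"non-image entry: {name}")
    return problems


# Demo with an in-memory ZIP so the sketch runs without files on disk.
demo_zip = io.BytesIO()
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("page1.png", b"fake image bytes")
    archive.writestr("notes.txt", b"not an image")
demo_zip.seek(0)
demo_manifest = [
    '{"image": "page1.png", "text": "Invoice 42"}',
    '{"image": "page2.png", "text": "Total due"}',
]
demo_problems = check_ocr_bundle(demo_zip, demo_manifest)
```

In the demo, `page2.png` is listed in the manifest but absent from the ZIP, and `notes.txt` is both unreferenced and not an image, so all three kinds of problem are reported.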

## Text dataset formats

Spectra accepts `.jsonl` and `.csv` uploads for text-mode training.

### Input-output pairs

Use this format when every row already includes the per-row instruction, the user input, and the target output.

Each row needs three fields:

* **`system_prompt`** — the per-row instruction that tells the model how to behave on this example.
* **`input`** — the user-side message the model receives.
* **`output`** — the target assistant response the model should produce.

Accepted JSONL shapes — pick **one** of the following per row:

<CodeGroup>
  ```json Flat shape
  {"system_prompt": "You are a concise assistant.", "input": "Summarize the following note", "output": "A short summary"}
  ```

  ```json Chat shape
  {"messages": [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize the following note"},
    {"role": "assistant", "content": "A short summary"}
  ]}
  ```
</CodeGroup>

Accepted CSV shape — three columns:

| system\_prompt                                   | input                            | output                |
| ------------------------------------------------ | -------------------------------- | --------------------- |
| You are a concise assistant.                     | Summarize the following note     | A short summary       |
| You rewrite text in a polite, professional tone. | Rewrite this email more politely | A more polite version |

In the file, that looks like:

```csv
system_prompt,input,output
"You are a concise assistant.","Summarize the following note","A short summary"
"You rewrite text in a polite, professional tone.","Rewrite this email more politely","A more polite version"
```
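If you generate the CSV programmatically, a CSV library handles the quoting for you, which matters when a field contains a comma (as in the second system prompt). The following sketch uses Python's standard `csv` module; the `write_pairs_csv` helper name is an assumption for illustration. With default settings the writer quotes only the fields that need it, which is equivalent to the fully quoted file shown above.

```python
import csv
import io

rows = [
    {
        "system_prompt": "You are a concise assistant.",
        "input": "Summarize the following note",
        "output": "A short summary",
    },
    {
        "system_prompt": "You rewrite text in a polite, professional tone.",
        "input": "Rewrite this email more politely",
        "output": "A more polite version",
    },
]


def write_pairs_csv(rows):
    """Serialize pairs rows to CSV text with the three expected column headers."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["system_prompt", "input", "output"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = write_pairs_csv(rows)
# Reading the text back confirms that the comma inside the prompt survived.
round_trip = list(csv.DictReader(io.StringIO(csv_text)))
```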

Rules:

* `input` is required
* `output` is required in **Input-output pairs** mode
* `system_prompt` is required per row in **Input-output pairs** mode (for flat JSONL and CSV shapes)
* if you use `messages`, the array must include a `system` message, a `user` message, **and** an `assistant` message in pairs mode
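The rules above can be expressed as a small pre-upload check. This is a sketch under assumptions: `normalize_pair_row` is a hypothetical helper, and for chat-shaped rows it uses the first message of each role, since this page does not describe how multi-turn conversations are handled.

```python
def normalize_pair_row(row):
    """Return (system_prompt, input, output) for a flat or chat-shaped pairs row.

    Raises ValueError when a required field or role is missing or empty.
    """
    if "messages" in row:
        by_role = {}
        for message in row["messages"]:
            # Assumption: keep only the first message seen for each role.
            by_role.setdefault(message["role"], message["content"])
        missing = [role for role in ("system", "user", "assistant") if not by_role.get(role)]
        if missing:
            raise ValueError(f"messages missing roles: {', '.join(missing)}")
        return by_role["system"], by_role["user"], by_role["assistant"]

    missing = [key for key in ("system_prompt", "input", "output") if not row.get(key)]
    if missing:
        raise ValueError(f"row missing fields: {', '.join(missing)}")
    return row["system_prompt"], row["input"], row["output"]
```

Both example shapes shown earlier normalize to the same triple, so you can mix a check like this into a script that reads either JSONL shape.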

### Input only

Use this format when you only have user inputs and want Spectra to generate the matching outputs for you.

How it works:

* You upload a dataset that contains **only `input` rows** (no outputs, no per-row instructions).
* In the app, you add one or more **named system prompts** — each one is a different instruction style you want the model to learn.
* The teacher model combines each input with each named system prompt and generates the matching output. The resulting (input, output) pairs become your training data.

For example, 100 inputs × 2 named system prompts produce 200 training pairs.
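The pairing is a cross product of inputs and named system prompts. The sketch below shows only the counting logic; `build_teacher_requests`, the prompt names, and the request dictionary layout are assumptions for illustration, and the actual teacher calls happen inside Spectra.

```python
from itertools import product


def build_teacher_requests(inputs, named_prompts):
    """Pair every input with every named system prompt.

    inputs: list of user input strings.
    named_prompts: dict mapping a prompt name to its system prompt text.
    """
    return [
        {"prompt_name": name, "system_prompt": text, "input": user_input}
        for user_input, (name, text) in product(inputs, named_prompts.items())
    ]


inputs = [f"Note number {i}" for i in range(100)]
named_prompts = {
    "concise": "You are a concise assistant.",
    "polite": "You rewrite text in a polite, professional tone.",
}
requests = build_teacher_requests(inputs, named_prompts)
```

Here `len(requests)` equals `len(inputs) * len(named_prompts)`, so adding a third named prompt to the same 100 inputs would yield 300 pairs.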

Accepted JSONL shapes — pick **one** of the following per row:

<CodeGroup>
  ```json Flat shape
  {"input": "Summarize the following note"}
  ```

  ```json Chat shape
  {"messages": [
    {"role": "user", "content": "Rewrite this email more politely"}
  ]}
  ```
</CodeGroup>

Accepted CSV shape — one column:

| input                            |
| -------------------------------- |
| Summarize the following note     |
| Rewrite this email more politely |

In the file, that looks like:

```csv
input
"Summarize the following note"
"Rewrite this email more politely"
```

Rules:

* `input` is required
* do not include `output` in **Input only** mode
* after uploading input-only data, add at least one **named system prompt** in the app — without it, training cannot start

### Field-name behavior

Field names are matched case-insensitively, so `input`, `INPUT`, and `Input` are all accepted. The same rule applies to CSV column headers and JSON keys.

Spectra does **not** accept alternate names like `prompt`, `response`, `question`, or `answer`. Rename those columns or JSON keys to `input` and `output` first.
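If your existing text data uses other names, a short script can rename the keys before upload. This sketch is an assumption-laden example: `normalize_keys` is a hypothetical helper, and the `RENAMES` mapping reflects one plausible reading of your own data (adjust it to match what your columns actually contain). It applies to text datasets only; `prompt` is a valid field in OCR manifests and should not be renamed there.

```python
# Hypothetical mapping from alternate names to the accepted field names.
RENAMES = {
    "prompt": "input",
    "question": "input",
    "response": "output",
    "answer": "output",
}


def normalize_keys(row, renames=RENAMES):
    """Lower-case field names and rename known alternates to accepted names."""
    normalized = {}
    for key, value in row.items():
        name = key.strip().lower()
        name = renames.get(name, name)
        if name in normalized:
            # Two source fields collapsed to one name, e.g. "prompt" and "question".
            raise ValueError(f"duplicate field after renaming: {name}")
        normalized[name] = value
    return normalized
```

For instance, `normalize_keys({"Question": "What is 2 + 2?", "ANSWER": "4"})` returns a row keyed by `input` and `output`.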

## Private model storage

Finished models are stored privately. Your current private-model allowance depends on your account tier and appears on [Billing](/products/spectra/billing).

## Teacher API keys

Add your own **OpenAI**, **Anthropic**, or **Google AI** API keys in [Settings](/products/spectra/settings) to run teacher models with your own credentials. Bring-your-own-key (BYOK) usage is not billed by Spectra.

## After training

When a text or tool run succeeds, the model appears in [Models](/products/spectra/models) and becomes available for runtime testing and inference. The next step is usually [Optimization](/products/spectra/optimization), where you can attach vectors, test generations, and produce API-ready request snippets.

Speech recognition and OCR runs also appear in Models when complete. Use Models to review their status and training artifacts; they are not called through the current text-generation API path.