Spectra’s Training page is split into two paths:
  • Distillation transfers behavior from a teacher model into a smaller student.
  • Fine-tuning adapts a selected base model to your own labeled examples.

Training choices

Path | Use it for | Data
Distillation - Tool Calling | Teaching a smaller model to call tools correctly | Tool schemas, seed prompts, generated trajectories
Distillation - Text Generation | Turning unlabeled inputs into supervised examples with a teacher model | Input-only JSONL or CSV plus named system prompts
Fine-tuning - Text Generation | Adapting a model to existing input-output examples | Pairs JSONL or CSV, with optional augmentation
Fine-tuning - Speech Recognition | Training a Whisper transcription model | .wav audio files and a manifest JSONL
Fine-tuning - OCR | Training a vision-language OCR model | Image ZIP and a manifest JSONL

Training modes

  • SFT only: supervised fine-tuning.
  • SFT + RL: SFT followed by GRPO reinforcement learning.
GRPO is only available for the tool-calling distillation path. Text, speech recognition, and OCR training use SFT.

Distillation workflows

Tool calling

Use this when you want a smaller student model to learn tool calls and structured behavior from a teacher model. The training pipeline does not treat natural-language tool descriptions as free-form prompts. Before trajectory generation, Spectra normalizes JSON schemas, OpenAPI imports, and natural-language tool descriptions into canonical typed tool schemas, then trains the model to emit structured tool calls that satisfy those schemas.
1. Choose teacher and student models
Select the teacher model that generates traces and the student model that will be trained.

2. Define tools
Add tools from JSON schema, OpenAPI imports, or natural-language descriptions. Review the active tool list before submitting the run.

3. Set objectives and generation parameters
Configure seed prompts, trajectories, and training objectives that describe how the model should behave.

4. Tune SFT and optional GRPO settings
Adjust epochs, learning rates, batch sizes, and checkpoint cadence based on the run size and available credits.
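To make "structured tool calls that satisfy those schemas" concrete, here is a minimal Python sketch of checking one tool call against a typed schema. The `GET_WEATHER` definition, the `name`/`arguments` call layout, and the `check_tool_call` helper are all hypothetical and written in a generic JSON Schema style; they are not Spectra's canonical format or its validator.

```python
# Hypothetical tool definition in JSON Schema style (an assumption for
# illustration, not Spectra's canonical typed schema).
GET_WEATHER = {
    "name": "get_weather",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "days": {"type": "integer"},
        },
        "required": ["city"],
    },
}

# Map the JSON Schema type names used above to Python types.
_JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def check_tool_call(call: dict, tool: dict) -> list[str]:
    """Return a list of problems; an empty list means the call fits the schema."""
    if call.get("name") != tool["name"]:
        return [f"unknown tool: {call.get('name')!r}"]
    params = tool["parameters"]
    args = call.get("arguments", {})
    problems = []
    for key in params.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    for key, value in args.items():
        spec = params["properties"].get(key)
        if spec is None:
            problems.append(f"unexpected argument: {key}")
            continue
        expected = _JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        is_stray_bool = isinstance(value, bool) and spec["type"] != "boolean"
        if is_stray_bool or not isinstance(value, expected):
            problems.append(f"argument {key} should be {spec['type']}")
    return problems
```

A call such as `{"name": "get_weather", "arguments": {"city": "Oslo", "days": 3}}` passes this check, while one that omits `city` or sends `days` as a string is reported.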

Text generation

Use this when you have inputs but want the teacher model to produce the supervised outputs before the student is trained. Upload input-only data, add named system prompts in the app, choose the teacher and student, then start the run.

Fine-tuning workflows

Text generation

Use this when you already have a curated text dataset and do not need teacher-generated tool trajectories. You can:
  • upload JSONL or CSV data
  • include system prompts that frame each task (per row in your dataset, or named in the app for input-only mode)
  • train directly from examples instead of generating tool-calling trajectories
  • optionally augment a small pairs dataset before training

Speech recognition

Use this when you want to fine-tune a Whisper transcription model. The current workflow uses Whisper Large v3, accepts .wav uploads, and requires a JSONL manifest where each row points to an uploaded audio file. Each manifest row uses:
  • audio: uploaded .wav filename
  • text: target transcript
  • language: optional ISO language code
Spectra checks the manifest against the uploaded files, rejects archive bundles for audio uploads, flags oversized clips, and can detect or override per-row language values before training starts.
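Before uploading, you may want to run a similar consistency check locally. The sketch below is a hypothetical pre-flight helper (`check_audio_manifest` is not part of Spectra) that verifies each manifest row names a `.wav` file you plan to upload and has a non-empty transcript. It does not check clip length or language codes.

```python
import json


def check_audio_manifest(manifest_text: str, uploaded: set[str]) -> list[str]:
    """Return human-readable problems; an empty list means no issues were found."""
    problems = []
    for n, line in enumerate(manifest_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"row {n}: not valid JSON")
            continue
        if not isinstance(row, dict):
            problems.append(f"row {n}: expected a JSON object")
            continue
        audio = row.get("audio")
        if not isinstance(audio, str) or not audio.lower().endswith(".wav"):
            problems.append(f"row {n}: audio must name a .wav file")
        elif audio not in uploaded:
            problems.append(f"row {n}: {audio} was not uploaded")
        if not str(row.get("text") or "").strip():
            problems.append(f"row {n}: text is empty")
    return problems


# Example manifest: two rows, the second without the optional language field.
MANIFEST = (
    '{"audio": "a.wav", "text": "hello there", "language": "en"}\n'
    '{"audio": "b.wav", "text": "good morning"}\n'
)
```

Calling `check_audio_manifest(MANIFEST, {"a.wav"})` reports that `b.wav` is referenced but missing from the upload set.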

OCR

Use this when you want to fine-tune an OCR model on paired images and text. The current workflow uses DeepSeek-OCR, accepts an image ZIP, and requires a JSONL manifest where each row points to an image in that ZIP. Each manifest row uses:
  • image: image filename inside the ZIP
  • text: target OCR text
  • prompt: optional task prompt
Spectra accepts jpg, jpeg, png, webp, bmp, tif, and tiff images, checks the ZIP against the manifest, and blocks runs with missing images, unreferenced files, non-image entries, or an empty manifest.
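The blocking conditions above can be approximated locally before you upload. The following is a rough sketch under stated assumptions: `check_ocr_bundle` and `build_demo_zip` are hypothetical helpers, and the matching logic (exact filename comparison, extension-based image detection) is a guess at reasonable behavior rather than Spectra's actual validation.

```python
import io
import json
import zipfile
from pathlib import PurePosixPath

# Extensions listed in the documentation above.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp", ".tif", ".tiff"}


def check_ocr_bundle(zip_file, manifest_text: str) -> list[str]:
    """Compare a ZIP of images with a JSONL manifest and list any problems."""
    rows = [json.loads(line) for line in manifest_text.splitlines() if line.strip()]
    if not rows:
        return ["manifest is empty"]
    with zipfile.ZipFile(zip_file) as zf:
        entries = {info.filename for info in zf.infolist() if not info.is_dir()}
    referenced = {row["image"] for row in rows if row.get("image")}
    problems = []
    for name in sorted(entries):
        if PurePosixPath(name).suffix.lower() not in IMAGE_EXTENSIONS:
            problems.append(f"non-image entry: {name}")
        elif name not in referenced:
            problems.append(f"unreferenced file: {name}")
    for name in sorted(referenced - entries):
        problems.append(f"missing image: {name}")
    return problems


def build_demo_zip() -> io.BytesIO:
    """Build a small in-memory ZIP so the check can be tried without real files."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.png", b"placeholder")
        zf.writestr("b.jpg", b"placeholder")
        zf.writestr("notes.txt", b"placeholder")
    buf.seek(0)
    return buf


DEMO_MANIFEST = (
    '{"image": "a.png", "text": "Invoice 1"}\n'
    '{"image": "c.png", "text": "Invoice 2", "prompt": "Transcribe the invoice."}\n'
)
```

Running `check_ocr_bundle(build_demo_zip(), DEMO_MANIFEST)` flags `b.jpg` as unreferenced, `notes.txt` as a non-image entry, and `c.png` as missing.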

Text dataset formats

Spectra accepts .jsonl and .csv uploads for text-mode training.

Input-output pairs

Use this format when every row already includes the per-row instruction, the user input, and the target output. Each row needs three fields:
  • system_prompt — the per-row instruction that tells the model how to behave on this example.
  • input — the user-side message the model receives.
  • output — the target assistant response the model should produce.
Accepted JSONL shapes (pick one per row). The flat shape looks like this:
{"system_prompt": "You are a concise assistant.", "input": "Summarize the following note", "output": "A short summary"}
Accepted CSV shape (three columns):
system_prompt | input | output
You are a concise assistant. | Summarize the following note | A short summary
You rewrite text in a polite, professional tone. | Rewrite this email more politely | A more polite version
In the file, that looks like:
system_prompt,input,output
"You are a concise assistant.","Summarize the following note","A short summary"
"You rewrite text in a polite, professional tone.","Rewrite this email more politely","A more polite version"
Rules:
  • input is required
  • output is required in Input-output pairs mode
  • system_prompt is required per row in Input-output pairs mode (for flat JSONL and CSV shapes)
  • if you use messages, the array must include a system message, a user message, and an assistant message in pairs mode
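If your pairs live in a spreadsheet export, a short script can enforce these rules and emit the flat JSONL shape. This is a minimal sketch: `pairs_csv_to_jsonl` is a hypothetical helper, and it assumes the CSV headers are already spelled `system_prompt`, `input`, and `output`.

```python
import csv
import io
import json

REQUIRED = ("system_prompt", "input", "output")


def pairs_csv_to_jsonl(csv_text: str) -> str:
    """Convert a pairs CSV to flat JSONL, failing on rows with empty required fields."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for n, row in enumerate(reader, start=1):
        missing = [key for key in REQUIRED if not (row.get(key) or "").strip()]
        if missing:
            raise ValueError(f"row {n} is missing: {', '.join(missing)}")
        # Keep only the three required fields, in a fixed order.
        lines.append(json.dumps({key: row[key] for key in REQUIRED}, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)


PAIRS_CSV = (
    "system_prompt,input,output\n"
    '"You are a concise assistant.","Summarize the following note","A short summary"\n'
    '"You rewrite text in a polite, professional tone.","Rewrite this email more politely","A more polite version"\n'
)
```

`pairs_csv_to_jsonl(PAIRS_CSV)` returns two JSONL rows. A row with an empty `output` raises `ValueError` instead of being written silently.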

Input only

Use this format when you only have user inputs and want Spectra to generate the matching outputs for you.
How it works:
  • You upload a dataset that contains only input rows (no outputs, no per-row instructions).
  • In the app, you add one or more named system prompts — each one is a different instruction style you want the model to learn.
  • The teacher model combines each input with each named system prompt and generates the matching output. The resulting (input, output) pairs become your training data.
So 100 inputs × 2 named system prompts = 200 training pairs.
Accepted JSONL shapes (pick one of the following per row):
{"input": "Summarize the following note"}
Accepted CSV shape — one column:
input
Summarize the following note
Rewrite this email more politely
In the file, that looks like:
input
"Summarize the following note"
"Rewrite this email more politely"
Rules:
  • input is required
  • do not include output in Input only mode
  • after uploading input-only data, add at least one named system prompt in the app — without it, training cannot start
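The input-only expansion is a cross product of inputs and named system prompts, which is why the dataset size multiplies. The sketch below only illustrates that counting. `expand_inputs` and the `prompt_name` field are hypothetical, and the teacher's generation step is not shown.

```python
from itertools import product


def expand_inputs(inputs: list[str], named_prompts: dict[str, str]) -> list[dict]:
    """Pair every input with every named system prompt (outputs come later from the teacher)."""
    return [
        {"prompt_name": name, "system_prompt": text, "input": item}
        for item, (name, text) in product(inputs, named_prompts.items())
    ]


# Two illustrative named prompts; the names and wording are assumptions.
NAMED_PROMPTS = {
    "concise": "You are a concise assistant.",
    "polite": "You rewrite text in a polite, professional tone.",
}

requests = expand_inputs([f"input {i}" for i in range(100)], NAMED_PROMPTS)
print(len(requests))  # 100 inputs x 2 named prompts
```

Each resulting request would be sent to the teacher once, so adding a third named prompt to the same 100 inputs would raise the count to 300.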

Field-name behavior

Field names are matched case-insensitively, so input, INPUT, and Input are all accepted. The same rule applies to CSV column headers and JSON keys. Spectra does not accept alternate names like prompt, response, question, or answer. Rename those columns or JSON keys to input and output first.
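If an existing dataset uses one of the rejected names, a small rename pass is usually enough. The mapping below is an assumption about what those columns mean in your data (for example, that `question` holds the user input); adjust it before use. `rename_keys` and `rename_jsonl` are hypothetical helpers.

```python
import json

# Assumed mapping from rejected names to the accepted ones; verify against your data.
RENAMES = {"prompt": "input", "question": "input", "response": "output", "answer": "output"}


def rename_keys(row: dict) -> dict:
    """Lowercase keys and map alternate names to input/output."""
    renamed = {}
    for key, value in row.items():
        lowered = key.strip().lower()
        target = RENAMES.get(lowered, lowered)
        if target in renamed:
            raise ValueError(f"duplicate field after renaming: {target}")
        renamed[target] = value
    return renamed


def rename_jsonl(text: str) -> str:
    """Apply rename_keys to every non-blank JSONL row."""
    return "".join(
        json.dumps(rename_keys(json.loads(line)), ensure_ascii=False) + "\n"
        for line in text.splitlines()
        if line.strip()
    )
```

For instance, `rename_keys({"Question": "What is 2+2?", "ANSWER": "4"})` yields a row keyed by `input` and `output`. Lowercasing is not required, since matching is case-insensitive, but it keeps the file uniform.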

Private model storage

Finished models are stored privately. Your current private-model allowance depends on your account tier and appears on Billing.

Teacher API keys

To use your own teacher credentials, add your OpenAI, Anthropic, or Google AI API keys in Settings. Bring-your-own-key (BYOK) usage is not billed by Spectra.

After training

When a text or tool run succeeds, the model appears in Models and becomes available for runtime testing and inference. The next step is usually Optimization, where you can attach vectors, test generations, and produce API-ready request snippets. Speech recognition and OCR runs also appear in Models when complete. Use Models to review their status and training artifacts; they are not called through the current text-generation API path.