Optimization is the primary workspace for inference-time control. Instead of switching between separate vector, sampling, and testing pages, you configure vectors, sampling, and test generation together here, scoped to the currently selected model.

What lives in Optimization

  • Selected model context so every control is scoped to the model you are actively tuning
  • Vector library and custom vectors in one place
  • Behavior-aware generation using SAE-monitored runtime controls
  • Sampling configuration for decoding strategy and method-specific parameters
  • Test generation to validate the active stack before using it in production
  • API snippet generation so the current configuration can be copied directly into application code

Typical workflow

1. Select a model

Start with the trained or hosted model you want to tune.

2. Attach vectors or behaviors

Enable library vectors, your own vectors, or SAE-monitored behavior rules depending on what you are trying to achieve.

3. Tune decoding

Choose the sampling strategy and adjust its parameters. The same decoding controls apply whether you test a standard run or an SAE behavior-aware run.

4. Run test generations

Validate the current configuration before exporting it into an app or production workflow.

5. Copy the API snippet

Use the generated request example as the starting point for your client integration. You can generate it from the active model and saved API key, or switch to custom values when preparing code for a different model or key.

Test generation

The Test generation panel uses the current model, staged vectors or behaviors, and the sampling rail. Standard and SAE modes expose the same sampling and output-length controls so you can compare behavior without reconfiguring the request. When SAE mode is selected, the panel also shows runtime state for the applied behaviors and whether corrective steering is forced or monitor-only.

API snippets

The snippet dialog can generate cURL, Python, and TypeScript examples. By default, snippets use the active model and the saved API key for the signed-in account. Switch to Custom values in the dialog when you need the same request shape with a different model name or API key. The custom model value is sent as the request-level model hint, so use the hosted model name or ID shown in Spectra.
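
The default-versus-custom choice can be pictured as a small resolution step. The sketch below is illustrative only: `SnippetValues`, `resolve_snippet_values`, and the placeholder model names and key are hypothetical and are not part of Spectra's API. It also assumes each custom value overrides its default independently, which the dialog may or may not do.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SnippetValues:
    """Values substituted into a generated snippet (hypothetical structure)."""
    model: str
    api_key: str


def resolve_snippet_values(
    active: SnippetValues,
    custom_model: Optional[str] = None,
    custom_api_key: Optional[str] = None,
) -> SnippetValues:
    """Return the values a snippet would use.

    Assumption: a blank or missing custom value falls back to the active
    model or saved key; a non-blank custom value replaces it.
    """
    model = (custom_model or "").strip() or active.model
    api_key = (custom_api_key or "").strip() or active.api_key
    return SnippetValues(model=model, api_key=api_key)


# Placeholder values for illustration; use the model name or ID shown in Spectra.
active = SnippetValues(model="my-hosted-model", api_key="SAVED_KEY_PLACEHOLDER")
default_values = resolve_snippet_values(active)
custom_values = resolve_snippet_values(active, custom_model="another-hosted-model")
```

In this sketch the request shape never changes; only the model hint and key that get filled in differ between the default and custom cases.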

Common use cases

Safer or more constrained responses

  • attach a safety-oriented vector
  • lower the creativity-oriented sampling settings
  • optionally enable SAE-monitored behaviors for drift correction

Tone shaping

  • enable a persona or style vector
  • test different strength ranges
  • keep the underlying model unchanged
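
Testing a range of strengths is easier with an evenly spaced sweep prepared in advance. The following is a minimal sketch, not Spectra's configuration format: the `strength_sweep` helper, the `vector` and `strength` keys, the vector name, and the 0.0 to 1.0 range are all assumptions chosen for illustration.

```python
from typing import Any, Dict, List


def strength_sweep(
    vector_name: str, start: float, stop: float, steps: int
) -> List[Dict[str, Any]]:
    """Return one hypothetical vector setting per evenly spaced strength value."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    span = stop - start
    return [
        # Round to avoid floating-point noise in the reported strengths.
        {"vector": vector_name, "strength": round(start + span * i / (steps - 1), 4)}
        for i in range(steps)
    ]


# Five settings to try one by one in the Test generation panel.
configs = strength_sweep("friendly-persona", 0.0, 1.0, 5)
```

Each entry would be run as a separate test generation with the same prompt and sampling settings, so that strength is the only variable changing between runs.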

Structured output

  • switch the decoding method to Guided Generation
  • choose one constraint type: JSON schema, Regex, or Grammar
  • guided generation depends on model/runtime support, so validate the behavior on the exact model you plan to ship
  • when using JSON schema, the UI checks that the schema is valid JSON before sending the request
  • the generated API snippet uses Axionic’s /sampling/generate endpoint with json_schema, regex_pattern, or grammar
  • for strict production formatting, prefer JSON schema or Regex; Grammar is better suited to lightweight format guidance
  • this advanced sampling path does not include the staged steering vector or behavior configuration shown elsewhere in the workspace
  • if your app uses an OpenAI SDK client, translate the same constraint into an inline policy passed through extra_body
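
The one-constraint rule and the JSON validity check can be mirrored on the client before a request is sent. This is a sketch under stated assumptions: only the field names `json_schema`, `regex_pattern`, and `grammar` come from the notes above. The `build_guided_payload` helper, the `prompt` key, and sending the schema as a parsed object rather than a string are hypothetical, so compare the result against the snippet generated in the workspace.

```python
import json
from typing import Any, Dict, Optional


def build_guided_payload(
    prompt: str,
    json_schema: Optional[str] = None,
    regex_pattern: Optional[str] = None,
    grammar: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a request body carrying exactly one guided-generation constraint."""
    constraints = {
        "json_schema": json_schema,
        "regex_pattern": regex_pattern,
        "grammar": grammar,
    }
    chosen = {name: value for name, value in constraints.items() if value is not None}
    if len(chosen) != 1:
        raise ValueError("choose exactly one of json_schema, regex_pattern, or grammar")

    # "prompt" is an assumed field name, not confirmed by the documentation.
    payload: Dict[str, Any] = {"prompt": prompt}
    name, value = next(iter(chosen.items()))
    if name == "json_schema":
        # Mirror the UI check: the schema text must parse as JSON.
        try:
            payload[name] = json.loads(value)
        except json.JSONDecodeError as exc:
            raise ValueError(f"json_schema is not valid JSON: {exc}") from exc
    else:
        payload[name] = value
    return payload


schema_text = (
    '{"type": "object", "properties": {"city": {"type": "string"}}, '
    '"required": ["city"]}'
)
payload = build_guided_payload("Name a city.", json_schema=schema_text)
```

Parsing the schema only confirms that it is well-formed JSON, which matches the UI check described above. It does not confirm that the model or runtime supports guided generation, so a test generation on the target model is still needed.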