# Steering vectors with the Mechanex SDK

> Use the Mechanex SDK to compute steering vectors with CAA or few-shot methods, evaluate effectiveness, and save or load vectors for reuse.

## `steering.generate_vectors()`

Computes a steering vector from prompts paired with positive example completions and, for CAA, negative ones. Returns a vector ID that can be passed to `generation.generate()`.

<ParamField body="prompts" type="list[str]" required>
  Seed text that precedes each answer (e.g., `["I tell the...", "My statement is..."]`).
</ParamField>

<ParamField body="positive_answers" type="list[str]" required>
  Completions that demonstrate the desired behavior (e.g., `[" truth", " factual"]`).
</ParamField>

<ParamField body="negative_answers" type="list[str]">
  Completions to contrast against for CAA (e.g., `[" lie", " false"]`). Required for `method="caa"`. Ignored for `method="few-shot"`.
</ParamField>

<ParamField body="layer_idxs" type="list[int]">
  Layer indices to capture activations from. Defaults to a model-appropriate selection if omitted.
</ParamField>

<ParamField body="method" type="string" default="few-shot">
  Vector computation method: `"few-shot"`, `"caa"` (Contrastive Activation Addition), or `"steering-perceptrons"` (remote-only; not supported for local execution).
</ParamField>

<ParamField body="name" type="string">
  A display name for the vector (used in the Spectra UI and API responses).
</ParamField>

<ParamField body="label" type="string">
  A label/category for organizing the vector.
</ParamField>

**Returns**: A vector ID string.
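
The parameter rules above can be checked before a call is made. The sketch below is a hypothetical client-side helper, not part of the Mechanex SDK: `check_vector_args` and its `local` flag are names assumed here for illustration, and it assumes that empty prompt or answer lists are never intended.

```python
# Hypothetical pre-flight check; not part of the Mechanex SDK.
VALID_METHODS = {"few-shot", "caa", "steering-perceptrons"}


def check_vector_args(prompts, positive_answers, negative_answers=None,
                      method="few-shot", local=False):
    """Return a list of problems found in generate_vectors() arguments."""
    problems = []
    if method not in VALID_METHODS:
        problems.append(f"unknown method: {method!r}")
    if not prompts:
        problems.append("prompts is required")
    if not positive_answers:
        problems.append("positive_answers is required")
    # CAA contrasts positive against negative completions, so it needs both.
    if method == "caa" and not negative_answers:
        problems.append('negative_answers is required for method="caa"')
    # The parameter docs describe steering-perceptrons as remote-only.
    if method == "steering-perceptrons" and local:
        problems.append("steering-perceptrons is not supported locally")
    return problems
```

An empty list means the arguments satisfy the documented constraints; anything else can be surfaced to the caller before a request is sent.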

<Tabs>
  <Tab title="CAA">
    Contrastive Activation Addition computes the directional difference between positive and negative activations. It is the most precise method when you have both types of examples.

    ```python theme={null}
    import mechanex as mx

    vector_id = mx.steering.generate_vectors(
        prompts=["I tell the...", "My statement is..."],
        positive_answers=[" truth", " factual", " correct"],
        negative_answers=[" lie", " false", " wrong"],
        method="caa",
        name="Honesty",
    )
    print(vector_id)
    ```
  </Tab>

  <Tab title="Few-Shot">
    Few-Shot optimizes a steering direction from positive examples only. It is simpler to set up and works well when negative examples are unavailable.

    ```python theme={null}
    import mechanex as mx

    vector_id = mx.steering.generate_vectors(
        prompts=["The response was...", "I would say..."],
        positive_answers=[" helpful", " accurate", " clear"],
        method="few-shot",
    )
    ```
  </Tab>
</Tabs>
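
For intuition, the "directional difference" that CAA relies on is commonly computed as the mean positive activation minus the mean negative activation at a given layer. The sketch below illustrates that idea with plain Python lists and toy numbers. It is an assumption about the general technique, not a description of how Mechanex computes vectors internally, and `caa_direction` is a hypothetical name.

```python
def mean_vector(rows):
    """Element-wise mean of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def caa_direction(positive_acts, negative_acts):
    """Mean positive activation minus mean negative activation."""
    pos_mean = mean_vector(positive_acts)
    neg_mean = mean_vector(negative_acts)
    return [p - n for p, n in zip(pos_mean, neg_mean)]


# Toy 3-dimensional activations, invented for illustration only.
positive = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
negative = [[0.0, 2.0, 1.0], [0.0, 2.0, 3.0]]
direction = caa_direction(positive, negative)
print(direction)  # [2.0, 0.0, -2.0]
```

Dimensions where both groups agree cancel to zero, so the resulting direction keeps only what separates the positive examples from the negative ones.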

## `steering.generate_pairs()`

Generates contrastive example pairs automatically using an LLM, given a persona description. Useful for bootstrapping a dataset before computing vectors.

<ParamField body="persona_name" type="string" required>
  Short name for the persona (e.g., "Empathetic Support Agent").
</ParamField>

<ParamField body="persona_description" type="string" required>
  Description of the desired behavioral traits.
</ParamField>

<ParamField body="num_pairs" type="integer" default="10">
  Number of contrastive pairs to generate.
</ParamField>

<ParamField body="batch_size" type="integer" default="5">
  Pairs generated per batch.
</ParamField>

**Returns**: A dict with `persona`, `total_pairs`, `pairs`, and `avg_final_score`.

```python theme={null}
import mechanex as mx

result = mx.steering.generate_pairs(
    persona_name="Honesty",
    persona_description="The model always provides truthful, accurate information.",
    num_pairs=20,
)
# Use the generated pairs to compute a vector
vector_id = mx.steering.generate_vectors(
    prompts=[p["prompt"] for p in result["pairs"]],
    positive_answers=[p["positive_answer"] for p in result["pairs"]],
    negative_answers=[p["negative_answer"] for p in result["pairs"]],
    method="caa",
)
```
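
The list comprehensions above assume every generated pair carries all three keys. A minimal sketch of a more defensive version follows. `split_pairs` is a hypothetical helper, and the key names are taken from the example above rather than from a published schema.

```python
def split_pairs(pairs):
    """Split pair dicts into the three parallel lists generate_vectors() takes."""
    keys = ("prompt", "positive_answer", "negative_answer")
    prompts, positives, negatives = [], [], []
    for i, pair in enumerate(pairs):
        missing = [k for k in keys if k not in pair]
        if missing:
            raise ValueError(f"pair {i} is missing keys: {missing}")
        prompts.append(pair["prompt"])
        positives.append(pair["positive_answer"])
        negatives.append(pair["negative_answer"])
    return prompts, positives, negatives


# Invented sample shaped like result["pairs"].
sample = [
    {"prompt": "I tell the...", "positive_answer": " truth", "negative_answer": " lie"},
    {"prompt": "My statement is...", "positive_answer": " factual", "negative_answer": " false"},
]
prompts, positives, negatives = split_pairs(sample)
```

Failing early on a malformed pair keeps the three lists the same length, which matters because they are consumed position by position.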

## `steering.evaluate()`

Evaluates a steering vector's effectiveness using cosine similarity metrics and LLM-as-judge scoring.

<ParamField body="steering_vector_id" type="string" required>
  The vector ID to evaluate.
</ParamField>

<ParamField body="positive_texts" type="list[str]" required>
  Texts representing the desired behavior.
</ParamField>

<ParamField body="negative_texts" type="list[str]" required>
  Texts representing the undesired behavior.
</ParamField>

<ParamField body="test_prompts" type="list[str]">
  Prompts to generate steered completions for judge evaluation.
</ParamField>

<ParamField body="strength" type="float" default="1.0">
  Steering strength during evaluation.
</ParamField>

**Returns**: A dict with `cosine_metrics` and `judge_evaluation`.
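
Since `strength` is the one tunable parameter here, a natural use is to compare results at several values. The sketch below takes the evaluation function as an argument so that it runs without the SDK installed; in practice `mx.steering.evaluate` would be passed in, assuming it accepts the documented parameters as keyword arguments. `sweep_strengths` and `fake_evaluate` are hypothetical, and the stub returns empty placeholders rather than real metrics.

```python
def sweep_strengths(evaluate_fn, vector_id, positive_texts, negative_texts,
                    strengths=(0.5, 1.0, 2.0), test_prompts=None):
    """Call evaluate_fn once per strength and collect the results by strength."""
    results = {}
    for strength in strengths:
        kwargs = dict(
            steering_vector_id=vector_id,
            positive_texts=positive_texts,
            negative_texts=negative_texts,
            strength=strength,
        )
        # test_prompts is optional, so only forward it when provided.
        if test_prompts is not None:
            kwargs["test_prompts"] = test_prompts
        results[strength] = evaluate_fn(**kwargs)
    return results


def fake_evaluate(**kwargs):
    """Stand-in for mx.steering.evaluate; returns only the documented top-level keys."""
    return {"cosine_metrics": {}, "judge_evaluation": {}}


results = sweep_strengths(
    fake_evaluate,
    vector_id="example-vector-id",  # placeholder, not a real vector ID
    positive_texts=["I always tell the truth."],
    negative_texts=["I often make things up."],
)
print(sorted(results))  # [0.5, 1.0, 2.0]
```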

## Utilities

Load examples from a JSONL file, or persist vectors to disk for reuse:

```python theme={null}
import mechanex as mx

# Compute from a JSONL file (one {"prompt", "positive_answer", "negative_answer"} object per line)
vector_id = mx.steering.generate_from_jsonl(dataset_path="examples.jsonl", method="caa")

# Save to disk and reload in a later session
mx.steering.save_vectors(vector_id, path="honesty.json")
local_vec = mx.steering.load_vectors("honesty.json")

# Retrieve a cached in-memory vector by ID
local_vec = mx.steering.get_vectors(vector_id)
```

Loaded vectors can be passed directly to `generation.generate()` as the `steering_vector` parameter.
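
The JSONL layout mentioned above (one object per line with `prompt`, `positive_answer`, and `negative_answer`) can be produced with the standard library alone. This is a minimal sketch: `write_pairs_jsonl` is a hypothetical helper, and the sample pairs and temporary path are invented for illustration.

```python
import json
import os
import tempfile


def write_pairs_jsonl(pairs, path):
    """Write one JSON object per line with the three expected keys."""
    with open(path, "w", encoding="utf-8") as f:
        for pair in pairs:
            record = {
                "prompt": pair["prompt"],
                "positive_answer": pair["positive_answer"],
                "negative_answer": pair["negative_answer"],
            }
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


pairs = [
    {"prompt": "I tell the...", "positive_answer": " truth", "negative_answer": " lie"},
    {"prompt": "My statement is...", "positive_answer": " factual", "negative_answer": " false"},
]
path = os.path.join(tempfile.mkdtemp(), "examples.jsonl")
write_pairs_jsonl(pairs, path)

# Read the file back to confirm each line parses on its own.
with open(path, encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
```

The resulting file path is what would be handed to `generate_from_jsonl()` as `dataset_path`.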
