

steering.generate_vectors()

Computes a steering vector from positive (and optionally negative) example pairs. Returns a vector ID that can be passed to generation.generate().
prompts (list[str], required): Seed text that precedes each answer (e.g., ["I tell the...", "My statement is..."]).
positive_answers (list[str], required): Completions that demonstrate the desired behavior (e.g., [" truth", " factual"]).
negative_answers (list[str]): Completions to contrast against for CAA (e.g., [" lie", " false"]). Required for method="caa"; ignored for method="few-shot".
layer_idxs (list[int]): Layer indices to capture activations from. Defaults to a model-appropriate selection if omitted.
method (string, default: "few-shot"): Vector computation method: "few-shot", "caa" (Contrastive Activation Addition), or "steering-perceptrons" (remote execution only).
name (string): A display name for the vector (used in the Spectra UI and API responses).
label (string): A label/category for organizing the vector.
Returns: A vector ID string.
Contrastive Activation Addition (CAA) computes the directional difference between positive and negative activations; it is the most precise method when you have both types of examples.
import mechanex as mx

vector_id = mx.steering.generate_vectors(
    prompts=["I tell the...", "My statement is..."],
    positive_answers=[" truth", " factual", " correct"],
    negative_answers=[" lie", " false", " wrong"],
    method="caa",
    name="Honesty",
)
print(vector_id)
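When you only have positive examples, the default "few-shot" method applies. A minimal sketch (reusing the prompts above; the display name is illustrative):

```python
import mechanex as mx

# Few-shot uses prompts and positive completions only;
# negative_answers is ignored for this method.
vector_id = mx.steering.generate_vectors(
    prompts=["I tell the...", "My statement is..."],
    positive_answers=[" truth", " factual"],
    method="few-shot",
    name="Honesty (few-shot)",
)
print(vector_id)
```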

steering.generate_pairs()

Generates contrastive example pairs automatically using an LLM, given a persona description. Useful for bootstrapping a dataset before computing vectors.
persona_name (string, required): Short name for the persona (e.g., "Empathetic Support Agent").
persona_description (string, required): Description of the desired behavioral traits.
num_pairs (integer, default: 10): Number of contrastive pairs to generate.
batch_size (integer, default: 5): Pairs generated per batch.
Returns: A dict with persona, total_pairs, pairs, and avg_final_score.
result = mx.steering.generate_pairs(
    persona_name="Honesty",
    persona_description="The model always provides truthful, accurate information.",
    num_pairs=20,
)
# Use the generated pairs to compute a vector
vector_id = mx.steering.generate_vectors(
    prompts=[p["prompt"] for p in result["pairs"]],
    positive_answers=[p["positive_answer"] for p in result["pairs"]],
    negative_answers=[p["negative_answer"] for p in result["pairs"]],
    method="caa",
)

steering.evaluate()

Evaluates a steering vector’s effectiveness using cosine similarity metrics and LLM-as-judge scoring.
steering_vector_id (string, required): The vector ID to evaluate.
positive_texts (list[str], required): Texts representing the desired behavior.
negative_texts (list[str], required): Texts representing the undesired behavior.
test_prompts (list[str]): Prompts to generate steered completions for judge evaluation.
strength (float, default: 1.0): Steering strength during evaluation.
Returns: A dict with cosine_metrics and judge_evaluation.
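A usage sketch, evaluating a previously computed vector (the example texts and prompts here are illustrative, not prescribed by the API):

```python
# vector_id was returned by an earlier generate_vectors() call.
report = mx.steering.evaluate(
    steering_vector_id=vector_id,
    positive_texts=["I always answer truthfully.", "That claim is accurate."],
    negative_texts=["I'll just make something up.", "That claim is fabricated."],
    test_prompts=["Tell me about the moon landing."],
    strength=1.0,
)
# The result contains the two documented sections.
print(report["cosine_metrics"])
print(report["judge_evaluation"])
```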

Utilities

Load examples from a JSONL file, or persist vectors to disk for reuse:
# Compute from a JSONL file (one JSON object with "prompt",
# "positive_answer", and "negative_answer" keys per line)
vector_id = mx.steering.generate_from_jsonl(dataset_path="examples.jsonl", method="caa")

# Save to disk and reload in a later session
mx.steering.save_vectors(vector_id, path="honesty.json")
local_vec = mx.steering.load_vectors("honesty.json")

# Retrieve a cached in-memory vector by ID
local_vec = mx.steering.get_vectors(vector_id)
Loaded vectors can be passed directly to generation.generate() as the steering_vector parameter.
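For example, a sketch of that handoff: steering_vector is the documented parameter name, but the other generation.generate() arguments shown (prompt, strength) are assumptions; see that function's own reference page for its actual signature.

```python
# Reload a saved vector and steer generation with it.
local_vec = mx.steering.load_vectors("honesty.json")

# `steering_vector` is documented; `prompt` and `strength`
# are illustrative assumptions.
output = mx.generation.generate(
    prompt="Tell me about your day.",
    steering_vector=local_vec,
    strength=1.0,
)
print(output)
```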