
Documentation Index

Fetch the complete documentation index at: https://docs.axioniclabs.ai/llms.txt

Use this file to discover all available pages before exploring further.

Overview

A policy bundles sampling, steering, constraints, and verification settings into a single reusable configuration.

policy.run()

Runs generation with a policy configuration applied.
prompt (string, required)
The input prompt.

policy (dict)
Inline policy configuration. Specify any combination of sampling, steering, constraints, and verification settings.

policy_id (string)
ID of a previously saved policy. Mutually exclusive with policy.

max_new_tokens (integer, default: 128)
Maximum tokens to generate.

include_trace (boolean, default: false)
If true, includes step-by-step trace data in the response.
Returns: A dict with output, accepted (bool), score, latency_ms, tokens, and optionally trace.
import mechanex as mx

mx.set_key("ax_your_key_here")

result = mx.policy.run(
    prompt="What is the capital of France?",
    policy={
        "sampling": {"method": "greedy"},
        "steering": {"preset": "truthfulness", "strength": 1.2},
    },
)
print(result["output"])

Save & Reuse

# Save a policy for reuse
policy_id = mx.policy.save({
    "name": "strict-json-output",
    "sampling": {"method": "guided-generation"},
    "constraints": {"json_schema": {"type": "object", "properties": {"answer": {"type": "string"}}}},
})

# Run with the saved policy
result = mx.policy.run("Extract the key fact.", policy_id=policy_id)

# List all saved policies
policies = mx.policy.list()

# Retrieve a specific policy
policy = mx.policy.get(policy_id)

Compare & Evaluate

Test multiple policies against the same prompt to find the best configuration:
# Compare two policies side-by-side
comparison = mx.policy.compare(
    prompt="Explain quantum computing.",
    policies=[
        {"sampling": {"method": "greedy"}},
        {"sampling": {"method": "top-p"}, "steering": {"preset": "brevity"}},
    ],
)

# Batch evaluation across multiple prompts
evaluation = mx.policy.evaluate(
    prompts=["What is 2+2?", "Name the largest planet.", "Define entropy."],
    policy_id=policy_id,
)
print(f"Success rate: {evaluation['success_rate']}")
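The exact server-side aggregation behind success_rate isn't documented here. As a minimal local sketch, assuming the rate is simply the fraction of prompts whose outputs were accepted by the verifiers (and assuming each per-prompt result carries the same accepted boolean that policy.run() returns), it could be recomputed like this. The helper and the mocked result shape are illustrations, not SDK API:

```python
# Hypothetical client-side recomputation of a success rate.
# Assumes each per-prompt result carries an "accepted" boolean,
# mirroring the "accepted" field documented for policy.run().

def success_rate(results: list[dict]) -> float:
    """Fraction of results whose verifiers all passed."""
    if not results:
        return 0.0
    return sum(1 for r in results if r.get("accepted")) / len(results)

# Mocked per-prompt results for illustration:
mock_results = [
    {"output": "4", "accepted": True},
    {"output": "Jupiter", "accepted": True},
    {"output": "Entropy is...", "accepted": False},
]
print(success_rate(mock_results))  # 2 of 3 accepted
```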

Policy Structure

A policy bundles five configuration sections:
policy = {
    "name": "my-policy",
    "sampling": {
        "method": "guided-generation",
        "temperature": 0.3,
        # top_k, top_p, min_p, ads_subset_size, ads_beta, etc.
    },
    "steering": {
        "enabled": True,
        "preset": "truthfulness",    # or vector_id for a custom vector
        "strength": 1.2,
    },
    "constraints": {
        "json_schema": {"type": "object", "properties": {"answer": {"type": "string"}}},
        "regex_pattern": None,       # regex alternative to json_schema
        "grammar": None,             # grammar alternative
        "required_fields": [],       # fields that must appear in JSON output
        "forbidden_terms": [],       # terms that must not appear in output
    },
    "verifiers": {
        "enabled": ["json_schema", "syntax"],
        "repair_on_failure": True,
        "max_retries": 1,
    },
    "optimization": {
        "best_of_n": 1,
        "retry_on_failure": 1,
        "confidence_triggered_regeneration": False,
        "confidence_threshold": 0.5,
    },
}
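Because policies are plain dicts, a typo in a section name fails silently rather than loudly. A small client-side sanity check can catch that before saving. This helper is hypothetical and not part of the SDK; it only compares top-level keys against the sections documented above (plus name):

```python
# Hypothetical client-side sanity check for policy dicts.
# NOT part of the mechanex SDK: it only verifies top-level keys
# against the sections documented above.

KNOWN_SECTIONS = {
    "name", "sampling", "steering", "constraints", "verifiers", "optimization",
}

def unknown_sections(policy: dict) -> set[str]:
    """Return any top-level keys that are not documented policy sections."""
    return set(policy) - KNOWN_SECTIONS

candidate = {
    "name": "my-policy",
    "sampling": {"method": "greedy"},
    "steerng": {"preset": "truthfulness"},  # deliberate typo
}
print(unknown_sections(candidate))  # catches the "steerng" typo
```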

Verifiers

Verifiers validate generated output and optionally repair failures by retrying. Enable them by name in the verifiers.enabled list:
Verifier: what it validates

syntax: Output is syntactically valid (parseable)
json_schema: Output conforms to the constraints.json_schema
regex: Output matches constraints.regex_pattern
code_compile: Generated code compiles without errors
unit_tests: Generated code passes provided unit tests
factuality: Output is factually consistent with the prompt
tool_args: Tool call arguments match the expected schema
result = mx.policy.run(
    prompt="Extract the key facts as JSON.",
    policy={
        "sampling": {"method": "guided-generation"},
        "constraints": {
            "json_schema": {"type": "object", "properties": {"facts": {"type": "array"}}},
        },
        "verifiers": {
            "enabled": ["json_schema", "syntax"],
            "repair_on_failure": True,
            "max_retries": 2,
        },
    },
)
print(result["accepted"])  # True if all verifiers passed
For code generation with unit tests:
result = mx.policy.run(
    prompt="Write a Python function that reverses a string.",
    policy={
        "verifiers": {
            "enabled": ["code_compile", "unit_tests"],
            "code_language": "python",
            "code_unit_tests": [
                "assert reverse_string('hello') == 'olleh'",
                "assert reverse_string('') == ''",
            ],
            "unit_test_timeout_ms": 2000,
        },
    },
)
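Conceptually, the unit_tests verifier executes the provided assertion strings against the generated code. The sketch below is an illustration of that idea only, not the SDK's actual implementation: a real verifier would sandbox execution and enforce unit_test_timeout_ms, which this toy version does not:

```python
# Illustrative local re-creation of a unit-test check. NOT the SDK's
# sandbox: exec() the candidate code, then run each assertion string
# in the same namespace. No isolation or timeout is applied here.

def passes_unit_tests(code: str, tests: list[str]) -> bool:
    namespace: dict = {}
    try:
        exec(code, namespace)      # define the generated function(s)
        for test in tests:
            exec(test, namespace)  # each test is an assert statement
    except Exception:
        return False
    return True

candidate = "def reverse_string(s):\n    return s[::-1]\n"
tests = [
    "assert reverse_string('hello') == 'olleh'",
    "assert reverse_string('') == ''",
]
print(passes_unit_tests(candidate, tests))  # True
```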

Optimization

best_of_n (default: 1): Generate N candidates and return the best
retry_on_failure (default: 1): Retry count when verifiers fail
confidence_triggered_regeneration (default: false): Regenerate if model confidence falls below the threshold
confidence_threshold (default: 0.5): Minimum confidence score to accept output
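To make the interaction between confidence_triggered_regeneration and confidence_threshold concrete, here is a hypothetical local decision function. It assumes a strict less-than comparison against the threshold; the SDK's exact comparison is not documented:

```python
# Hypothetical decision logic for confidence-triggered regeneration.
# Assumes a strict less-than comparison against the threshold,
# with the documented defaults (disabled, threshold 0.5).

def should_regenerate(confidence: float, optimization: dict) -> bool:
    if not optimization.get("confidence_triggered_regeneration", False):
        return False
    return confidence < optimization.get("confidence_threshold", 0.5)

opt = {"confidence_triggered_regeneration": True, "confidence_threshold": 0.5}
print(should_regenerate(0.3, opt))  # True: below threshold, regenerate
print(should_regenerate(0.9, opt))  # False: confident enough
```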

Auto-Tune

Automatically search for the best policy parameters:
best = mx.policy.auto_tune(
    prompts=["Generate a JSON response for...", "Extract the key facts from..."],
    base_policy={"sampling": {"method": "guided-generation"}},
    max_trials=10,
)
print(best["best_policy"])
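To illustrate the shape of the idea, here is a toy, purely local trial loop: try candidate settings, score each, keep the best. This is not the SDK's search algorithm, and the scoring lambda is a stand-in; auto_tune's real scoring runs server-side:

```python
# Toy illustration of trial-based tuning, NOT the SDK's search:
# evaluate up to max_trials candidates and keep the highest-scoring one.

def toy_auto_tune(candidates: list[dict], score_fn, max_trials: int = 10) -> dict:
    best, best_score = None, float("-inf")
    for trial in candidates[:max_trials]:
        score = score_fn(trial)
        if score > best_score:
            best, best_score = trial, score
    return {"best_policy": best, "best_score": best_score}

candidates = [{"sampling": {"temperature": t}} for t in (0.0, 0.3, 0.7, 1.0)]
# Stand-in score: pretend lower temperature is better for this task.
result = toy_auto_tune(candidates, lambda p: 1.0 - p["sampling"]["temperature"])
print(result["best_policy"])  # {'sampling': {'temperature': 0.0}}
```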

Presets

Use built-in preset builders for common configurations:
# Strict JSON extraction
policy = mx.policy.strict_json_extraction(schema=my_schema)

# Fast tool routing
policy = mx.policy.fast_tool_router()

# Diverse chatbot
policy = mx.policy.diverse_chatbot()

# Ensemble voting across models
policy = mx.policy.ensemble_vote(models=["model-a", "model-b"])

# Steering preset
policy = mx.policy.steering_preset(preset="truthfulness", sampling_method="greedy")
Presets can be published for sharing or cloned from public presets:
# Publish a preset
preset_id = mx.policy.publish_preset(name="My Config", policy=policy, visibility="public")

# Browse and clone
presets = mx.policy.list_presets(include_public=True)
cloned_id = mx.policy.clone_preset(preset_id)