# API Reference

Complete reference for the osmosis-ai Python SDK.

## Decorators

### @osmosis_reward

Decorator for local reward functions that compute scores without API calls.

**Signature:**

```python
@osmosis_reward
def function_name(
    solution_str: str,
    ground_truth: str,
    extra_info: dict = None,
    **kwargs
) -> float
```
**Parameters:**

- `solution_str` (str, required) - Text to evaluate
- `ground_truth` (str, required) - Reference answer
- `extra_info` (dict, optional) - Additional context
- `**kwargs` (required) - Future compatibility (see warning below)

**Returns:** `float` - Score value

**Example:**

```python
from osmosis_ai import osmosis_reward

@osmosis_reward
def exact_match(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    return 1.0 if solution_str.strip() == ground_truth.strip() else 0.0
```
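`extra_info` and `**kwargs` are passed through to your function untouched, so they can carry scoring options. A minimal sketch (the `case_sensitive` key and the overlap metric are illustrative, not part of the SDK):

```python
from osmosis_ai import osmosis_reward

@osmosis_reward
def token_overlap(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    # Hypothetical option carried in extra_info; defaults to case-insensitive.
    case_sensitive = bool((extra_info or {}).get("case_sensitive", False))
    solution = solution_str if case_sensitive else solution_str.lower()
    reference = ground_truth if case_sensitive else ground_truth.lower()
    ref_tokens = set(reference.split())
    if not ref_tokens:
        return 0.0
    # Fraction of reference tokens that also appear in the solution.
    return len(set(solution.split()) & ref_tokens) / len(ref_tokens)
```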
### @osmosis_rubric

Decorator for LLM-based evaluation functions.

**Signature:**

```python
@osmosis_rubric
def function_name(
    solution_str: str,
    ground_truth: str | None,
    extra_info: dict,
    **kwargs
) -> float
```
**Parameters:**

- `solution_str` (str, required) - Text to evaluate
- `ground_truth` (str | None, required) - Reference answer (can be None)
- `extra_info` (dict, required) - Configuration and context
- `**kwargs` (required) - Future compatibility (see warning below)

**Returns:** `float` - Score value

**Example:**

```python
from osmosis_ai import osmosis_rubric, evaluate_rubric

@osmosis_rubric
def quality_check(solution_str: str, ground_truth: str | None, extra_info: dict, **kwargs) -> float:
    return evaluate_rubric(
        rubric="Evaluate response quality",
        solution_str=solution_str,
        model_info={"provider": "openai", "model": "gpt-5"},
        ground_truth=ground_truth
    )
```
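Because `extra_info` is required here, it is a natural place for per-call configuration such as the rubric text or model choice. A sketch under that assumption (the `"rubric"` and `"model"` keys are illustrative, not an SDK contract):

```python
from osmosis_ai import osmosis_rubric, evaluate_rubric

@osmosis_rubric
def configurable_eval(solution_str: str, ground_truth: str | None, extra_info: dict, **kwargs) -> float:
    # Hypothetical keys supplied by the caller via extra_info.
    rubric = extra_info.get("rubric", "Evaluate response quality")
    model = extra_info.get("model", "gpt-5")
    return evaluate_rubric(
        rubric=rubric,
        solution_str=solution_str,
        ground_truth=ground_truth,
        model_info={"provider": "openai", "model": model}
    )
```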
## Core Functions

### evaluate_rubric()

Evaluate text using an LLM-based rubric.

**Signature:**

```python
def evaluate_rubric(
    rubric: str,
    solution_str: str,
    model_info: dict,
    ground_truth: str | None = None,
    original_input: str | None = None,
    metadata: dict | None = None,
    score_min: float = 0.0,
    score_max: float = 1.0,
    timeout: int | None = None,
    return_details: bool = False
) -> float | dict
```
**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| rubric | str | Yes | Natural language evaluation criteria |
| solution_str | str | Yes | Text to evaluate |
| model_info | dict | Yes | LLM provider configuration |
| ground_truth | str | No | Reference answer |
| original_input | str | No | Original user query |
| metadata | dict | No | Additional context |
| score_min | float | No | Minimum score (default: 0.0) |
| score_max | float | No | Maximum score (default: 1.0) |
| timeout | int | No | Request timeout in seconds |
| return_details | bool | No | Return full response (default: False) |
**model_info Structure:**

```python
{
    "provider": "openai",             # Required
    "model": "gpt-5",                 # Required
    "api_key": "sk-...",              # Optional
    "api_key_env": "OPENAI_API_KEY",  # Optional
    "timeout": 30                     # Optional
}
```
**Returns:**

- `float` - Score (when `return_details=False`)
- `dict` - Full response with score, explanation, and raw payload (when `return_details=True`)

**Example:**

```python
from osmosis_ai import evaluate_rubric

score = evaluate_rubric(
    rubric="Evaluate how helpful the response is.",
    solution_str="Click 'Forgot Password' to reset.",
    model_info={"provider": "openai", "model": "gpt-5"}
)
```
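The optional parameters let you add context and widen the scoring range. A sketch using a 0-10 scale (the `metadata` contents are illustrative; any dict is accepted):

```python
from osmosis_ai import evaluate_rubric

score = evaluate_rubric(
    rubric="Rate factual accuracy on a 0-10 scale.",
    solution_str="The capital of France is Paris.",
    original_input="What is the capital of France?",
    ground_truth="Paris",
    metadata={"dataset": "geo-qa"},  # illustrative extra context
    score_min=0.0,
    score_max=10.0,
    timeout=30,
    model_info={"provider": "openai", "model": "gpt-5"}
)
print(score)  # float between 0.0 and 10.0
```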
## Exceptions

### MissingAPIKeyError

Raised when an API key is not found for a provider.

```python
from osmosis_ai import MissingAPIKeyError

try:
    score = evaluate_rubric(...)
except MissingAPIKeyError as e:
    print(f"API key not found: {e}")
```
### ProviderRequestError

Raised when a provider request fails.

```python
from osmosis_ai import ProviderRequestError

try:
    score = evaluate_rubric(...)
except ProviderRequestError as e:
    print(f"Provider error: {e}")
```
### ModelNotFoundError

Raised when a specified model is not available (subclass of ProviderRequestError).

```python
from osmosis_ai import ModelNotFoundError

try:
    score = evaluate_rubric(...)
except ModelNotFoundError as e:
    print(f"Model not found: {e}")
```
## Type Definitions

### ModelInfo (TypedDict)

```python
from osmosis_ai import ModelInfo

model_info: ModelInfo = {
    "provider": "openai",
    "model": "gpt-5",
    "api_key_env": "OPENAI_API_KEY",
    "timeout": 30
}
```
### RewardRubricRunResult (TypedDict)

Returned when `return_details=True`:

```python
{
    "score": 0.85,         # float
    "explanation": "...",  # str
    "raw_payload": {...}   # dict
}
```
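A sketch of reading the detailed result; the field names come from the structure above, while the exact contents of `raw_payload` depend on the provider:

```python
from osmosis_ai import evaluate_rubric

result = evaluate_rubric(
    rubric="Evaluate how helpful the response is.",
    solution_str="Click 'Forgot Password' to reset.",
    model_info={"provider": "openai", "model": "gpt-5"},
    return_details=True
)

print(result["score"])        # float
print(result["explanation"])  # str, the model's reasoning
print(result["raw_payload"])  # dict, provider-specific response data
```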
## Complete Example

```python
from osmosis_ai import osmosis_reward, osmosis_rubric, evaluate_rubric
from dotenv import load_dotenv

load_dotenv()

# Local reward function
@osmosis_reward
def exact_match(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    return 1.0 if solution_str.strip() == ground_truth.strip() else 0.0

# Remote rubric evaluator
@osmosis_rubric
def semantic_eval(solution_str: str, ground_truth: str | None, extra_info: dict, **kwargs) -> float:
    return evaluate_rubric(
        rubric="Compare semantic similarity (0-1 scale)",
        solution_str=solution_str,
        ground_truth=ground_truth,
        model_info={"provider": "openai", "model": "gpt-5"}
    )

# Usage
solution = "The capital of France is Paris"
truth = "Paris is France's capital"

local_score = exact_match(solution, truth)
semantic_score = semantic_eval(solution, truth, {})

print(f"Exact match: {local_score}")  # 0.0
print(f"Semantic: {semantic_score}")  # ~1.0
```
## Next Steps

- **Quick Start** - Learn with examples
- **Decorators & API Guide** - Advanced patterns
- **CLI Reference** - Batch evaluations