# API Reference

Complete reference for the osmosis-ai Python SDK.

## Decorators

### @osmosis_reward

Decorator for local reward functions that compute scores without API calls.

**Signature:**

```python
@osmosis_reward
def function_name(
    solution_str: str,
    ground_truth: str,
    extra_info: dict = None,
    **kwargs
) -> float
```
**Parameters:**

- `solution_str` (str, required) - Text to evaluate
- `ground_truth` (str, required) - Reference answer
- `extra_info` (dict, optional) - Additional context
- `**kwargs` (required) - Future compatibility (see warning below)

**Returns:** `float` - Score value

**Example:**

```python
from osmosis_ai import osmosis_reward

@osmosis_reward
def exact_match(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    return 1.0 if solution_str.strip() == ground_truth.strip() else 0.0
```
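`extra_info` and `**kwargs` are passed through to your function untouched, so they can carry scoring options. A minimal sketch (the `case_sensitive` key and the overlap metric are illustrative, not part of the SDK):

```python
from osmosis_ai import osmosis_reward

@osmosis_reward
def token_overlap(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    # Hypothetical option carried in extra_info; defaults to case-insensitive.
    case_sensitive = bool((extra_info or {}).get("case_sensitive", False))
    solution = solution_str if case_sensitive else solution_str.lower()
    reference = ground_truth if case_sensitive else ground_truth.lower()
    ref_tokens = set(reference.split())
    if not ref_tokens:
        return 0.0
    # Fraction of reference tokens that also appear in the solution.
    return len(set(solution.split()) & ref_tokens) / len(ref_tokens)
```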
### @osmosis_rubric

Decorator for LLM-based evaluation functions.

**Signature:**

```python
@osmosis_rubric
def function_name(
    solution_str: str,
    ground_truth: str | None,
    extra_info: dict,
    **kwargs
) -> float
```
**Parameters:**

- `solution_str` (str, required) - Text to evaluate
- `ground_truth` (str | None, required) - Reference answer (can be None)
- `extra_info` (dict, required) - Configuration and context
- `**kwargs` (required) - Future compatibility (see warning below)

**Returns:** `float` - Score value

**Example:**

```python
from osmosis_ai import osmosis_rubric, evaluate_rubric

@osmosis_rubric
def quality_check(solution_str: str, ground_truth: str | None, extra_info: dict, **kwargs) -> float:
    return evaluate_rubric(
        rubric="Evaluate response quality",
        solution_str=solution_str,
        model_info={"provider": "openai", "model": "gpt-5"},
        ground_truth=ground_truth
    )
```
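Because `extra_info` is required here, it is a natural place for per-call configuration such as the rubric text or model choice. A sketch under that assumption (the `"rubric"` and `"model"` keys are illustrative, not an SDK contract):

```python
from osmosis_ai import osmosis_rubric, evaluate_rubric

@osmosis_rubric
def configurable_eval(solution_str: str, ground_truth: str | None, extra_info: dict, **kwargs) -> float:
    # Hypothetical keys supplied by the caller via extra_info.
    rubric = extra_info.get("rubric", "Evaluate response quality")
    model = extra_info.get("model", "gpt-5")
    return evaluate_rubric(
        rubric=rubric,
        solution_str=solution_str,
        ground_truth=ground_truth,
        model_info={"provider": "openai", "model": model}
    )
```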
## Core Functions

### evaluate_rubric()

Evaluate text using an LLM-based rubric.

**Signature:**

```python
def evaluate_rubric(
    rubric: str,
    solution_str: str,
    model_info: dict,
    ground_truth: str | None = None,
    original_input: str | None = None,
    metadata: dict | None = None,
    score_min: float = 0.0,
    score_max: float = 1.0,
    timeout: int | None = None,
    return_details: bool = False
) -> float | dict
```
**Parameters:**

| Parameter | Type | Required | Description |
|---|---|---|---|
| rubric | str | Yes | Natural language evaluation criteria |
| solution_str | str | Yes | Text to evaluate |
| model_info | dict | Yes | LLM provider configuration |
| ground_truth | str | No | Reference answer |
| original_input | str | No | Original user query |
| metadata | dict | No | Additional context |
| score_min | float | No | Minimum score (default: 0.0) |
| score_max | float | No | Maximum score (default: 1.0) |
| timeout | int | No | Request timeout in seconds |
| return_details | bool | No | Return full response (default: False) |
**model_info Structure:**

```python
{
    "provider": "openai",             # Required
    "model": "gpt-5",                 # Required
    "api_key": "sk-...",              # Optional
    "api_key_env": "OPENAI_API_KEY",  # Optional
    "timeout": 30                     # Optional
}
```
**Returns:**

- `float` - Score (when `return_details=False`)
- `dict` - Full response with score, explanation, and raw payload (when `return_details=True`)

**Example:**

```python
from osmosis_ai import evaluate_rubric

score = evaluate_rubric(
    rubric="Evaluate how helpful the response is.",
    solution_str="Click 'Forgot Password' to reset.",
    model_info={"provider": "openai", "model": "gpt-5"}
)
```
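The optional parameters let you add context and widen the scoring range. A sketch using a 0-10 scale (the `metadata` contents are illustrative; any dict is accepted):

```python
from osmosis_ai import evaluate_rubric

score = evaluate_rubric(
    rubric="Rate factual accuracy on a 0-10 scale.",
    solution_str="The capital of France is Paris.",
    original_input="What is the capital of France?",
    ground_truth="Paris",
    metadata={"dataset": "geo-qa"},  # illustrative extra context
    score_min=0.0,
    score_max=10.0,
    timeout=30,
    model_info={"provider": "openai", "model": "gpt-5"}
)
print(score)  # float between 0.0 and 10.0
```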
## Exceptions

### MissingAPIKeyError

Raised when an API key is not found for a provider.

```python
from osmosis_ai import MissingAPIKeyError

try:
    score = evaluate_rubric(...)
except MissingAPIKeyError as e:
    print(f"API key not found: {e}")
```
### ProviderRequestError

Raised when a provider request fails.

```python
from osmosis_ai import ProviderRequestError

try:
    score = evaluate_rubric(...)
except ProviderRequestError as e:
    print(f"Provider error: {e}")
```
### ModelNotFoundError

Raised when a specified model is not available (subclass of ProviderRequestError).

```python
from osmosis_ai import ModelNotFoundError

try:
    score = evaluate_rubric(...)
except ModelNotFoundError as e:
    print(f"Model not found: {e}")
```
## Type Definitions

### ModelInfo (TypedDict)

```python
from osmosis_ai import ModelInfo

model_info: ModelInfo = {
    "provider": "openai",
    "model": "gpt-5",
    "api_key_env": "OPENAI_API_KEY",
    "timeout": 30
}
```
### RewardRubricRunResult (TypedDict)

Returned when `return_details=True`:

```python
{
    "score": 0.85,         # float
    "explanation": "...",  # str
    "raw_payload": {...}   # dict
}
```
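A sketch of reading the detailed result; the field names come from the structure above, while the exact contents of `raw_payload` depend on the provider:

```python
from osmosis_ai import evaluate_rubric

result = evaluate_rubric(
    rubric="Evaluate how helpful the response is.",
    solution_str="Click 'Forgot Password' to reset.",
    model_info={"provider": "openai", "model": "gpt-5"},
    return_details=True
)

print(result["score"])        # float
print(result["explanation"])  # str, the model's reasoning
print(result["raw_payload"])  # dict, provider-specific response data
```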
## Complete Example

```python
from osmosis_ai import osmosis_reward, osmosis_rubric, evaluate_rubric
from dotenv import load_dotenv

load_dotenv()

# Local reward function
@osmosis_reward
def exact_match(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    return 1.0 if solution_str.strip() == ground_truth.strip() else 0.0

# Remote rubric evaluator
@osmosis_rubric
def semantic_eval(solution_str: str, ground_truth: str | None, extra_info: dict, **kwargs) -> float:
    return evaluate_rubric(
        rubric="Compare semantic similarity (0-1 scale)",
        solution_str=solution_str,
        ground_truth=ground_truth,
        model_info={"provider": "openai", "model": "gpt-5"}
    )

# Usage
solution = "The capital of France is Paris"
truth = "Paris is France's capital"

local_score = exact_match(solution, truth)
semantic_score = semantic_eval(solution, truth, {})

print(f"Exact match: {local_score}")  # 0.0
print(f"Semantic: {semantic_score}")  # ~1.0
```
## Next Steps

- **Quick Start** - Learn with examples
- **Decorators & API Guide** - Advanced patterns
- **CLI Reference** - Batch evaluations