This guide covers best practices for maintaining your Osmosis-synced repository and troubleshooting common issues.

Documentation

Write Clear Docstrings

All functions should have comprehensive docstrings:
@mcp.tool()
def fetch_user_data(user_id: str, include_history: bool = False) -> dict:
    """
    Fetch user profile data from the database.

    This tool retrieves comprehensive user information including
    profile details and optionally their activity history.

    Args:
        user_id: Unique identifier for the user (UUID format)
        include_history: Whether to include activity logs (default: False)

    Returns:
        Dictionary with keys:
        - id: User identifier
        - name: Full name
        - email: Email address
        - history: Activity logs (if include_history=True)

    Raises:
        ValueError: If user_id format is invalid
        LookupError: If user_id not found in database

    Example:
        >>> fetch_user_data("123e4567-e89b-12d3-a456-426614174000")
        {'id': '123e...', 'name': 'John Doe', 'email': 'john@example.com'}
    """
    # Implementation
    pass

Include Type Hints

Type hints improve IDE support and validation:
from typing import Any, Dict, Optional

@osmosis_reward
def evaluate_response(
    solution_str: str,
    ground_truth: str,
    extra_info: Optional[Dict[str, Any]] = None,
    **kwargs  # Required - do not omit!
) -> float:
    """Type hints make the function signature clear"""
    pass

Document Expected Formats

Clearly specify input/output formats:
@osmosis_reward
def json_match_reward(
    solution_str: str,
    ground_truth: str,
    extra_info: dict = None,
    **kwargs
) -> float:
    """
    Compare JSON outputs for structural matching.

    Expected format for solution_str and ground_truth:
    {
        "answer": "the answer text",
        "confidence": 0.95,
        "sources": ["source1", "source2"]
    }

    Returns 1.0 for perfect match, 0.0 for no match.
    Partial credit given for matching some fields.
    """
    pass
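
The comparison itself might award one point per matching top-level field, as the docstring describes. A minimal sketch of that partial-credit logic (the field-by-field scheme here is an illustrative assumption, not the canonical implementation):
import json

def _json_field_overlap(solution_str: str, ground_truth: str) -> float:
    """Fraction of ground-truth fields the solution matches exactly."""
    try:
        solution = json.loads(solution_str)
        truth = json.loads(ground_truth)
    except (TypeError, json.JSONDecodeError):
        return 0.0
    if not isinstance(solution, dict) or not isinstance(truth, dict) or not truth:
        return 0.0
    matched = sum(1 for key, value in truth.items() if solution.get(key) == value)
    return matched / len(truth)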

Testing

Write Unit Tests

Create comprehensive tests for your functions:
# tests/test_reward_functions.py
import pytest
from reward_fn.compute_reward import numbers_match_reward

def test_exact_match():
    """Test exact numerical match"""
    score = numbers_match_reward("#### 42", "42")
    assert score == 1.0

def test_close_match():
    """Test near-match within epsilon"""
    score = numbers_match_reward("#### 42.0000001", "42")
    assert score == 1.0

def test_mismatch():
    """Test completely different values"""
    score = numbers_match_reward("#### 100", "42")
    assert score == 0.0

def test_invalid_format():
    """Test handling of invalid input format"""
    score = numbers_match_reward("no number here", "42")
    assert score == 0.0

def test_missing_solution():
    """Test handling of empty solution"""
    score = numbers_match_reward("", "42")
    assert score == 0.0

@pytest.mark.parametrize("solution,ground_truth,expected", [
    ("#### 1", "1", 1.0),
    ("#### 0", "0", 1.0),
    ("#### -5", "-5", 1.0),
    ("#### 3.14159", "3.14159", 1.0),
])
def test_various_numbers(solution, ground_truth, expected):
    """Test various number formats"""
    score = numbers_match_reward(solution, ground_truth)
    assert score == expected
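
Run the suite with:
pytest tests/ -v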

Test MCP Tools Locally

Before pushing, test your MCP server:
# mcp/test/test.py
import requests
import json

def test_health_endpoint():
    """Test that server is running"""
    response = requests.get("http://localhost:8080/health")
    assert response.status_code == 200
    assert response.json()["status"] == "healthy"

def test_multiply_tool():
    """Test the multiply tool"""
    # Test with FastMCP's tool calling interface
    payload = {
        "tool": "multiply",
        "arguments": {
            "first_val": 2.5,
            "second_val": 4.0
        }
    }
    response = requests.post("http://localhost:8080/call_tool", json=payload)
    assert response.status_code == 200
    result = response.json()
    assert result["result"] == 10.0

if __name__ == "__main__":
    test_health_endpoint()
    test_multiply_tool()
    print("All tests passed!")
Run tests:
# Start server in background
python mcp/main.py &
SERVER_PID=$!

# Run tests
python mcp/test/test.py

# Stop server
kill $SERVER_PID

Use Test Fixtures

Create reusable test data:
# tests/conftest.py
import pytest

@pytest.fixture
def sample_solution():
    return "The answer is 42. #### 42"

@pytest.fixture
def sample_ground_truth():
    return "42"

@pytest.fixture
def sample_extra_info():
    return {
        "metadata": {
            "difficulty": "easy",
            "category": "arithmetic"
        }
    }

# tests/test_with_fixtures.py
from reward_fn.compute_reward import numbers_match_reward

def test_with_fixtures(sample_solution, sample_ground_truth):
    score = numbers_match_reward(sample_solution, sample_ground_truth)
    assert score == 1.0

CI/CD Integration

GitHub Actions Workflow

Create .github/workflows/test.yml:
name: Test and Validate

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e .
          pip install pytest pytest-cov

      - name: Run tests
        run: |
          pytest tests/ -v --cov=. --cov-report=term-missing

      - name: Lint code
        run: |
          pip install ruff
          ruff check .

      - name: Type check
        run: |
          pip install mypy
          mypy mcp/ reward_fn/ reward_rubric/

      - name: Test MCP server
        run: |
          python mcp/main.py &
          sleep 5
          python mcp/test/test.py
          pkill -f "python mcp/main.py"

Pre-commit Hooks

Create .pre-commit-config.yaml:
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
      - id: check-json

  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.1.9
    hooks:
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]

  - repo: https://github.com/psf/black
    rev: 23.12.1
    hooks:
      - id: black
Install pre-commit:
pip install pre-commit
pre-commit install
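
To check every file immediately rather than only staged changes, run:
pre-commit run --all-files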

Security

Never Commit Secrets

Use environment variables for sensitive data:
# Good
import os
API_KEY = os.getenv("OPENAI_API_KEY")

# Bad - NEVER do this
API_KEY = "sk-proj-1234567890abcdef"

Use .gitignore

Ensure .gitignore includes:
# Environment variables
.env
.env.local
.env.*.local

# API keys and secrets
secrets.json
credentials.json
*.key
*.pem

# Python
__pycache__/
*.pyc
venv/
*.egg-info/
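
You can confirm a path is actually ignored with:
git check-ignore -v .env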

Review Permissions Carefully

When connecting private repos:
  • Grant minimal required permissions
  • Review which repositories Osmosis can access
  • Use deploy keys for specific repo access
  • Regularly audit connected integrations

Validate Inputs

Always validate and sanitize inputs:
@mcp.tool()
def execute_query(query: str) -> dict:
    """
    Execute a database query (with validation)
    """
    # Validate input
    if not query or not isinstance(query, str):
        raise ValueError("Query must be a non-empty string")

    # Block obviously destructive statements (a coarse keyword check,
    # not full injection protection - see the parameterized-query sketch below)
    if any(keyword in query.upper() for keyword in ['DROP', 'DELETE', 'TRUNCATE']):
        raise ValueError("Destructive operations not allowed")

    # Execute safely (safe_execute is a placeholder for your database layer)
    return safe_execute(query)
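
Keyword blocklists are only a coarse guard. For real injection protection, prefer parameterized queries; here is a minimal sketch using the standard-library sqlite3 module (table and column names are hypothetical):
import sqlite3

def fetch_user(conn: sqlite3.Connection, user_id: str):
    # The ? placeholder lets the driver handle escaping, so user input
    # is never interpolated into the SQL string itself
    cursor = conn.execute("SELECT id, name FROM users WHERE id = ?", (user_id,))
    return cursor.fetchone()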

Code Organization

Keep Functions Focused

Each function should have a single, clear purpose:
# Good - focused functions
@mcp.tool()
def calculate_average(numbers: list[float]) -> float:
    """Calculate arithmetic mean"""
    return sum(numbers) / len(numbers)

@mcp.tool()
def calculate_median(numbers: list[float]) -> float:
    """Calculate median value"""
    sorted_nums = sorted(numbers)
    n = len(sorted_nums)
    if n % 2 == 0:
        return (sorted_nums[n//2-1] + sorted_nums[n//2]) / 2
    return sorted_nums[n//2]

# Avoid - doing too much
@mcp.tool()
def analyze_numbers(numbers: list[float]) -> dict:
    """Calculate mean, median, mode, stddev, plot histogram..."""
    # Too many responsibilities
    pass

Use Helper Functions

Break complex logic into smaller pieces:
import re
from typing import Optional

# Helper functions (not decorated - not exposed as tools)
def extract_number(text: str) -> Optional[float]:
    """Extract numeric value from text"""
    match = re.search(r'[-+]?\d*\.?\d+', text)
    return float(match.group()) if match else None

def normalize_score(raw_score: float, min_val: float, max_val: float) -> float:
    """Normalize score to [0, 1] range"""
    return (raw_score - min_val) / (max_val - min_val)

# Main function using helpers
@osmosis_reward
def text_numeric_reward(
    solution_str: str,
    ground_truth: str,
    extra_info: dict = None,
    **kwargs
) -> float:
    """Reward based on numeric extraction and comparison"""
    solution_num = extract_number(solution_str)
    truth_num = extract_number(ground_truth)

    if solution_num is None or truth_num is None:
        return 0.0

    difference = abs(solution_num - truth_num)
    raw_score = 1.0 / (1.0 + difference)

    return normalize_score(raw_score, 0.0, 1.0)
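
For example (assuming @osmosis_reward leaves the function directly callable):
score = text_numeric_reward("The total is 42.5", "42")
# extract_number finds 42.5 and 42.0; difference = 0.5
# score = 1 / (1 + 0.5) ≈ 0.667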

Organize by Feature

Structure your code logically:
mcp/
├── tools/
│   ├── __init__.py
│   ├── math/              # Math-related tools
│   │   ├── __init__.py
│   │   ├── arithmetic.py
│   │   └── statistics.py
│   ├── data/              # Data processing tools
│   │   ├── __init__.py
│   │   ├── fetch.py
│   │   └── transform.py
│   └── utils/             # Utility functions
│       ├── __init__.py
│       └── validation.py
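
With this layout, mcp/tools/__init__.py can re-export the public tools so discovery sees them in one place (module and function names below are illustrative):
# mcp/tools/__init__.py
from .math.arithmetic import multiply
from .data.fetch import fetch_user_data

__all__ = ['multiply', 'fetch_user_data']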

Performance

Cache Expensive Operations

from functools import lru_cache

@lru_cache(maxsize=1000)
def expensive_computation(input_data: str) -> float:
    """Cached expensive operation"""
    # Complex calculation (placeholder stands in for the real work)
    result = float(len(input_data))
    return result

@osmosis_reward
def cached_reward(solution_str, ground_truth, extra_info=None, **kwargs):
    """Uses cached helper function"""
    return expensive_computation(solution_str)
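
Note that lru_cache only works with hashable arguments (strings are fine), and it exposes hit/miss statistics you can check while tuning maxsize:
# Inspect cache effectiveness during development
print(expensive_computation.cache_info())
# CacheInfo(hits=..., misses=..., maxsize=1000, currsize=...)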

Choose Appropriate Models

For rubric evaluation, choose a model that balances judging quality against cost and latency:
# Example with OpenAI
MODEL = "gpt-5"

# Example with Anthropic
MODEL = "claude-sonnet-4-5"

Batch Operations When Possible

@mcp.tool()
def batch_calculate(numbers_list: list[list[float]]) -> list[float]:
    """Process multiple calculations in one call"""
    return [sum(numbers) / len(numbers) for numbers in numbers_list]
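
For example (assuming the decorator leaves the function directly callable):
batch_calculate([[1.0, 2.0, 3.0], [4.0, 5.0]])  # -> [2.0, 4.5]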

Monitoring and Debugging

Add Logging

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@osmosis_reward
def logged_reward(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    """Reward function with logging"""
    logger.info(f"Evaluating solution: {solution_str[:50]}...")

    try:
        score = compute_score(solution_str, ground_truth)  # compute_score: your scoring logic (placeholder)
        logger.info(f"Computed score: {score}")
        return score
    except Exception as e:
        logger.error(f"Error computing score: {e}")
        return 0.0

Track Metrics

from collections import defaultdict

metrics = defaultdict(int)

@osmosis_reward
def instrumented_reward(solution_str: str, ground_truth: str, extra_info: dict = None, **kwargs) -> float:
    """Track function calls and errors"""
    metrics['calls'] += 1

    try:
        score = compute_score(solution_str, ground_truth)
        metrics['successes'] += 1
        return score
    except Exception as e:
        metrics['errors'] += 1
        logger.error(f"Error: {e}")
        return 0.0
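
A simple way to surface these counters is to log a snapshot on demand; a minimal sketch reusing the logger from the previous example:
def log_metrics() -> None:
    """Log a snapshot of the call/success/error counters."""
    total = metrics['calls'] or 1  # avoid division by zero
    logger.info(
        "calls=%d successes=%d errors=%d success_rate=%.1f%%",
        metrics['calls'], metrics['successes'], metrics['errors'],
        100.0 * metrics['successes'] / total,
    )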

Troubleshooting

Sync Issues

Problem: Repository not syncing to Osmosis

Solutions:
  • Verify folder structure matches exactly (case-sensitive)
  • Check webhook settings in GitHub repository settings
  • Review Osmosis sync logs for specific errors
  • Ensure pyproject.toml includes all dependencies
  • Validate decorators are spelled correctly

Tool Discovery Issues

Problem: MCP tools not appearing in Osmosis

Solutions:
  • Confirm @mcp.tool() decorator is present
  • Check tools are exported in mcp/tools/__init__.py:
    from .math import multiply
    __all__ = ['multiply']
    
  • Verify type hints exist for all parameters and return values
  • Ensure no syntax errors in tool files
  • Check Osmosis platform logs for import errors

Reward Function Issues

Problem: Reward functions returning unexpected scores

Solutions:
  • Test locally with sample inputs
  • Add print statements or logging
  • Verify input format matches expectations
  • Check error handling catches all edge cases
  • Ensure return type is float

Rubric Evaluation Issues

Problem: Rubric scores inconsistent or erroring

Solutions:
  • Verify API key is set correctly
  • Check API key has sufficient credits/quota
  • Test with simpler rubric first
  • Add error handling around evaluate_rubric call
  • Use return_details=True to see evaluation reasoning
  • Verify model name is correct for provider

Import Errors

Problem: ModuleNotFoundError or import failures

Solutions:
  • Ensure all directories have __init__.py files
  • Verify imports use correct paths
  • Check dependencies are installed: pip install -e .
  • Use absolute imports from package root
  • Verify virtual environment is activated

Next Steps

Example Repository

Study the complete reference implementation

Python SDK

Learn more about the Python SDK

Contact Support

Get help from the Osmosis team

Setup Guide

Review the setup process