LiteLLM Integration

Integrate Martian with LiteLLM to add a layer of routing, load balancing, and spend tracking on top of Martian's 200+ models.

Ensure you have your Martian API key from the Martian Dashboard before continuing.

Overview

LiteLLM is a popular open-source library that provides:

  • LiteLLM Proxy Server: An LLM Gateway for centralized access control, spend tracking, and rate limiting
  • LiteLLM Python SDK: Python client for load balancing and cost tracking across multiple LLMs

By combining LiteLLM with Martian, you get:

  • Access to Martian's 200+ models
  • LiteLLM's routing, load balancing, and fallback logic
  • Centralized spend tracking and rate limiting
  • Virtual keys for multi-project access control

LiteLLM Python SDK

Use LiteLLM's Python SDK directly in your code for routing and load balancing across Martian's models.

Installation

pip install litellm

Basic Usage

from litellm import completion
import os

os.environ["MARTIAN_API_KEY"] = "your-martian-api-key"

# Using OpenAI-compatible endpoint
response = completion(
    model="openai/openai/gpt-5:cheap",
    api_base="https://api.withmartian.com/v1",
    api_key=os.environ["MARTIAN_API_KEY"],
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
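
Streaming works the same way; below is a minimal sketch using LiteLLM's standard stream=True flag and the same Martian endpoint:

# Reuses the imports and MARTIAN_API_KEY from above
stream = completion(
    model="openai/openai/gpt-5:cheap",
    api_base="https://api.withmartian.com/v1",
    api_key=os.environ["MARTIAN_API_KEY"],
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)

# Chunks follow the OpenAI streaming format; delta.content may be None
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")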

Router for Load Balancing

from litellm import Router
import os

router = Router(
    model_list=[
        {
            "model_name": "martian-gpt-5",
            "litellm_params": {
                "model": "openai/openai/gpt-5:cheap",
                "api_base": "https://api.withmartian.com/v1",
                "api_key": os.environ["MARTIAN_API_KEY"]
            }
        },
        {
            "model_name": "martian-sonnet-4",
            "litellm_params": {
                "model": "anthropic/anthropic/claude-sonnet-4-20250514:cheap",
                "api_base": "https://api.withmartian.com/v1",
                "api_key": os.environ["MARTIAN_API_KEY"]
            }
        }
    ]
)

response = router.completion(
    model="martian-gpt-5",
    messages=[{"role": "user", "content": "Hello!"}]
)

See the LiteLLM Router documentation for advanced routing and load balancing features.
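
For example, fallbacks build on the same alias names. The sketch below reuses the model list above and passes LiteLLM's fallbacks parameter, which maps a primary alias to the aliases tried if it fails:

from litellm import Router
import os

router = Router(
    model_list=[
        {
            "model_name": "martian-gpt-5",
            "litellm_params": {
                "model": "openai/openai/gpt-5:cheap",
                "api_base": "https://api.withmartian.com/v1",
                "api_key": os.environ["MARTIAN_API_KEY"]
            }
        },
        {
            "model_name": "martian-sonnet-4",
            "litellm_params": {
                "model": "anthropic/anthropic/claude-sonnet-4-20250514:cheap",
                "api_base": "https://api.withmartian.com/v1",
                "api_key": os.environ["MARTIAN_API_KEY"]
            }
        }
    ],
    # If martian-gpt-5 errors, retry the same request on martian-sonnet-4
    fallbacks=[{"martian-gpt-5": ["martian-sonnet-4"]}]
)

response = router.completion(
    model="martian-gpt-5",
    messages=[{"role": "user", "content": "Hello!"}]
)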

Understanding the Model Name Format

The model name format is critical when using LiteLLM with Martian:

{litellm-provider}/{martian-provider}/{model-name}

Example: openai/openai/gpt-5:cheap

  1. First openai/: Tells LiteLLM to use the OpenAI-compatible endpoint format (Chat Completions)
  2. Second openai/: Tells Martian which provider the model belongs to
  3. gpt-5:cheap: The actual model name with optional :cheap suffix for cost optimization

Provider prefix selection:

  • Use the openai/ prefix for models served through the Chat Completions endpoint (/v1/chat/completions). Examples:

    • openai/openai/gpt-5
    • openai/google/gemini-2.5-flash
    • openai/meta-llama/llama-3.3-70b-instruct

    See the LiteLLM OpenAI Compatible docs.

  • Use the anthropic/ prefix for models served through the Messages endpoint (/v1/messages). Examples:

    • anthropic/anthropic/claude-sonnet-4-20250514
    • anthropic/openai/gpt-5

    See the LiteLLM Anthropic Provider docs.
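
For instance, calling a Messages-endpoint model through the SDK looks like this (a minimal sketch; it assumes LiteLLM's Anthropic provider honors the same api_base override used above, per the LiteLLM Anthropic Provider docs):

from litellm import completion
import os

# anthropic/ prefix -> LiteLLM speaks the Messages API to Martian
response = completion(
    model="anthropic/anthropic/claude-sonnet-4-20250514:cheap",
    api_base="https://api.withmartian.com/v1",
    api_key=os.environ["MARTIAN_API_KEY"],
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)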

LiteLLM Proxy Server

Run LiteLLM as a centralized proxy server for access control, spend tracking, and rate limiting.

Installation

docker pull ghcr.io/berriai/litellm:main-latest

Configuration File

Create a config.yaml file to configure LiteLLM Proxy with Martian models:

model_list:
  # GPT-5 via Martian with cost optimization
  - model_name: martian-gpt-5
    litellm_params:
      model: openai/openai/gpt-5:cheap
      api_base: https://api.withmartian.com/v1
      api_key: "os.environ/MARTIAN_API_KEY"

  # Claude Sonnet 4 via Martian with cost optimization
  - model_name: martian-sonnet-4
    litellm_params:
      model: anthropic/anthropic/claude-sonnet-4-20250514:cheap
      api_base: https://api.withmartian.com/v1
      api_key: "os.environ/MARTIAN_API_KEY"

  # Gemini 2.5 Flash via Martian
  - model_name: martian-gemini-flash
    litellm_params:
      model: openai/google/gemini-2.5-flash
      api_base: https://api.withmartian.com/v1
      api_key: "os.environ/MARTIAN_API_KEY"

general_settings:
  master_key: sk-1234  # Your LiteLLM admin key

Important: The provider must be specified twice in the model name:

  • The first segment (e.g., openai/ or anthropic/) tells LiteLLM which endpoint type to use
  • The second segment (e.g., openai/ or anthropic/) tells Martian which provider hosts the model

Format: {litellm-provider}/{martian-provider}/{model-name}

Provider Mapping

Use the correct LiteLLM provider prefix based on the API format:

  • openai/: use for the Chat Completions endpoint (/v1/chat/completions). Examples:

    • openai/openai/gpt-5
    • openai/google/gemini-2.5-flash
    • openai/meta-llama/llama-3.3-70b-instruct

  • anthropic/: use for the Messages endpoint (/v1/messages). Examples:

    • anthropic/anthropic/claude-sonnet-4-20250514
    • anthropic/openai/gpt-4.1-nano

See the LiteLLM OpenAI Compatible Providers and LiteLLM Anthropic Provider documentation for more details.

Start the Proxy

docker run \
    -v $(pwd)/config.yaml:/app/config.yaml \
    -e MARTIAN_API_KEY=your-martian-api-key \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-latest \
    --config /app/config.yaml

# Proxy running on http://0.0.0.0:4000

Make Requests

Once the proxy is running, make requests using the OpenAI SDK:

import openai

client = openai.OpenAI(
    api_key="sk-1234",  # Your LiteLLM master key
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="martian-gpt-5",  # Use the model_name from config
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
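
Generate a Virtual Key

To give each project its own scoped key, ask the proxy to mint a virtual key. The sketch below is a minimal example assuming the proxy above is running on port 4000; models and max_budget are standard LiteLLM /key/generate fields:

import requests

# Authenticate with the LiteLLM master key to mint a scoped virtual key
resp = requests.post(
    "http://0.0.0.0:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # LiteLLM master key
    json={
        "models": ["martian-gpt-5"],  # aliases this key may call
        "max_budget": 10              # spend cap (USD) for this key
    }
)

print(resp.json()["key"])  # virtual key to hand to the project

Clients then pass the returned key as api_key in place of the master key, and LiteLLM tracks spend and rate limits against it.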

Next Steps

  • View Available Models: Browse 200+ AI models from leading providers with real-time pricing.
  • OpenAI SDK Integration: Use the OpenAI SDK directly with Martian, without LiteLLM.
  • View Other Integrations: Explore other ways to integrate Martian with your development workflow.