LiteLLM Integration
Integrate Martian with LiteLLM to add an additional layer of routing, load balancing, and spend tracking on top of Martian's 200+ models.
Ensure you have your Martian API key from the Martian Dashboard before continuing.
Overview
LiteLLM is a popular open-source library that provides:
- LiteLLM Proxy Server: An LLM Gateway for centralized access control, spend tracking, and rate limiting
- LiteLLM Python SDK: Python client for load balancing and cost tracking across multiple LLMs
By combining LiteLLM with Martian, you get:
- Access to Martian's 200+ models
- LiteLLM's routing, load balancing, and fallback logic
- Centralized spend tracking and rate limiting
- Virtual keys for multi-project access control
LiteLLM Python SDK
Use LiteLLM's Python SDK directly in your code for routing and load balancing across Martian's models.
Installation
pip install litellm
Basic Usage
from litellm import completion
import os

os.environ["MARTIAN_API_KEY"] = "your-martian-api-key"

# Using OpenAI-compatible endpoint
response = completion(
    model="openai/openai/gpt-5:cheap",
    api_base="https://api.withmartian.com/v1",
    api_key=os.environ["MARTIAN_API_KEY"],
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
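The same call can also stream tokens as they arrive. A minimal sketch, assuming LiteLLM's standard stream=True flag and OpenAI-style delta chunks:

from litellm import completion
import os

# stream=True returns an iterator of OpenAI-style chunks instead of a
# single response object (same Martian model and endpoint as above)
response = completion(
    model="openai/openai/gpt-5:cheap",
    api_base="https://api.withmartian.com/v1",
    api_key=os.environ["MARTIAN_API_KEY"],
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)

for chunk in response:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)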
Router for Load Balancing
from litellm import Router
import os

router = Router(
    model_list=[
        {
            "model_name": "martian-gpt-5",
            "litellm_params": {
                "model": "openai/openai/gpt-5:cheap",
                "api_base": "https://api.withmartian.com/v1",
                "api_key": os.environ["MARTIAN_API_KEY"]
            }
        },
        {
            "model_name": "martian-sonnet-4",
            "litellm_params": {
                "model": "anthropic/anthropic/claude-sonnet-4-20250514:cheap",
                "api_base": "https://api.withmartian.com/v1",
                "api_key": os.environ["MARTIAN_API_KEY"]
            }
        }
    ]
)

response = router.completion(
    model="martian-gpt-5",
    messages=[{"role": "user", "content": "Hello!"}]
)
See the LiteLLM Router documentation for advanced routing and load balancing features.
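For example, the Router can fail over between the two Martian deployments defined above. A minimal sketch, assuming LiteLLM's fallbacks and num_retries Router parameters behave as documented:

from litellm import Router
import os

martian_params = {
    "api_base": "https://api.withmartian.com/v1",
    "api_key": os.environ["MARTIAN_API_KEY"],
}

router = Router(
    model_list=[
        {"model_name": "martian-gpt-5",
         "litellm_params": {"model": "openai/openai/gpt-5:cheap", **martian_params}},
        {"model_name": "martian-sonnet-4",
         "litellm_params": {"model": "anthropic/anthropic/claude-sonnet-4-20250514:cheap", **martian_params}},
    ],
    # If a call to martian-gpt-5 fails, retry the request against martian-sonnet-4
    fallbacks=[{"martian-gpt-5": ["martian-sonnet-4"]}],
    num_retries=2,
)

response = router.completion(
    model="martian-gpt-5",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)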
Understanding the Model Name Format
The model name format is critical when using LiteLLM with Martian: {litellm-provider}/{martian-provider}/{model-name}
Example: openai/openai/gpt-5:cheap
- First openai/: tells LiteLLM to use the OpenAI-compatible endpoint format (Chat Completions)
- Second openai/: tells Martian which provider the model belongs to
- gpt-5:cheap: the actual model name, with an optional :cheap suffix for cost optimization
Provider prefix selection (see the sketch after this list):
- Use the openai/ prefix for models that work with the Chat Completions endpoint (/v1/chat/completions). Examples: openai/openai/gpt-5, openai/google/gemini-2.5-flash, openai/meta-llama/llama-3.3-70b-instruct
- Use the anthropic/ prefix for models that work with the Messages endpoint (/v1/messages). Examples: anthropic/anthropic/claude-sonnet-4-20250514, anthropic/openai/gpt-5
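A minimal sketch of the two prefix styles from the Python SDK, assuming the same Martian api_base works for both (as in the proxy configuration below):

from litellm import completion
import os

martian = {
    "api_base": "https://api.withmartian.com/v1",
    "api_key": os.environ["MARTIAN_API_KEY"],
}

# Chat Completions format: LiteLLM's openai/ prefix, then Martian's provider
gpt = completion(
    model="openai/openai/gpt-5:cheap",
    messages=[{"role": "user", "content": "Hello!"}],
    **martian,
)

# Messages format: LiteLLM's anthropic/ prefix, then Martian's provider
claude = completion(
    model="anthropic/anthropic/claude-sonnet-4-20250514:cheap",
    messages=[{"role": "user", "content": "Hello!"}],
    **martian,
)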
LiteLLM Proxy Server
Run LiteLLM as a centralized proxy server for access control, spend tracking, and rate limiting.
Installation
docker pull ghcr.io/berriai/litellm:main-latest
Configuration File
Create a config.yaml file to configure LiteLLM Proxy with Martian models:
model_list:
  # GPT-5 via Martian with cost optimization
  - model_name: martian-gpt-5
    litellm_params:
      model: openai/openai/gpt-5:cheap
      api_base: https://api.withmartian.com/v1
      api_key: "os.environ/MARTIAN_API_KEY"

  # Claude Sonnet 4 via Martian with cost optimization
  - model_name: martian-sonnet-4
    litellm_params:
      model: anthropic/anthropic/claude-sonnet-4-20250514:cheap
      api_base: https://api.withmartian.com/v1
      api_key: "os.environ/MARTIAN_API_KEY"

  # Gemini 2.5 Flash via Martian
  - model_name: martian-gemini-flash
    litellm_params:
      model: openai/google/gemini-2.5-flash
      api_base: https://api.withmartian.com/v1
      api_key: "os.environ/MARTIAN_API_KEY"

general_settings:
  master_key: sk-1234  # Your LiteLLM admin key
Important: The provider must be specified twice in the model name:
- First provider (e.g., openai/ or anthropic/): tells LiteLLM which endpoint type to use
- Second provider (e.g., /openai/ or /anthropic/): tells Martian which provider the model is from
Format: {litellm-provider}/{martian-provider}/{model-name}
Provider Mapping
Use the correct LiteLLM provider prefix based on the API format:
- openai/: use for the Chat Completions endpoint (/v1/chat/completions). Examples: openai/openai/gpt-5, openai/google/gemini-2.5-flash, openai/meta-llama/llama-3.3-70b-instruct
- anthropic/: use for the Messages endpoint (/v1/messages). Examples: anthropic/anthropic/claude-sonnet-4-20250514, anthropic/openai/gpt-4.1-nano
See the LiteLLM OpenAI Compatible Providers and LiteLLM Anthropic Provider documentation for more details.
Start the Proxy
docker run \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -e MARTIAN_API_KEY=your-martian-api-key \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml

# Proxy running on http://0.0.0.0:4000
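To confirm the proxy is up, you can list the configured model names through its OpenAI-compatible models route. A minimal sketch using the OpenAI SDK and the master key from config.yaml:

import openai

# Point the OpenAI SDK at the local LiteLLM proxy
client = openai.OpenAI(
    api_key="sk-1234",              # the master_key from config.yaml
    base_url="http://0.0.0.0:4000"
)

# Should print the model_name entries from config.yaml
# (martian-gpt-5, martian-sonnet-4, martian-gemini-flash)
for model in client.models.list():
    print(model.id)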
Make Requests
Once the proxy is running, make requests using the OpenAI SDK:
import openai

client = openai.OpenAI(
    api_key="sk-1234",  # Your LiteLLM master key
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="martian-gpt-5",  # Use the model_name from config
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
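The proxy can also issue virtual keys for per-project spend tracking and rate limiting (see Virtual Keys under Useful Resources). A minimal sketch, assuming LiteLLM's /key/generate endpoint with its models and max_budget fields:

import requests

# Ask the proxy to mint a virtual key scoped to one Martian model with a
# spend cap; calls made with the returned key are tracked separately
# from the master key.
resp = requests.post(
    "http://0.0.0.0:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # master key
    json={
        "models": ["martian-gpt-5"],
        "max_budget": 10.0,  # USD
    },
)
virtual_key = resp.json()["key"]
print(virtual_key)  # use this as api_key in place of the master key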
Useful Resources
- LiteLLM Documentation: Official LiteLLM Docs - Complete documentation for LiteLLM Proxy and SDK
- OpenAI Compatible Provider: OpenAI Compatible Docs - For Chat Completions endpoint
- Anthropic Provider: Anthropic Provider Docs - For Messages endpoint
- Router & Load Balancing: Router SDK Docs - For load balancing with the Python SDK
- Virtual Keys: Key Management Docs - For spend tracking and rate limiting