LiteLLM Integration
Integrate Martian with LiteLLM to add an additional layer of routing, load balancing, and spend tracking on top of Martian's 200+ models.
Ensure you have your Martian API key from the Martian Dashboard before continuing.
Overview
LiteLLM is a popular open-source library that provides:
- LiteLLM Proxy Server: An LLM Gateway for centralized access control, spend tracking, and rate limiting
- LiteLLM Python SDK: Python client for load balancing and cost tracking across multiple LLMs
By combining LiteLLM with Martian, you get:
- Access to Martian's 200+ models
- LiteLLM's routing, load balancing, and fallback logic
- Centralized spend tracking and rate limiting
- Virtual keys for multi-project access control
LiteLLM Python SDK
Use LiteLLM's Python SDK directly in your code for routing and load balancing across Martian's models.
Installation
pip install litellm
Basic Usage
from litellm import completion
import os

os.environ["MARTIAN_API_KEY"] = "your-martian-api-key"

# Using OpenAI-compatible endpoint
response = completion(
    model="openai/openai/gpt-5:cheap",
    api_base="https://api.withmartian.com/v1",
    api_key=os.environ["MARTIAN_API_KEY"],
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
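The same call can also stream tokens as they arrive. A minimal sketch, assuming LiteLLM's standard stream=True flag and OpenAI-style delta chunks:

from litellm import completion
import os

# stream=True returns an iterator of OpenAI-style chunks instead of a
# single response object (same Martian model and endpoint as above)
response = completion(
    model="openai/openai/gpt-5:cheap",
    api_base="https://api.withmartian.com/v1",
    api_key=os.environ["MARTIAN_API_KEY"],
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
)

for chunk in response:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)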
Router for Load Balancing
from litellm import Router
import os

router = Router(
    model_list=[
        {
            "model_name": "martian-gpt-5",
            "litellm_params": {
                "model": "openai/openai/gpt-5:cheap",
                "api_base": "https://api.withmartian.com/v1",
                "api_key": os.environ["MARTIAN_API_KEY"]
            }
        },
        {
            "model_name": "martian-sonnet-4",
            "litellm_params": {
                "model": "anthropic/anthropic/claude-sonnet-4-20250514:cheap",
                "api_base": "https://api.withmartian.com/v1",
                "api_key": os.environ["MARTIAN_API_KEY"]
            }
        }
    ]
)

response = router.completion(
    model="martian-gpt-5",
    messages=[{"role": "user", "content": "Hello!"}]
)
See the LiteLLM Router documentation for advanced routing and load balancing features.
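For example, the Router can fail over between the two Martian deployments defined above. A minimal sketch, assuming LiteLLM's fallbacks and num_retries Router parameters behave as documented:

from litellm import Router
import os

martian_params = {
    "api_base": "https://api.withmartian.com/v1",
    "api_key": os.environ["MARTIAN_API_KEY"],
}

router = Router(
    model_list=[
        {"model_name": "martian-gpt-5",
         "litellm_params": {"model": "openai/openai/gpt-5:cheap", **martian_params}},
        {"model_name": "martian-sonnet-4",
         "litellm_params": {"model": "anthropic/anthropic/claude-sonnet-4-20250514:cheap", **martian_params}},
    ],
    # If a call to martian-gpt-5 fails, retry the request against martian-sonnet-4
    fallbacks=[{"martian-gpt-5": ["martian-sonnet-4"]}],
    num_retries=2,
)

response = router.completion(
    model="martian-gpt-5",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)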
Understanding the Model Name Format
The model name format is critical when using LiteLLM with Martian: {litellm-provider}/{martian-provider}/{model-name}
Example: openai/openai/gpt-5:cheap
- First openai/: tells LiteLLM to use the OpenAI-compatible endpoint format (Chat Completions)
- Second openai/: tells Martian which provider the model belongs to
- gpt-5:cheap: the actual model name, with an optional :cheap suffix for cost optimization
Provider prefix selection (see the sketch after this list):
- Use the openai/ prefix for models that work with the Chat Completions endpoint (/v1/chat/completions). Examples: openai/openai/gpt-5, openai/google/gemini-2.5-flash, openai/meta-llama/llama-3.3-70b-instruct
- Use the anthropic/ prefix for models that work with the Messages endpoint (/v1/messages). Examples: anthropic/anthropic/claude-sonnet-4-20250514, anthropic/openai/gpt-5
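A minimal sketch of the two prefix styles from the Python SDK, assuming the same Martian api_base works for both (as in the proxy configuration below):

from litellm import completion
import os

martian = {
    "api_base": "https://api.withmartian.com/v1",
    "api_key": os.environ["MARTIAN_API_KEY"],
}

# Chat Completions format: LiteLLM's openai/ prefix, then Martian's provider
gpt = completion(
    model="openai/openai/gpt-5:cheap",
    messages=[{"role": "user", "content": "Hello!"}],
    **martian,
)

# Messages format: LiteLLM's anthropic/ prefix, then Martian's provider
claude = completion(
    model="anthropic/anthropic/claude-sonnet-4-20250514:cheap",
    messages=[{"role": "user", "content": "Hello!"}],
    **martian,
)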
LiteLLM Proxy Server
Run LiteLLM as a centralized proxy server for access control, spend tracking, and rate limiting.
Installation
docker pull ghcr.io/berriai/litellm:main-latest
Configuration File
Create a config.yaml file to configure LiteLLM Proxy with Martian models:
model_list:
  # GPT-5 via Martian with cost optimization
  - model_name: martian-gpt-5
    litellm_params:
      model: openai/openai/gpt-5:cheap
      api_base: https://api.withmartian.com/v1
      api_key: "os.environ/MARTIAN_API_KEY"

  # Claude Sonnet 4 via Martian with cost optimization
  - model_name: martian-sonnet-4
    litellm_params:
      model: anthropic/anthropic/claude-sonnet-4-20250514:cheap
      api_base: https://api.withmartian.com/v1
      api_key: "os.environ/MARTIAN_API_KEY"

  # Gemini 2.5 Flash via Martian
  - model_name: martian-gemini-flash
    litellm_params:
      model: openai/google/gemini-2.5-flash
      api_base: https://api.withmartian.com/v1
      api_key: "os.environ/MARTIAN_API_KEY"

general_settings:
  master_key: sk-1234  # Your LiteLLM admin key
Important: The provider must be specified twice in the model name:
- First provider (e.g., openai/ or anthropic/): tells LiteLLM which endpoint type to use
- Second provider (e.g., /openai/ or /anthropic/): tells Martian which provider the model is from
Format: {litellm-provider}/{martian-provider}/{model-name}
Provider Mapping
Use the correct LiteLLM provider prefix based on the API format:
- openai/: use for the Chat Completions endpoint (/v1/chat/completions). Examples: openai/openai/gpt-5, openai/google/gemini-2.5-flash, openai/meta-llama/llama-3.3-70b-instruct
- anthropic/: use for the Messages endpoint (/v1/messages). Examples: anthropic/anthropic/claude-sonnet-4-20250514, anthropic/openai/gpt-4.1-nano
See the LiteLLM OpenAI Compatible Providers and LiteLLM Anthropic Provider documentation for more details.
Start the Proxy
docker run \
  -v $(pwd)/config.yaml:/app/config.yaml \
  -e MARTIAN_API_KEY=your-martian-api-key \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml

# Proxy running on http://0.0.0.0:4000
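To confirm the proxy is up, you can list the configured model names through its OpenAI-compatible models route. A minimal sketch using the OpenAI SDK and the master key from config.yaml:

import openai

# Point the OpenAI SDK at the local LiteLLM proxy
client = openai.OpenAI(
    api_key="sk-1234",              # the master_key from config.yaml
    base_url="http://0.0.0.0:4000"
)

# Should print the model_name entries from config.yaml
# (martian-gpt-5, martian-sonnet-4, martian-gemini-flash)
for model in client.models.list():
    print(model.id)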
Make Requests
Once the proxy is running, make requests using the OpenAI SDK:
import openai

client = openai.OpenAI(
    api_key="sk-1234",  # Your LiteLLM master key
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="martian-gpt-5",  # Use the model_name from config
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
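The proxy can also issue virtual keys for per-project spend tracking and rate limiting (see Virtual Keys under Useful Resources). A minimal sketch, assuming LiteLLM's /key/generate endpoint with its models and max_budget fields:

import requests

# Ask the proxy to mint a virtual key scoped to one Martian model with a
# spend cap; calls made with the returned key are tracked separately
# from the master key.
resp = requests.post(
    "http://0.0.0.0:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # master key
    json={
        "models": ["martian-gpt-5"],
        "max_budget": 10.0,  # USD
    },
)
virtual_key = resp.json()["key"]
print(virtual_key)  # use this as api_key in place of the master key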
Useful Resources
- LiteLLM Documentation: Official LiteLLM Docs - Complete documentation for LiteLLM Proxy and SDK
- OpenAI Compatible Provider: OpenAI Compatible Docs - For Chat Completions endpoint
- Anthropic Provider: Anthropic Provider Docs - For Messages endpoint
- Router & Load Balancing: Router SDK Docs - For load balancing with the Python SDK
- Virtual Keys: Key Management Docs - For spend tracking and rate limiting