Hello World!

Let's walk through a simple example of using the Martian Model Router. This example will illustrate a few features:

  • How we can use the model gateway to transition to a router

  • How we can use a router to choose which model to use

  • How we benefit from using a router

Get a Martian API Key

Getting started takes two steps:

  1. Log in and create an organization

  2. Create an API Key
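
To avoid pasting the key into every command, you can export it as an environment variable (the variable name below is just a convention we're assuming, not something Martian requires):

export MARTIAN_API_KEY="<YOUR_MARTIAN_API_KEY>"

With that set, the Authorization header in the examples below can be written as -H "Authorization: Bearer $MARTIAN_API_KEY".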

Make Your First Request

Using Martian is as simple as making an API call in your terminal.

curl https://withmartian.com/api/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_MARTIAN_API_KEY>" \
  -d '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "Hello world!"
    }
  ],
  "temperature": 1
}'

This is identical to making API calls to the OpenAI API, with some caveats noted here.
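
Because the endpoint is OpenAI-compatible, responses follow the OpenAI chat completion schema. As a quick sketch (assuming you have jq installed), you can extract just the assistant's reply:

curl https://withmartian.com/api/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_MARTIAN_API_KEY>" \
  -d '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "Hello world!"
    }
  ]
}' | jq -r '.choices[0].message.content'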

Benchmarking the Router

There are two benefits to making calls this way. First, it serves as a model gateway, so you can access models from other providers. For example, if we want to use Anthropic instead, we can simply change the model field.

curl https://withmartian.com/api/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_MARTIAN_API_KEY>" \
  -d '{
  "model": "claude-2.1",
  "messages": [
    {
      "role": "user",
      "content": "Hello world!"
    }
  ],
  "temperature": 1
}'

But the deeper benefit is that it allows you to safely and seamlessly switch to using a router. You can first integrate Martian by making calls to the model you were already using. Once you've made enough calls, we send you a detailed report on the router's performance, with both qualitative examples and quantitative statistics. That lets you switch to the router once you're confident in its performance.

You can learn more about our patent-pending benchmarking here.

Switch To The Router

To use the router, just change the model field to "router" -- we'll find the right model for you!

curl https://withmartian.com/api/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_MARTIAN_API_KEY>" \
  -d '{
  "model": "router",
  "messages": [
    {
      "role": "user",
      "content": "Hello world!"
    }
  ],
  "temperature": 1
}'

By default, we route to the model which will give the highest performance on your specific prompt. You can read about how we do that here.

By routing to the model with the highest performance, instead of choosing a single model, we're able to outperform GPT-4 on the evaluation set OpenAI uses (openai/evals). You can reproduce those results in this colab notebook.

Set Routing Parameters

Once you switch to the router, you can control the criteria used for routing.

For example, if you want to route between a specific set of models, you can include them as a list in the models field. We'll only route between the models you specify in that field.

curl https://withmartian.com/api/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_MARTIAN_API_KEY>" \
  -d '{
  "model": "router",
  "models": ["gpt-3.5-turbo", "claude-2.1"],
  "messages": [
    {
      "role": "user",
      "content": "Hello world!"
    }
  ],
  "temperature": 1
}'

If you care not only about performance but also about other factors such as cost, you can specify the max_cost for your request, or your willingness_to_pay, i.e. how many dollars you are willing to pay for a 10% better answer on this request.

curl https://withmartian.com/api/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_MARTIAN_API_KEY>" \
  -d '{
  "model": "router",
  "models": ["gpt-3.5-turbo", "claude-2.1"],
  "max_cost": 0.02,
  "willingness_to_pay": 0.01,
  "messages": [
    {
      "role": "user",
      "content": "Hello world!"
    }
  ],
  "temperature": 1
}'

These parameters let you tightly control your unit economics. For example, if you have a free product, you can use the max_cost parameter to put a hard ceiling on model spend, making the model-cost component of your customer acquisition cost (CAC) predictable. Or, if you know how much a correct answer is worth in your product, you can set the willingness_to_pay parameter and get answers that balance cost and performance.
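
To make that concrete with purely hypothetical numbers: at a max_cost of $0.002 per request and an average of 100 free-tier requests per user, model spend contributes at most $0.002 × 100 = $0.20 per user to your CAC.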

Finally, you can attach arbitrary metadata to each request via the optional extra parameter.

curl https://withmartian.com/api/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_MARTIAN_API_KEY>" \
  -d '{
  "model": "gpt-4",
  "messages": [
    {
      "role": "user",
      "content": "Write small poem"
    }
  ],
  "extra": {
    "ip": "123.123.123.123",
    "Timezone": "UTC+0",
    "Country": "US",
    "City": "New York"
  }
}'

Other Benefits Of The Router

In addition to higher performance and lower cost, the router offers several other benefits. It lets you:

  • Use fallbacks for higher uptime. Instead of relying on a single model or provider, which could go down at any time, the router lets you fall back to other providers when one is down.

  • Future-proof your tech. When new models come out, they're automatically added to the router. That way, you get the latest and greatest in performance. If you're a startup or enterprise that wants access to specific router versions for stable releases, contact us: contact@withmartian.com.

  • Focus on product. Testing every model and reading every research paper to stay at the performance frontier would require an entire engineering team. With the router, you can stay ahead of the competition and stop fiddling with models. Focus on your specialty: building a product your customers love.
