Switching To The Router

To use the router, just change the model field to router -- we'll find the right model for you!

curl https://withmartian.com/api/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_MARTIAN_API_KEY>" \
  -d '{
  "model": "router",
  "messages": [
    {
      "role": "user",
      "content": "Hello world!"
    }
  ],
  "temperature": 1
}'

By default, we route to the model which will give the highest performance on your specific prompt. You can read about how we do that here.

By routing to the model with the highest performance, instead of choosing a single model, we're able to outperform GPT-4 on the evaluation set OpenAI uses (openai/evals). You can reproduce those results in this colab notebook.
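The same switch can be made from code. A minimal sketch using Python's standard library (the endpoint, headers, and field names are taken from the curl example above; the helper names are ours):

```python
import json
import urllib.request

MARTIAN_URL = "https://withmartian.com/api/openai/v1/chat/completions"

def build_router_payload(prompt, temperature=1):
    # Same request body as the curl example, with "model" set to "router".
    return {
        "model": "router",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def ask_router(api_key, prompt):
    # Sends the same request curl does; requires a valid Martian API key.
    req = urllib.request.Request(
        MARTIAN_URL,
        data=json.dumps(build_router_payload(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```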

Set Routing Parameters

Once you switch to the router, you can control the criteria used for routing.

For example, if you want to route between a specific set of models, you can include them as a list in the models field. We'll only route between the models you specify in that field.

curl https://withmartian.com/api/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_MARTIAN_API_KEY>" \
  -d '{
  "model": "router",
  "models": ["gpt-3.5-turbo", "claude-2.1"],
  "messages": [
    {
      "role": "user",
      "content": "Hello world!"
    }
  ],
  "temperature": 1
}'

If you care not only about performance but also about other factors such as cost, you can set max_cost (the maximum cost for the request) or willingness_to_pay (how many dollars you are willing to pay for a 10% better answer on this request).

curl https://withmartian.com/api/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_MARTIAN_API_KEY>" \
  -d '{
  "model": "router",
  "models": ["gpt-3.5-turbo", "claude-2.1"],
  "max_cost": 0.02,
  "willingness_to_pay": 0.01,
  "messages": [
    {
      "role": "user",
      "content": "Hello world!"
    }
  ],
  "temperature": 1
}'

These parameters let you tightly control your unit economics. For example, if you offer a free product, you can use the max_cost parameter to cap spend per request and keep your customer acquisition cost (CAC) predictable. Or, if you know how much a correct answer is worth in your product, you can set the willingness_to_pay parameter and get answers that balance cost and performance.
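To illustrate the economics behind willingness_to_pay, here is a toy selection rule based on the "dollars per 10% better answer" definition above. The models, prices, and quality scores are made up, and this is our reading of the trade-off, not Martian's internal algorithm:

```python
# Hypothetical candidates: (name, cost per request in dollars, expected quality 0..1).
CANDIDATES = [
    ("cheap-model", 0.001, 0.70),
    ("strong-model", 0.015, 0.85),
]

def pick_model(candidates, willingness_to_pay):
    """Pick the model whose quality gain over the cheapest option is worth
    its extra cost, at willingness_to_pay dollars per 10% of quality gain."""
    cheapest = min(candidates, key=lambda c: c[1])

    def net_benefit(candidate):
        _name, cost, quality = candidate
        gain = quality - cheapest[2]                 # quality gain vs. cheapest
        value = (gain / 0.10) * willingness_to_pay   # dollars that gain is worth
        return value - (cost - cheapest[1])          # value minus extra cost

    return max(candidates, key=net_benefit)[0]
```

At willingness_to_pay=0.01, the 15% quality gain of "strong-model" is worth $0.015, which exceeds its $0.014 extra cost, so it wins; halve the willingness to pay and the cheap model wins instead.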

Finally, you can attach arbitrary metadata to each request with the optional extra parameter.

curl https://withmartian.com/api/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_MARTIAN_API_KEY>" \
  -d '{
  "model": "gpt-4",
  "messages": [
    {
      "role": "user",
      "content": "Write a small poem"
    }
  ],
  "extra": {
    "ip": "123.123.123.123",
    "timezone": "UTC+0",
    "country": "US",
    "city": "New York"
  }
}'

Other Benefits Of The Router

In addition to providing higher performance and lower cost, the router provides several other benefits. The router lets you:

  • Use fallbacks for higher uptime. Instead of relying on a single model or provider, which could go down at any time, the router switches to other providers in the event that one is down.

  • Future-proof your tech. When new models come out, they're automatically added to the router, so you always get the latest and greatest in performance. If you're a startup or enterprise that wants to access specific router versions for stable releases, contact us: contact@withmartian.com

  • Focus on product. Testing every model and reading every research paper to stay at the peak of performance would require hiring an entire engineering team. With the router, you can stay ahead of your competition and stop fiddling with models. Focus on your specialty: building a product your customers love.
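The router handles provider fallbacks for you server-side, but the pattern is easy to sketch if you also want a client-side safety net. A minimal illustration (the provider functions are stand-ins, not real API calls):

```python
def call_with_fallback(prompt, providers):
    """Try each provider in order; return the first successful answer."""
    last_error = None
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # a real client would catch specific errors
            last_error = exc
    raise RuntimeError("all providers failed") from last_error

# Stand-in providers: the first is "down", the second answers.
def flaky_provider(prompt):
    raise ConnectionError("provider unavailable")

def backup_provider(prompt):
    return f"echo: {prompt}"
```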
