LIVE
OPUS 4.7: $15 / $75 per Mtok
SONNET 4.6: $3 / $15 per Mtok
GPT-5.5: $10 / $30 per Mtok
GEMINI 3.1: $3.50 / $10.50 per Mtok
SWE-BENCH leader: Claude Opus 4.7, 72.1%
MMLU-PRO leader: Opus 4.7, 88.4
VALS FINANCE leader: Opus 4.7, 64.4%
AFTA v1.0 whitepaper live at /whitepaper

Inference Cheapest

Free
GET /api/inference-providers/cheapest

Agent-friendly entry point into the inference provider matrix. Pass a canonical model id and get back the top 3 cheapest offers (default sort is blended price). The response omits the full matrix payload, which helps when an agent only needs to pick the cheapest path and does not need every column.

When to use this endpoint

Use it when your agent needs the cheapest inference path for a specific open-weight model in a single call. For a different sort order, pass ?sort=input, ?sort=output, or ?sort=tps_desc.

Parameters

Name | In | Type | Description
model* | query | string | Canonical model id (llama-4-maverick, llama-4-scout, llama-3.1-70b, llama-3.1-405b, deepseek-v4-pro, deepseek-v4-flash, mixtral-8x22b, qwen-2.5-72b), e.g. llama-4-scout
sort | query | string | Sort order: blended (default), input, output, tps_desc, e.g. tps_desc

* required
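
As a minimal sketch of how the two query parameters combine into a request URL, the snippet below validates the sort value against the table above and builds the query string with the standard library. The helper name build_cheapest_url is hypothetical and not part of any TensorFeed SDK; the host is taken from the TypeScript sample on this page.

```python
from urllib.parse import urlencode

# Host and path as they appear in the TypeScript sample on this page.
BASE_URL = "https://tensorfeed.ai/api/inference-providers/cheapest"
# Sort values listed in the parameter table.
VALID_SORTS = {"blended", "input", "output", "tps_desc"}


def build_cheapest_url(model: str, sort: str = "blended") -> str:
    """Hypothetical helper: build the request URL for a model id and sort order."""
    if not model:
        raise ValueError("model is required")
    if sort not in VALID_SORTS:
        raise ValueError(f"unsupported sort: {sort!r}")
    params = {"model": model}
    # Leave out sort when it equals the documented default (blended).
    if sort != "blended":
        params["sort"] = sort
    return f"{BASE_URL}?{urlencode(params)}"


print(build_cheapest_url("llama-4-scout", sort="tps_desc"))
```

Rejecting an unknown sort value on the client is a design choice of this sketch; how the server treats unrecognized values is not documented here.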

Example response

{
  "ok": true,
  "modelId": "llama-4-scout",
  "modelName": "Llama 4 Scout",
  "family": "Meta",
  "sortBy": "blended",
  "cheapest": { "provider": "DeepInfra", "blendedPrice": 0.355, "inputPrice": 0.16, "outputPrice": 0.55, "contextWindow": 10000000, "outputTPS": 170 },
  "top3": [
    { "provider": "DeepInfra", "blendedPrice": 0.355 },
    { "provider": "OpenRouter", "blendedPrice": 0.385 },
    { "provider": "Groq", "blendedPrice": 0.385 }
  ]
}
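
To illustrate how an agent might read this payload, here is a small sketch that parses the example response above and pulls out the winner plus the remaining providers from top3. The function name summarize is hypothetical, and the code assumes every response has the same fields as the example.

```python
import json

# The example response from this page, embedded so the sketch runs offline.
SAMPLE = """
{
  "ok": true,
  "modelId": "llama-4-scout",
  "modelName": "Llama 4 Scout",
  "family": "Meta",
  "sortBy": "blended",
  "cheapest": { "provider": "DeepInfra", "blendedPrice": 0.355, "inputPrice": 0.16, "outputPrice": 0.55, "contextWindow": 10000000, "outputTPS": 170 },
  "top3": [
    { "provider": "DeepInfra", "blendedPrice": 0.355 },
    { "provider": "OpenRouter", "blendedPrice": 0.385 },
    { "provider": "Groq", "blendedPrice": 0.385 }
  ]
}
"""


def summarize(data: dict) -> tuple:
    """Hypothetical helper: return (provider, blended price, fallback providers)."""
    if not data.get("ok"):
        raise ValueError("response was not ok")
    best = data["cheapest"]
    # Every top3 entry except the winner, in the order the API returned them.
    fallbacks = [o["provider"] for o in data["top3"] if o["provider"] != best["provider"]]
    return best["provider"], best["blendedPrice"], fallbacks


data = json.loads(SAMPLE)
print(summarize(data))
```

In the example, OpenRouter and Groq tie at 0.385, so an agent that needs a deterministic fallback may want its own tie-breaker rather than relying on response order.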

Code samples

Python SDK

from tensorfeed import TensorFeed

# Look up the cheapest offers for one canonical model id.
tf = TensorFeed()
result = tf.inference_cheapest("llama-4-scout")
cheapest = result["cheapest"]
print(f"Cheapest: {cheapest['provider']} at ${cheapest['blendedPrice']:.3f}/1M blended")

TypeScript SDK

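// Plain fetch call against the endpoint; a non-2xx status (e.g. 404) is surfaced as an error.
const res = await fetch("https://tensorfeed.ai/api/inference-providers/cheapest?model=llama-4-scout");
if (!res.ok) throw new Error(`Request failed: ${res.status}`);
const result = await res.json();
console.log(`Cheapest: ${result.cheapest.provider} @ $${result.cheapest.blendedPrice}`);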

FAQ

What if my model is not in the matrix?

The endpoint returns 404 model_not_found. List the available models at /api/inference-providers. The catalog is editorial: we track the most-served open-weight models and add new ones on request if you need a model we do not cover.
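
A caller can treat the 404 as "model not covered" rather than as a hard failure. The sketch below is one way to do that with the Python standard library; fetch_cheapest and its opener parameter are hypothetical names, and the code checks only the HTTP status because the shape of the error body is not documented on this page.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_cheapest(url: str, opener: Callable = urllib.request.urlopen) -> Optional[dict]:
    """Hypothetical helper: return the parsed response, or None on a 404."""
    try:
        with opener(url, timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        # 404 is the documented status for a model that is not in the matrix.
        if err.code == 404:
            return None
        # Any other HTTP error is unexpected here, so let it propagate.
        raise
```

Calling fetch_cheapest with the endpoint URL and checking for None lets an agent fall back to listing /api/inference-providers. The opener argument exists only so the function can be exercised without network access.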

Why is the sort default blended and not input?

Because real workloads have non-zero output usage. The blended price, computed at a 1:1 input:output ratio, is a better proxy for actual cost than the input price alone. In the example response, (0.16 + 0.55) / 2 = 0.355. If your workload is input-heavy or output-heavy, pass ?sort=input or ?sort=output.
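
The arithmetic behind the blended figure can be sketched as a weighted average. With the default 1:1 ratio it reproduces the 0.355 from the example response. The input_share parameter is an assumption added for illustration: it shows how a client could re-weight prices locally for a skewed workload, and it is not something the endpoint accepts.

```python
def blended_price(input_price: float, output_price: float, input_share: float = 0.5) -> float:
    """Weighted average of input and output prices per 1M tokens.

    input_share is the fraction of tokens that are input tokens; 0.5 is the
    1:1 ratio described in the FAQ. Other values are a client-side assumption.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * input_price + (1.0 - input_share) * output_price


# DeepInfra prices from the example response: 0.16 input, 0.55 output.
print(round(blended_price(0.16, 0.55), 3))
# Same prices for a workload that is 90% input tokens.
print(round(blended_price(0.16, 0.55, input_share=0.9), 3))
```

Because the server sorts on the 1:1 blend only, a client with a very skewed ratio would need the per-provider input and output prices to re-rank offers this way; the top3 entries in the example carry only the blended price.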

Related endpoints