Open-Weights Deployment
What you actually need to self-host the major open-weights models. VRAM per quantization (FP16, FP8, AWQ INT4, GGUF Q4_K_M), recommended GPU class, license, capabilities. The “I want to run this myself” companion to /inference-providers (hosted pricing).
For agents: same data at /api/open-weights. Filter with ?family=Meta|DeepSeek|Alibaba|Mistral|Google|Microsoft. Free, no auth, cached 10 min.