Open-Weights Deployment

What you actually need to self-host the major open-weights models. VRAM per quantization (FP16, FP8, NVFP4, AWQ INT4, GGUF Q4_K_M), recommended GPU class, license, capabilities. This is the “I want to run this myself” side of the market. Rent it instead via /inference-providers (hosted open-weights pricing) or /models (frontier closed-model pricing per token). The closed bar these models are measured against moved on July 24, 2026: Claude Opus 5 shipped at $5/$25 per 1M tokens, holding Opus 4.8 pricing while posting 70.6 percent on OSWorld 2.0 and 90.8 on BrowseComp (vendor-reported), so the open-vs-closed capability gap widened at an unchanged closed-model price. Compare open models on capability at /best-open-source-llms.

Machine-readable JSON/api/open-weights