# 250B+ Models
This group contains 26 profiles. Use this hub page to compare each model's practical VRAM floor, optimal VRAM allocation, and best local-vs-cloud deployment path.
| Model | Data source | Min VRAM | Optimal VRAM | Best local GPU | Cloud fallback | Detail |
|---|---|---|---|---|---|---|
| Llama 4 128X17B FP16 | Estimated | 430GB | 442GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 4 128X17B Q4 | Estimated | 418GB | 428GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 4 128X17B Q5 | Estimated | 420GB | 430GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 4 128X17B Q8 | Estimated | 424GB | 434GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| DeepSeek-R1 671B FP16 | Estimated | 430GB | 442GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| DeepSeek-R1 671B Q4 | Estimated | 418GB | 428GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| DeepSeek-R1 671B Q5 | Estimated | 420GB | 430GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| DeepSeek-R1 671B Q8 | Estimated | 424GB | 434GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| DeepSeek-V3 671B FP16 | Estimated | 430GB | 442GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| DeepSeek-V3 671B Q4 | Estimated | 418GB | 428GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| DeepSeek-V3 671B Q5 | Estimated | 420GB | 430GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| DeepSeek-V3 671B Q8 | Estimated | 424GB | 434GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 Coder 480B CLOUD | Estimated | 215GB | 223GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 Coder 480B FP16 | Estimated | 225GB | 237GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 Coder 480B Q4 | Estimated | 213GB | 223GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 Coder 480B Q5 | Estimated | 215GB | 225GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 Coder 480B Q8 | Estimated | 219GB | 229GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 3.1 405B FP16 | Estimated | 225GB | 237GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 3.1 405B Q4 | Estimated | 213GB | 223GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 3.1 405B Q5 | Estimated | 215GB | 225GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 3.1 405B Q8 | Estimated | 219GB | 229GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3.5 397B-A17B CLOUD | Estimated | 215GB | 223GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 4 16X17B FP16 | Estimated | 225GB | 237GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 4 16X17B Q4 | Estimated | 213GB | 223GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 4 16X17B Q5 | Estimated | 215GB | 225GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 4 16X17B Q8 | Estimated | 219GB | 229GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
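The VRAM floors in the table above are estimates, but the pattern behind them follows a simple rule of thumb: weight memory scales with parameter count times effective bits per weight, plus a small runtime overhead. The sketch below is not the methodology used for this table; it is a minimal back-of-the-envelope calculator under assumed effective bit widths (GGUF-style quants carry per-block scales, so Q4 lands near 4.5 effective bits rather than a nominal 4).

```python
def vram_estimate_gb(params_b: float, bits_per_weight: float, overhead_gb: float = 2.0) -> float:
    """Rough VRAM floor for holding model weights.

    params_b:        parameter count in billions (e.g. 405 for Llama 3.1 405B).
    bits_per_weight: effective bits, e.g. 16 for FP16, ~8.5 for Q8, ~4.5 for Q4
                     (assumed values; quant formats add per-block scale overhead).
    overhead_gb:     assumed fixed runtime overhead; KV cache and activations
                     are NOT included, so real usage grows with context length.
    """
    weights_gb = params_b * bits_per_weight / 8  # 1B params at 8 bits = 1 GB
    return weights_gb + overhead_gb

# Llama 3.1 405B at an assumed ~4.5 effective bits per weight:
print(round(vram_estimate_gb(405, 4.5)))  # ≈ 230, in the same ballpark as the Q4 row above
```

Numbers from a formula like this will drift from the table by a few percent depending on the exact quant mix and runtime, which is why the rows above carry both a minimum and an optimal figure.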