# 73B-250B Models
This group contains 34 model profiles. Use this hub page to compare each profile's practical VRAM floor, optimal VRAM allocation, and the best local-versus-cloud deployment path.
| Model | Data source | VRAM (min) | VRAM (optimal) | Best local GPU | Cloud fallback | Detail |
|---|---|---|---|---|---|---|
| DeepSeek Coder V2 236B FP16 | Estimated | 150GB | 162GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| DeepSeek Coder V2 236B Q4 | Estimated | 138GB | 148GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| DeepSeek Coder V2 236B Q5 | Estimated | 140GB | 150GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| DeepSeek Coder V2 236B Q8 | Estimated | 144GB | 154GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 235B FP16 | Estimated | 150GB | 162GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 235B Q4 | Estimated | 138GB | 148GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 235B Q5 | Estimated | 140GB | 150GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 235B Q8 | Estimated | 144GB | 154GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 VL 235B CLOUD | Estimated | 140GB | 148GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 VL 235B FP16 | Estimated | 150GB | 162GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 VL 235B Q4 | Estimated | 138GB | 148GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 VL 235B Q5 | Estimated | 140GB | 150GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3 VL 235B Q8 | Estimated | 144GB | 154GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Mixtral 8X22B FP16 | Estimated | 150GB | 162GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Mixtral 8X22B Q4 | Estimated | 138GB | 148GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Mixtral 8X22B Q5 | Estimated | 140GB | 150GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Mixtral 8X22B Q8 | Estimated | 144GB | 154GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3.5 122B FP16 | Estimated | 150GB | 162GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3.5 122B Q4 | Estimated | 138GB | 148GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3.5 122B Q5 | Estimated | 140GB | 150GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen3.5 122B Q8 | Estimated | 144GB | 154GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| GPT-OSS 120B CLOUD | Estimated | 70GB | 78GB | Dual RTX 4090 (model parallel) | A100 80GB | Open |
| GPT-OSS 120B FP16 | Estimated | 80GB | 92GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| GPT-OSS 120B Q4 | Estimated | 68GB | 78GB | Dual RTX 4090 (model parallel) | A100 80GB | Open |
| GPT-OSS 120B Q5 | Estimated | 70GB | 80GB | Dual RTX 4090 (model parallel) | A100 80GB | Open |
| GPT-OSS 120B Q8 | Estimated | 74GB | 84GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen 110B FP16 | Estimated | 80GB | 92GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Qwen 110B Q4 | Estimated | 68GB | 78GB | Dual RTX 4090 (model parallel) | A100 80GB | Open |
| Qwen 110B Q5 | Estimated | 70GB | 80GB | Dual RTX 4090 (model parallel) | A100 80GB | Open |
| Qwen 110B Q8 | Estimated | 74GB | 84GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 3.2 Vision 90B FP16 | Estimated | 80GB | 92GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
| Llama 3.2 Vision 90B Q4 | Estimated | 68GB | 78GB | Dual RTX 4090 (model parallel) | A100 80GB | Open |
| Llama 3.2 Vision 90B Q5 | Estimated | 70GB | 80GB | Dual RTX 4090 (model parallel) | A100 80GB | Open |
| Llama 3.2 Vision 90B Q8 | Estimated | 74GB | 84GB | Cloud-first (no practical single-GPU local) | H100/H200 class | Open |
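As a rough cross-check on the estimated figures above, VRAM needs can be sketched from parameter count and bits per weight. The sketch below is a rule of thumb, not the methodology behind this table: the bits-per-weight values are common GGUF-style approximations, and the 20% overhead factor (for KV cache, activations, and runtime buffers) is an assumption. Real footprints vary with context length, batch size, and inference runtime.

```python
# Rough VRAM estimator by quantization level.
# Bits-per-weight values are GGUF-style approximations (assumption);
# the 1.2x overhead factor covers KV cache, activations, and runtime
# buffers, and is also an assumption, not a figure from this page.

BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8": 8.5,   # Q8_0-style: 8-bit weights plus scale metadata
    "Q5": 5.5,   # Q5_K_M-style approximation
    "Q4": 4.8,   # Q4_K_M-style approximation
}

def estimate_vram_gb(params_billions: float, quant: str,
                     overhead: float = 1.2) -> float:
    """Estimate VRAM in GB: weight bytes (params * bits / 8) times overhead."""
    weight_gb = params_billions * BITS_PER_WEIGHT[quant] / 8
    return round(weight_gb * overhead, 1)

if __name__ == "__main__":
    # Example: a hypothetical 110B-parameter model at each quantization level.
    for quant in ("Q4", "Q5", "Q8", "FP16"):
        print(f"110B {quant}: ~{estimate_vram_gb(110, quant)} GB")
```

Under these assumptions, a 110B model at Q4 lands near 79 GB, in the same ballpark as the table's 68-78 GB range; full FP16 grows much faster (two bytes per parameter before overhead), which is why every FP16 row above routes to cloud-first hardware.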
We may earn a commission if you click links on this page.