15B-34B Models
This group contains 100 profiles. Use this hub page to compare each profile's practical VRAM floor, expected throughput, and best local-versus-cloud path.
| Model | Data source | VRAM min | VRAM optimal | Best local GPU | Cloud fallback | Detail |
|---|---|---|---|---|---|---|
| CodeLlama 34B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| CodeLlama 34B Q4 | Estimated | 16GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| CodeLlama 34B Q5 | Estimated | 20GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| CodeLlama 34B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| LLaVA 34B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| LLaVA 34B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| LLaVA 34B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| LLaVA 34B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| DeepSeek Coder 33B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| DeepSeek Coder 33B Q4 | Estimated | 16GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| DeepSeek Coder 33B Q5 | Estimated | 20GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| DeepSeek Coder 33B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| DeepSeek-R1 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| DeepSeek-R1 32B Q4 | Measured | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| DeepSeek-R1 32B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| DeepSeek-R1 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen 32B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen 32B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 32B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 32B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 Coder 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 Coder 32B Q4 | Estimated | 16GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Qwen2.5 Coder 32B Q5 | Estimated | 20GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Qwen2.5 Coder 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 VL 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 VL 32B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 VL 32B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 VL 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 32B Q4 | Measured | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 32B Q5 | Measured | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 VL 32B CLOUD | Estimated | 20GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 VL 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 VL 32B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 VL 32B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 VL 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| QwQ 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| QwQ 32B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| QwQ 32B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| QwQ 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Nemotron 3 Nano 30B FP16 | Measured | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Nemotron 3 Nano 30B Q4 | Measured | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Nemotron 3 Nano 30B Q5 | Measured | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Nemotron 3 Nano 30B Q8 | Measured | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 30B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 30B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 30B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 30B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 Coder 30B CLOUD | Estimated | 20GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 Coder 30B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 Coder 30B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 Coder 30B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 Coder 30B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 VL 30B CLOUD | Estimated | 20GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 VL 30B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 VL 30B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 VL 30B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 VL 30B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Gemma 2 27B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Gemma 2 27B Q4 | Estimated | 16GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Gemma 2 27B Q5 | Estimated | 20GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Gemma 2 27B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Gemma 3 27B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Gemma 3 27B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Gemma 3 27B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Gemma 3 27B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Translategemma 27B FP16 | Measured | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Translategemma 27B Q4 | Measured | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Translategemma 27B Q5 | Measured | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Translategemma 27B Q8 | Measured | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Magistral 24B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Magistral 24B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Magistral 24B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Magistral 24B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Mistral Small 24B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Mistral Small 24B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Mistral Small 24B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Mistral Small 24B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Mistral Small 22B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Mistral Small 22B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Mistral Small 22B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Mistral Small 22B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| GPT-OSS 20B CLOUD | Estimated | 20GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| GPT-OSS 20B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| GPT-OSS 20B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| GPT-OSS 20B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| GPT-OSS 20B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| DeepSeek Coder V2 16B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| DeepSeek Coder V2 16B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| DeepSeek Coder V2 16B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| DeepSeek Coder V2 16B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| StarCoder2 15B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| StarCoder2 15B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| StarCoder2 15B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| StarCoder2 15B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
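
To shortlist profiles for a particular card, the table rows can be checked against the VRAM you have available. The snippet below is a minimal sketch, not part of this page's tooling: the `Profile` type, the `parse_rows` and `classify` helpers, and the three tier labels are hypothetical names chosen for illustration, and the sample rows are copied from the table above. It assumes a profile is a comfortable fit when your VRAM meets the "VRAM optimal" figure, a tight fit when it only meets "VRAM min", and a cloud candidate otherwise.

```python
from typing import List, NamedTuple


class Profile(NamedTuple):
    """One table row (hypothetical structure, for illustration only)."""
    model: str
    data_source: str
    vram_min_gb: int
    vram_optimal_gb: int
    local_gpu: str
    cloud_fallback: str


# A few rows copied from the table above; paste in more as needed.
SAMPLE_ROWS = """\
| CodeLlama 34B Q4 | Estimated | 16GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| DeepSeek-R1 32B Q4 | Measured | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 32B Q5 | Measured | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Gemma 2 27B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| StarCoder2 15B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB | Open |
"""


def parse_gb(cell: str) -> int:
    """Turn a cell such as '24GB' into the integer 24."""
    text = cell.strip()
    if text.upper().endswith("GB"):
        text = text[:-2]
    return int(text)


def parse_rows(table_text: str) -> List[Profile]:
    """Parse pipe-delimited rows, skipping header and separator lines."""
    profiles = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) < 6:
            continue
        try:
            vram_min = parse_gb(cells[2])
            vram_optimal = parse_gb(cells[3])
        except ValueError:
            continue  # header or separator row, not a data row
        profiles.append(
            Profile(cells[0], cells[1], vram_min, vram_optimal, cells[4], cells[5])
        )
    return profiles


def classify(profile: Profile, available_vram_gb: int) -> str:
    """Assumed tiers: 'optimal', 'minimum', or 'cloud'."""
    if available_vram_gb >= profile.vram_optimal_gb:
        return "optimal"
    if available_vram_gb >= profile.vram_min_gb:
        return "minimum"
    return "cloud"


if __name__ == "__main__":
    # Example: what fits on a 24 GB card?
    for p in parse_rows(SAMPLE_ROWS):
        tier = classify(p, 24)
        fallback = f" (consider {p.cloud_fallback})" if tier == "cloud" else ""
        print(f"{p.model}: {tier}{fallback}")
```

Because most rows are marked Estimated, treat the resulting tiers as a starting point and confirm against the detail page for the profile you pick.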