9B-14B Models
72 profiles in this group. Use this hub page to compare each model's practical VRAM floor and optimal VRAM budget, the best local GPU for it, and its cloud fallback at each quantization level.
| Model | Data source | Min VRAM | Optimal VRAM | Best local GPU | Cloud fallback | Detail |
|---|---|---|---|---|---|---|
| DeepSeek-R1 14B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| DeepSeek-R1 14B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| DeepSeek-R1 14B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| DeepSeek-R1 14B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Ministral 3 14B FP16 | Measured | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Ministral 3 14B Q4 | Measured | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Ministral 3 14B Q5 | Measured | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Ministral 3 14B Q8 | Measured | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Phi-3 14B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Phi-3 14B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Phi-3 14B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Phi-3 14B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Phi-4 14B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Phi-4 14B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Phi-4 14B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Phi-4 14B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Phi-4 Reasoning 14B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Phi-4 Reasoning 14B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Phi-4 Reasoning 14B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Phi-4 Reasoning 14B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen 14B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen 14B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Qwen 14B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Qwen 14B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 14B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 14B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Qwen2.5 14B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Qwen2.5 14B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 Coder 14B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen2.5 Coder 14B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Qwen2.5 Coder 14B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Qwen2.5 Coder 14B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 14B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Qwen3 14B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Qwen3 14B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Qwen3 14B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| CodeLlama 13B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| CodeLlama 13B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| CodeLlama 13B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| CodeLlama 13B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Llama 2 13B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Llama 2 13B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Llama 2 13B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Llama 2 13B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| LLaVA 13B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| LLaVA 13B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| LLaVA 13B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| LLaVA 13B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| OLMo 2 13B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| OLMo 2 13B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| OLMo 2 13B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| OLMo 2 13B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Gemma 3 12B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Gemma 3 12B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Gemma 3 12B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Gemma 3 12B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Mistral Nemo 12B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Mistral Nemo 12B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Mistral Nemo 12B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Mistral Nemo 12B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Llama 3.2 Vision 11B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Llama 3.2 Vision 11B Q4 | Estimated | 12GB | 14GB | RTX 3090 24GB | A6000 48GB | Open |
| Llama 3.2 Vision 11B Q5 | Estimated | 14GB | 16GB | RTX 3090 24GB | A6000 48GB | Open |
| Llama 3.2 Vision 11B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Falcon 3 10B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Falcon 3 10B Q4 | Estimated | 10GB | 20GB | RTX 3090 24GB | A6000 48GB | Open |
| Falcon 3 10B Q5 | Estimated | 12GB | 22GB | RTX 3090 24GB | A6000 48GB | Open |
| Falcon 3 10B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Gemma 2 9B FP16 | Estimated | 22GB | 34GB | RTX 6000 Ada 48GB | A100 80GB | Open |
| Gemma 2 9B Q4 | Estimated | 10GB | 12GB | RTX 3090 24GB | A6000 48GB | Open |
| Gemma 2 9B Q5 | Estimated | 12GB | 14GB | RTX 3090 24GB | A6000 48GB | Open |
| Gemma 2 9B Q8 | Estimated | 16GB | 26GB | RTX 6000 Ada 48GB | A100 80GB | Open |
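For rows marked "Estimated", figures like these typically come from a first-order rule of thumb: weight memory (parameter count times bytes per parameter at the given quantization) plus a fixed allowance for the KV cache, activations, and runtime buffers. The sketch below illustrates that arithmetic; the bytes-per-parameter values and the overhead constant are illustrative assumptions, not measurements, so its outputs will not match every table row exactly.

```python
# Rough VRAM estimate for local LLM inference: quantized weights plus
# a fixed overhead for KV cache, activations, and runtime buffers.
# All constants are illustrative assumptions, not measured values.

BYTES_PER_PARAM = {"FP16": 2.0, "Q8": 1.0, "Q5": 0.625, "Q4": 0.5}

def estimate_vram_gb(params_billion: float, quant: str,
                     overhead_gb: float = 1.5) -> float:
    """First-order VRAM floor in GB for a model of the given size."""
    weights_gb = params_billion * BYTES_PER_PARAM[quant]
    return round(weights_gb + overhead_gb, 2)

if __name__ == "__main__":
    for quant in ("Q4", "Q5", "Q8", "FP16"):
        print(f"14B {quant}: ~{estimate_vram_gb(14, quant)} GB")
```

Longer contexts inflate the KV-cache term well beyond a fixed allowance, which is one reason the table's "optimal" column sits several GB above the minimum.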