15B-34B Models

This group contains 100 profiles. Use this hub page to compare each profile's practical VRAM floor, expected throughput, and best local-versus-cloud path.

| Model | Data source | VRAM min | VRAM optimal | Best local GPU | Cloud fallback |
| --- | --- | --- | --- | --- | --- |
| CodeLlama 34B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| CodeLlama 34B Q4 | Estimated | 16GB | 20GB | RTX 3090 24GB | A6000 48GB |
| CodeLlama 34B Q5 | Estimated | 20GB | 22GB | RTX 3090 24GB | A6000 48GB |
| CodeLlama 34B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| LLaVA 34B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| LLaVA 34B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| LLaVA 34B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| LLaVA 34B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| DeepSeek Coder 33B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| DeepSeek Coder 33B Q4 | Estimated | 16GB | 20GB | RTX 3090 24GB | A6000 48GB |
| DeepSeek Coder 33B Q5 | Estimated | 20GB | 22GB | RTX 3090 24GB | A6000 48GB |
| DeepSeek Coder 33B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| DeepSeek-R1 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| DeepSeek-R1 32B Q4 | Measured | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| DeepSeek-R1 32B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| DeepSeek-R1 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen 32B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen 32B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen2.5 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen2.5 32B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen2.5 32B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen2.5 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen2.5 Coder 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen2.5 Coder 32B Q4 | Estimated | 16GB | 20GB | RTX 3090 24GB | A6000 48GB |
| Qwen2.5 Coder 32B Q5 | Estimated | 20GB | 22GB | RTX 3090 24GB | A6000 48GB |
| Qwen2.5 Coder 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen2.5 VL 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen2.5 VL 32B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen2.5 VL 32B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen2.5 VL 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 32B Q4 | Measured | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 32B Q5 | Measured | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 VL 32B CLOUD | Estimated | 20GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 VL 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 VL 32B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 VL 32B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 VL 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| QwQ 32B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| QwQ 32B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| QwQ 32B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| QwQ 32B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Nemotron 3 Nano 30B FP16 | Measured | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Nemotron 3 Nano 30B Q4 | Measured | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Nemotron 3 Nano 30B Q5 | Measured | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Nemotron 3 Nano 30B Q8 | Measured | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 30B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 30B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 30B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 30B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 Coder 30B CLOUD | Estimated | 20GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 Coder 30B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 Coder 30B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 Coder 30B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 Coder 30B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 VL 30B CLOUD | Estimated | 20GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 VL 30B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 VL 30B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 VL 30B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Qwen3 VL 30B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Gemma 2 27B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Gemma 2 27B Q4 | Estimated | 16GB | 20GB | RTX 3090 24GB | A6000 48GB |
| Gemma 2 27B Q5 | Estimated | 20GB | 22GB | RTX 3090 24GB | A6000 48GB |
| Gemma 2 27B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Gemma 3 27B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Gemma 3 27B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Gemma 3 27B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Gemma 3 27B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Translategemma 27B FP16 | Measured | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Translategemma 27B Q4 | Measured | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Translategemma 27B Q5 | Measured | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Translategemma 27B Q8 | Measured | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Magistral 24B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Magistral 24B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Magistral 24B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Magistral 24B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Mistral Small 24B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Mistral Small 24B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Mistral Small 24B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Mistral Small 24B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| Mistral Small 22B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| Mistral Small 22B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| Mistral Small 22B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| Mistral Small 22B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| GPT-OSS 20B CLOUD | Estimated | 20GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| GPT-OSS 20B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| GPT-OSS 20B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| GPT-OSS 20B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| GPT-OSS 20B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| DeepSeek Coder V2 16B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| DeepSeek Coder V2 16B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| DeepSeek Coder V2 16B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| DeepSeek Coder V2 16B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
| StarCoder2 15B FP16 | Estimated | 30GB | 42GB | RTX 6000 Ada 48GB | A100 80GB |
| StarCoder2 15B Q4 | Estimated | 18GB | 28GB | RTX 6000 Ada 48GB | A100 80GB |
| StarCoder2 15B Q5 | Estimated | 20GB | 30GB | RTX 6000 Ada 48GB | A100 80GB |
| StarCoder2 15B Q8 | Estimated | 24GB | 34GB | RTX 6000 Ada 48GB | A100 80GB |
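
One way to read the listing above against a specific card is to compare the card's VRAM with each profile's "VRAM min" and "VRAM optimal" figures. The sketch below is a minimal illustration under stated assumptions, not this site's calculator: the `Profile` class, the `classify` and `plan` helpers, and the three verdict labels ("comfortable", "tight", "cloud") are hypothetical names introduced here, and only a handful of rows are copied in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """One row of the listing (hypothetical structure for illustration)."""
    model: str
    vram_min_gb: int
    vram_optimal_gb: int


# A small subset of rows copied from the listing, not all 100 profiles.
PROFILES = [
    Profile("CodeLlama 34B Q4", 16, 20),
    Profile("CodeLlama 34B Q8", 24, 34),
    Profile("Qwen3 32B Q4", 18, 28),
    Profile("Gemma 2 27B Q5", 20, 22),
    Profile("GPT-OSS 20B FP16", 30, 42),
]


def classify(profile, gpu_vram_gb):
    """Label a profile for a GPU with the given VRAM.

    Assumed reading of the two columns: at or above "VRAM optimal" the
    profile runs comfortably; between "VRAM min" and "VRAM optimal" it
    fits but is tight; below "VRAM min" the cloud fallback is the path.
    """
    if gpu_vram_gb >= profile.vram_optimal_gb:
        return "comfortable"
    if gpu_vram_gb >= profile.vram_min_gb:
        return "tight"
    return "cloud"


def plan(profiles, gpu_vram_gb):
    """Map each model name to its verdict for one GPU size."""
    return {p.model: classify(p, gpu_vram_gb) for p in profiles}


if __name__ == "__main__":
    # Example: a 24GB card such as the RTX 3090 named in the listing.
    for model, verdict in plan(PROFILES, 24).items():
        print(f"{model}: {verdict}")
```

With a 24GB budget, this subset splits into profiles that clear their optimal figure, profiles that only clear their minimum, and profiles that fall back to the cloud column. The thresholds are only as reliable as the underlying rows, most of which are marked Estimated rather than Measured.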
