150 coding profiles in catalog
This guide prioritizes practical coding models you can run today, each with clear VRAM requirements. The ranking puts measured profiles first, then models that fit stably on a local GPU with 24GB of VRAM or less.
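The ranking rule above can be sketched as a sort key. This is a minimal illustration, not the catalog's actual implementation: the `Profile` class, the `rank_key` function, and the tie-break on optimal VRAM are assumptions introduced here. A `measured` value of `None` means the guide does not state the data source for that row, and the sketch treats it as not measured.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Threshold taken from the guide's 24GB local-fit cutoff.
LOCAL_VRAM_LIMIT_GB = 24


@dataclass(frozen=True)
class Profile:
    """Hypothetical record for one catalog row."""
    name: str
    vram_min_gb: int
    vram_optimal_gb: int
    measured: Optional[bool]  # None = data source not stated in the guide


def rank_key(p: Profile) -> Tuple[bool, bool, int]:
    """Sort key: measured first, then local fit, then lower optimal VRAM.

    The final tie-break on optimal VRAM is an assumption of this sketch.
    """
    fits_local = p.vram_optimal_gb <= LOCAL_VRAM_LIMIT_GB
    # False sorts before True, so both flags are negated.
    return (p.measured is not True, not fits_local, p.vram_optimal_gb)


def rank_profiles(profiles: List[Profile]) -> List[Profile]:
    """Return profiles in ranked order; sorted() is stable, so ties keep input order."""
    return sorted(profiles, key=rank_key)


# A few rows copied from the tables in this guide.
SAMPLE = [
    Profile("CodeLlama 7B Q4", 8, 10, False),
    Profile("Qwen2.5 14B Q8", 16, 26, None),
    Profile("Qwen2.5 Coder 32B Q4", 16, 20, True),
    Profile("Qwen3 8B Q4", 6, 16, True),
]

if __name__ == "__main__":
    for p in rank_profiles(SAMPLE):
        print(p.name)
```

Expressing the rule as a tuple key keeps each criterion separate, so a different tie-break (for example tokens per second) could be swapped in without touching the other two.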
Local-first picks (<=24GB optimal)

| Model | VRAM min/optimal | RTX 3090 (tok/s) | Data source |
|---|---|---|---|
| Qwen3 8B Q4 | 6GB / 16GB | 30 | Measured |
| Qwen3 8B Q5 | 8GB / 18GB | 27 | Measured |
| Qwen2.5 Coder 32B Q4 | 16GB / 20GB | 11 | Measured |
| Qwen2.5 14B Q4 | 10GB / 20GB | 21 | Measured |
| Qwen2.5 Coder 32B Q5 | 20GB / 22GB | 9.9 | Measured |
| Qwen2.5 14B Q5 | 12GB / 22GB | 18.9 | Measured |
| Qwen3 8B Q8 | 12GB / 22GB | 21.6 | Measured |
| CodeLlama 7B Q4 | 8GB / 10GB | 30 | Estimated |
| Qwen2.5 0.5B Q4 | 2GB / 10GB | 48 | Estimated |
| Qwen2.5 Coder 0.5B Q4 | 2GB / 10GB | 48 | Estimated |
| CodeLlama 7B Q5 | 10GB / 12GB | 27 | Estimated |
| CodeGemma 2B Q4 | 2GB / 12GB | 42 | Estimated |

Heavy picks (cloud-first)

| Model | VRAM min/optimal | Cloud fallback |
|---|---|---|
| Qwen2.5 14B Q8 | 16GB / 26GB | A100 80GB |
| Qwen3 Coder 30B Q4 | 18GB / 28GB | A100 80GB |
| Qwen3 Coder 30B CLOUD | 20GB / 28GB | A100 80GB |
| Qwen3 Coder 30B Q5 | 20GB / 30GB | A100 80GB |
| Qwen3 8B FP16 | 18GB / 30GB | A100 80GB |
| Qwen2.5 Coder 32B Q8 | 24GB / 34GB | A100 80GB |
| Qwen3 Coder 30B Q8 | 24GB / 34GB | A100 80GB |
| Qwen2.5 14B FP16 | 22GB / 34GB | A100 80GB |
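
One way to read the "VRAM min/optimal" column against a specific GPU is sketched below. The helper names `parse_vram` and `fit_tier` are hypothetical, and so is the interpretation: the sketch assumes "min" is the least VRAM at which a model loads at all and "optimal" is the amount needed for comfortable use. The guide does not define either term precisely.

```python
import re
from typing import Tuple

# Matches cells such as "16GB / 20GB" as they appear in the tables.
_VRAM_CELL = re.compile(r"(\d+)\s*GB\s*/\s*(\d+)\s*GB")


def parse_vram(cell: str) -> Tuple[int, int]:
    """Parse a 'min/optimal' table cell into (min_gb, optimal_gb)."""
    match = _VRAM_CELL.search(cell)
    if match is None:
        raise ValueError(f"unrecognized VRAM cell: {cell!r}")
    return int(match.group(1)), int(match.group(2))


def fit_tier(available_gb: int, cell: str) -> str:
    """Classify how a model fits on a GPU with `available_gb` of VRAM.

    Returns "optimal", "minimum" (loads, but below the optimal figure),
    or "cloud" (below the minimum, so use the cloud fallback instead).
    """
    min_gb, optimal_gb = parse_vram(cell)
    if available_gb >= optimal_gb:
        return "optimal"
    if available_gb >= min_gb:
        return "minimum"
    return "cloud"


if __name__ == "__main__":
    # A 24GB card against three rows from the tables.
    for name, cell in [
        ("Qwen2.5 Coder 32B Q4", "16GB / 20GB"),
        ("Qwen3 Coder 30B Q4", "18GB / 28GB"),
        ("Qwen2.5 Coder 32B Q8", "24GB / 34GB"),
    ]:
        print(name, "->", fit_tier(24, cell))
```

Under these assumptions, every heavy pick still meets its minimum on a 24GB card but none reaches its optimal figure, which is consistent with the guide listing them as cloud-first.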