8GB VRAM Models for Ollama

8 GB of VRAM is the entry tier for running lightweight chat and coding models in Ollama.

Recommended models

Model             Fit    Expected tok/s
Llama 3.1 8B Q4   good   20-35
Qwen 2.5 7B Q4    good   22-38
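
The tok/s ranges above are estimates, so it can help to measure the figure on your own card. The sketch below is one possible way to do that, assuming an Ollama server is running on its default local port (11434) and the model has already been pulled. It relies on the eval_count and eval_duration fields that Ollama's /api/generate endpoint returns for non-streaming requests. The helper names tokens_per_second and measure, the prompt, and the model tag in the usage comment are illustrative choices, not part of the original page.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(response: dict) -> float:
    """Compute generation speed from a non-streaming /api/generate response."""
    # eval_count is the number of generated tokens;
    # eval_duration is the time spent generating them, in nanoseconds.
    return response["eval_count"] / response["eval_duration"] * 1e9


def measure(model: str, prompt: str = "Explain what a mutex is in two sentences.") -> float:
    """Run one non-streaming generation and return its tokens per second."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return tokens_per_second(json.load(reply))


# Example (needs a running Ollama server with the model pulled; the tag is an assumption):
# print(f"{measure('llama3.1:8b'):.1f} tok/s")
```

A single short prompt gives a noisy number, so averaging a few runs is more representative. Also check which quantization the tag you pulled actually uses before comparing the result with the Q4 rows in the table.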