8GB VRAM
Entry tier for lightweight chat and coding models.
View recommended models
Balanced tier for 7B/13B with practical local runs.
View recommended models
Comfort zone for 13B/14B and selected 32B Q4 setups.
View recommended models
High-value tier for large quantized models and long context tests.
View recommended models
Unified memory planning guide for local LLM inference on M-series devices.
Open Apple Silicon guide
Measured model tags and sustained-load benchmark snapshots on 3090.
Open verification page