# Verified on RTX 3090
This page tracks model tags with measured local runs on an RTX 3090. Use it when you want evidence-backed local performance numbers rather than catalog baseline estimates.
## Measured model tags
| Model | Tag | Tokens/s | Latency (ms) | Test time (UTC) |
|---|---|---:|---:|---|
| Qwen3 Coder 30B CLOUD | qwen3-coder:30b | 153.4 | 961 | 2026-04-01T11:53:50Z |
| Qwen3 8B FP16 | qwen3:8b | 125.7 | 1554 | 2026-04-01T11:53:50Z |
| Ministral 3 14B FP16 | ministral-3:14b | 82.7 | 2390 | 2026-04-01T11:53:50Z |
| DeepSeek-R1 14B FP16 | deepseek-r1:14b | 80.2 | 2027 | 2026-04-01T11:53:50Z |
| Qwen2.5 14B FP16 | qwen2.5:14b | 77.7 | 1072 | 2026-04-01T11:53:50Z |
| Nemotron 3 Nano 30B FP16 | nemotron-3-nano:30b | 57.0 | 2468 | 2026-04-01T11:53:50Z |
| Translategemma 27B FP16 | translategemma:27b | 41.3 | 3142 | 2026-04-01T11:53:50Z |
| Qwen2.5 Coder 32B FP16 | qwen2.5-coder:32b | 37.9 | 1521 | 2026-04-01T11:53:50Z |
| Qwen3.5 35B FP16 | qwen3.5:35b | 35.1 | 3585 | 2026-03-15T12:17:40Z |
| GPT-OSS 20B CLOUD | gpt-oss:20b | 29.6 | 4118 | 2026-04-01T11:53:50Z |
| Mistral Small 22B FP16 | mistral-small:22b | 16.7 | 5609 | 2026-04-01T11:53:50Z |
| GLM 4.7 Flash 7B FP16 | glm-4.7-flash:bf16 | 11.2 | 9291 | 2026-03-04T09:01:38Z |
| Gemma 3 27B FP16 | gemma3:27b | 9.1 | 11605 | 2026-04-01T11:53:50Z |
| Llama 4 16x17B FP16 | llama4:16x17b | 7.6 | 9383 | 2026-04-01T11:53:50Z |
| QwQ 32B FP16 | qwq:32b | 6.6 | 15250 | 2026-04-01T11:53:50Z |
| Qwen3.5 122B FP16 | qwen3.5:122b | 4.9 | 11915 | 2026-02-26T19:19:16Z |
| Llama 3.3 70B FP16 | llama3.3:70b | 3.8 | 14959 | 2026-03-11T04:17:51Z |
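The Tokens/s and Latency columns above can be derived from the timing fields a local runner reports per request. A minimal sketch, assuming an Ollama-style response format (`eval_count` tokens generated, with `eval_duration`, `prompt_eval_duration`, and `load_duration` in nanoseconds); the sample values are illustrative, not one of the snapshots in the table:

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation throughput: tokens emitted per second of eval time."""
    return eval_count / (eval_duration_ns / 1e9)

def first_token_latency_ms(prompt_eval_duration_ns: int,
                           load_duration_ns: int = 0) -> float:
    """Approximate time-to-first-token: model load plus prompt evaluation."""
    return (load_duration_ns + prompt_eval_duration_ns) / 1e6

# Illustrative response fragment (shape only, values made up):
resp = {
    "eval_count": 256,
    "eval_duration": 2_000_000_000,       # 2 s spent generating
    "prompt_eval_duration": 900_000_000,  # 0.9 s evaluating the prompt
    "load_duration": 100_000_000,         # 0.1 s loading the model
}

print(round(tokens_per_second(resp["eval_count"], resp["eval_duration"]), 1))
# → 128.0 tokens/s
print(round(first_token_latency_ms(resp["prompt_eval_duration"],
                                   resp["load_duration"]), 1))
# → 1000.0 ms
```

Each measured row in the table is one such snapshot; averaging several runs per tag would smooth out load-time and thermal variance.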
## Validation notes
- Measured rows come from actual benchmark snapshots, not template placeholders.
- Catalog baseline values stay visible on model pages for comparison.
- For heavy profiles, use the cloud fallback links once local throughput saturates.