Embedding Models

10 profiles in this group. Use this hub page to compare the practical VRAM floor, expected throughput, and the best local-versus-cloud deployment path for each model.

All VRAM figures are estimates.

| Model | Params | Precision | VRAM min (est.) | VRAM optimal | Best local GPU | Cloud fallback |
|---|---|---|---|---|---|---|
| BGE-M3 | 567M | FP16 | 4GB | 12GB | RTX 3090 24GB | A6000 48GB |
| MXBAI Embed Large | 335M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| Snowflake Arctic Embed | 335M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| Nomic Embed Text | 137M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| Snowflake Arctic Embed | 137M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| Snowflake Arctic Embed | 110M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| All-MiniLM | 33M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| Snowflake Arctic Embed | 33M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| All-MiniLM | 22M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| Snowflake Arctic Embed | 22M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
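The VRAM floors above follow from a simple back-of-the-envelope rule: FP16 weights cost 2 bytes per parameter, plus some fixed allowance for activations and runtime buffers. Here is a minimal sketch of that estimate; the 1 GB overhead figure is an assumption for illustration, and the table's published floors add further headroom on top of it.

```python
def estimate_vram_gb(params_millions: float,
                     bytes_per_param: int = 2,
                     overhead_gb: float = 1.0) -> float:
    """Rough VRAM estimate for loading an embedding model.

    FP16 stores 2 bytes per parameter; overhead_gb is an assumed
    allowance for activations, batch buffers, and the runtime itself.
    """
    weights_gb = params_millions * 1e6 * bytes_per_param / 1024**3
    return weights_gb + overhead_gb

# e.g. BGE-M3 (567M params, FP16): roughly 2 GB before headroom
print(round(estimate_vram_gb(567), 2))
```

Smaller models (22M-335M) all land well under 2 GB by this rule, which is why they share the same 2 GB floor in the table: the fixed overhead, not the weights, dominates at that scale.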

We may earn a commission if you click links on this page.