Embedding Models

10 profiles in this group. Use this hub page to compare the practical VRAM floor, expected throughput, and the best local-versus-cloud deployment path for each model.

All VRAM figures are estimates.

| Model | Params | Precision | VRAM min (est.) | VRAM optimal | Best local GPU | Cloud fallback |
|---|---|---|---|---|---|---|
| BGE-M3 | 567M | FP16 | 4GB | 12GB | RTX 3090 24GB | A6000 48GB |
| MXBAI Embed Large | 335M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| Snowflake Arctic Embed | 335M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| Nomic Embed Text | 137M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| Snowflake Arctic Embed | 137M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| Snowflake Arctic Embed | 110M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| All-MiniLM | 33M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| Snowflake Arctic Embed | 33M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| All-MiniLM | 22M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
| Snowflake Arctic Embed | 22M | FP16 | 2GB | 10GB | RTX 3090 24GB | A6000 48GB |
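The VRAM floors above follow from a simple back-of-the-envelope rule: FP16 weights cost 2 bytes per parameter, plus some fixed allowance for activations and runtime buffers. Here is a minimal sketch of that estimate; the 1 GB overhead figure is an assumption for illustration, and the table's published floors add further headroom on top of it.

```python
def estimate_vram_gb(params_millions: float,
                     bytes_per_param: int = 2,
                     overhead_gb: float = 1.0) -> float:
    """Rough VRAM estimate for loading an embedding model.

    FP16 stores 2 bytes per parameter; overhead_gb is an assumed
    allowance for activations, batch buffers, and the runtime itself.
    """
    weights_gb = params_millions * 1e6 * bytes_per_param / 1024**3
    return weights_gb + overhead_gb

# e.g. BGE-M3 (567M params, FP16): roughly 2 GB before headroom
print(round(estimate_vram_gb(567), 2))
```

Smaller models (22M-335M) all land well under 2 GB by this rule, which is why they share the same 2 GB floor in the table: the fixed overhead, not the weights, dominates at that scale.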

We may earn a commission if you click links on this page.