voyage-multimodal-3.5 Embedding Model
by MongoDB, Inc.
Multimodal embedding model that can vectorize interleaved text, images, and video. 32K-token context length.
Multimodal embedding models are neural networks that transform multiple modalities, such as text and images, into numerical vectors. They are a crucial building block for semantic search/retrieval systems and retrieval-augmented generation (RAG), and they largely determine retrieval quality.
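For intuition, retrieval over these vectors typically reduces to nearest-neighbor search. The following minimal sketch ranks documents by cosine similarity against a query; the embeddings are random placeholders standing in for model output:

```python
import numpy as np

# Toy illustration: rank documents by cosine similarity to a query.
# The vectors here are random stand-ins; in practice they would come
# from an embedding model such as voyage-multimodal-3.5.
rng = np.random.default_rng(0)
doc_embeddings = rng.normal(size=(5, 1024))   # 5 documents, 1024-dim vectors
query_embedding = rng.normal(size=(1024,))

# Normalize so the dot product equals cosine similarity.
doc_embeddings /= np.linalg.norm(doc_embeddings, axis=1, keepdims=True)
query_embedding /= np.linalg.norm(query_embedding)

scores = doc_embeddings @ query_embedding
ranking = np.argsort(scores)[::-1]            # best match first
print(ranking, scores[ranking])
```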
voyage-multimodal-3.5 is a state-of-the-art multimodal embedding model capable of vectorizing not only text, images, and video individually, but also content that interleaves all three modalities. It delivers excellent performance for mixed-modality searches involving text and visual content such as PDF screenshots, figures, tables, videos, and more. Enabled by Matryoshka learning and quantization-aware training, voyage-multimodal-3.5 supports embeddings in 2048, 1024, 512, and 256 dimensions, with multiple quantization options. Learn more about voyage-multimodal-3.5 here.
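As a sketch of how this might look in practice, the example below calls the model through the voyageai Python client. The `multimodal_embed` call mirrors the documented client interface for earlier voyage-multimodal models; the commented-out `output_dimension` argument is an assumption based on the Matryoshka dimensions listed above, and the file name is a hypothetical placeholder, so verify both against the current API reference:

```python
import voyageai
from PIL import Image

# Assumes the VOYAGE_API_KEY environment variable is set.
vo = voyageai.Client()

# Each input is a list that interleaves text strings and PIL images.
inputs = [
    ["A figure summarizing quarterly revenue:", Image.open("revenue_chart.png")],
    ["Plain-text document with no accompanying image."],
]

result = vo.multimodal_embed(
    inputs=inputs,
    model="voyage-multimodal-3.5",
    input_type="document",        # use "query" when embedding search queries
    # output_dimension=512,       # assumed Matryoshka option (2048/1024/512/256);
                                  # confirm against the current API docs
)
print(len(result.embeddings), len(result.embeddings[0]))
```

Video inputs are supported by the model as well, but the exact client-side format for passing video is not shown here; consult the API documentation for that path.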