
voyage-3.5 on Azure AI Foundry

by MongoDB, Inc.

Embedding model for general-purpose (incl. multilingual) retrieval/search and AI. 32K context length.

Text embedding model optimized for general-purpose (including multilingual) retrieval/search and AI applications. 32K context length. Throughput varies significantly with workload characteristics such as GPU type, model size, sequence length, batch size, and vector dimensionality; we typically see ~75k–150k tokens/sec for this model on A100 GPUs. We recommend that customers benchmark their own throughput and token volume during testing to inform token TCO estimates.
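The recommended benchmarking step can feed a simple back-of-envelope estimate. A minimal sketch, assuming a measured throughput at the low end of the ~75k tokens/sec range and a placeholder price of $0.06 per 1M tokens (an assumption for illustration, not a published rate):

```python
# Back-of-envelope token TCO estimate from a benchmarked throughput.
# All inputs below are illustrative assumptions, not quoted prices.

def embedding_tco(total_tokens: int, tokens_per_sec: float,
                  usd_per_million_tokens: float) -> tuple[float, float]:
    """Return (processing_hours, token_cost_usd) for an embedding workload."""
    hours = total_tokens / tokens_per_sec / 3600
    cost = total_tokens / 1_000_000 * usd_per_million_tokens
    return hours, cost

# Example: embed a 10B-token corpus at 75k tokens/sec with a
# hypothetical $0.06 per 1M tokens price.
hours, cost = embedding_tco(10_000_000_000, 75_000, 0.06)
print(f"{hours:.1f} h, ${cost:,.0f}")
```

Substituting your own measured tokens/sec and contracted price turns this into a workload-specific estimate.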

voyage-3.5:
  • Outperforms OpenAI-v3-large, voyage-3, and Cohere-v4 by an average of 8.26%, 2.66%, and 1.63%, respectively, across domains
  • Supports embeddings of 2048, 1024, 512, and 256 dimensions
  • Offers multiple quantization formats including float, int8, uint8, and binary variants
  • Maintains a 32K-token context length at the same price point as voyage-3
  • Reduces vector database costs by up to 83% (int8, 2048) or 99% (binary, 1024) compared to OpenAI-v3-large
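The storage-cost figures in the last bullet follow directly from per-vector size. A minimal sketch, assuming the OpenAI-v3-large baseline stores 3072 float32 dimensions (4 bytes each):

```python
# Per-vector storage math behind the "up to 83% / 99%" cost-reduction claims.
# Assumption: baseline = OpenAI-v3-large at 3072 float32 dimensions.

def vector_bytes(dims: int, bytes_per_dim: float) -> float:
    """Storage for one vector: dimensions x bytes per dimension."""
    return dims * bytes_per_dim

baseline = vector_bytes(3072, 4)          # float32 baseline: 12,288 bytes
int8_2048 = vector_bytes(2048, 1)         # int8 at 2048 dims: 2,048 bytes
binary_1024 = vector_bytes(1024, 1 / 8)   # binary at 1024 dims: 128 bytes

for name, size in [("int8/2048", int8_2048), ("binary/1024", binary_1024)]:
    saving = 1 - size / baseline
    print(f"{name}: {size:.0f} B per vector, {saving:.0%} smaller than baseline")
```

The int8/2048 variant works out to roughly 83% less storage and binary/1024 to roughly 99% less, matching the figures above (index overhead in a real vector database will vary).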
