E5 Text EmbeddingsMicrosoft
|
voyage-code-3Voyage AI
|
|||||
Related Products
|
||||||
About
E5 Text Embeddings, developed by Microsoft, are advanced models designed to convert textual data into meaningful vector representations, enhancing tasks like semantic search and information retrieval. These models are trained using weakly-supervised contrastive learning on a vast dataset of over one billion text pairs, enabling them to capture intricate semantic relationships across multiple languages. The E5 family includes models of varying sizes—small, base, and large—offering a balance between computational efficiency and embedding quality. Additionally, multilingual versions of these models have been fine-tuned to support diverse languages, ensuring broad applicability in global contexts. Comprehensive evaluations demonstrate that E5 models achieve performance on par with state-of-the-art, English-only models of similar sizes.
|
About
Voyage AI introduces voyage-code-3, a next-generation embedding model optimized for code retrieval. It outperforms OpenAI-v3-large and CodeSage-large by an average of 13.80% and 16.81% on a suite of 32 code retrieval datasets, respectively. It supports embeddings of 2048, 1024, 512, and 256 dimensions and offers multiple embedding quantization options, including float (32-bit), int8 (8-bit signed integer), uint8 (8-bit unsigned integer), binary (bit-packed int8), and ubinary (bit-packed uint8). With a 32 K-token context length, it surpasses OpenAI's 8K and CodeSage Large's 1K context lengths. Voyage-code-3 employs Matryoshka learning to create embeddings with a nested family of various lengths within a single vector. This allows users to vectorize documents into a 2048-dimensional vector and later use shorter versions (e.g., 256, 512, or 1024 dimensions) without re-invoking the embedding model.
|
|||||
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
Platforms Supported
Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook
|
|||||
Audience
E5 Text Embeddings are designed for AI researchers, machine learning engineers, and developers seeking high-quality text representations for applications like semantic search, information retrieval, and multilingual NLP tasks
|
Audience
AI researchers and developers in search of a solution providing an embedding model for code retrieval
|
|||||
Support
Phone Support
24/7 Live Support
Online
|
Support
Phone Support
24/7 Live Support
Online
|
|||||
API
Offers API
|
API
Offers API
|
|||||
Screenshots and VideosNo images available
|
Screenshots and Videos |
|||||
Pricing
Free
Free Version
Free Trial
|
Pricing
No information available.
Free Version
Free Trial
|
|||||
Reviews/
|
Reviews/
|
|||||
Training
Documentation
Webinars
Live Online
In Person
|
Training
Documentation
Webinars
Live Online
In Person
|
|||||
Company InformationMicrosoft
Founded: 1975
United States
github.com/microsoft/unilm/tree/master/e5
|
Company InformationVoyage AI
Founded: 2023
United States
blog.voyageai.com/2024/12/04/voyage-code-3/
|
|||||
Alternatives |
Alternatives |
|||||
|
|
|
|||||
|
|
|
|||||
|
|
|
|||||
|
|
|
|||||
Categories |
Categories |
|||||
Integrations
Elasticsearch
Milvus
Qdrant
Vespa
Weaviate
|
||||||
|
|
|