all-MiniLM-L6-v2 is a lightweight sentence-transformers model that maps sentences and short paragraphs to 384-dimensional dense vectors for semantic tasks such as clustering, semantic search, and similarity comparison. Designed for high-speed, low-resource environments, it is based on the MiniLM architecture and was fine-tuned with a contrastive learning objective on over 1 billion sentence pairs drawn from diverse sources including Reddit, WikiAnswers, and Stack Exchange. Relative to its size, it outperforms many larger models in embedding quality, and it is distributed in PyTorch, TensorFlow, ONNX, and other formats. Input longer than 256 word pieces is truncated, so the model is best suited to short-form text. Released under the Apache 2.0 license, it is widely adopted in academic and commercial applications for its balance of performance and efficiency. all-MiniLM-L6-v2 can be used through the sentence-transformers library or directly via Hugging Face Transformers with custom pooling.
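With the sentence-transformers library installed (`pip install sentence-transformers`), usage follows the standard encode-and-compare pattern. A minimal sketch (the example sentences are illustrative):

```python
from sentence_transformers import SentenceTransformer, util

# Load the model from the Hugging Face Hub (downloads on first use)
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

sentences = [
    "The weather is lovely today.",
    "It's sunny outside right now.",
    "He drove the truck to the depot.",
]

# Encode to 384-dimensional embeddings, one row per sentence
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)

# Cosine similarity between all sentence pairs
scores = util.cos_sim(embeddings, embeddings)
print(scores)
```

The first two sentences should score noticeably higher against each other than against the third, which is the basic signal used for semantic search and clustering.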
Features
- Maps sentences to 384-dimensional embeddings
- Trained on over 1 billion sentence pairs using contrastive loss
- Optimized for semantic similarity, clustering, and retrieval
- Extremely lightweight with only 22.7M parameters
- Supports multiple frameworks (PyTorch, TensorFlow, ONNX, Rust, OpenVINO)
- Compatible with Hugging Face Transformers (with custom pooling; see the snippet after this list) and sentence-transformers
- Truncates input beyond 256 word pieces for efficient processing
- Open-source under Apache 2.0, widely used in research and production
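When calling the model directly through Hugging Face Transformers, the output is one embedding per token, so a pooling step is needed to get a single sentence vector. A minimal sketch of the usual recipe for this checkpoint, attention-mask-weighted mean pooling followed by L2 normalization:

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    """Average token embeddings, ignoring padding positions via the attention mask."""
    token_embeddings = model_output[0]  # first element: last hidden state
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

sentences = ["This is an example sentence.", "Each sentence is converted."]

# Pad to a common length and truncate at the 256-word-piece limit
encoded = tokenizer(sentences, padding=True, truncation=True,
                    max_length=256, return_tensors="pt")

with torch.no_grad():
    model_output = model(**encoded)

embeddings = mean_pooling(model_output, encoded["attention_mask"])
embeddings = F.normalize(embeddings, p=2, dim=1)  # unit length
print(embeddings.shape)  # torch.Size([2, 384])
```

Normalizing the vectors to unit length makes plain dot products equivalent to cosine similarity, which simplifies downstream retrieval code.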