Kimi K2.5 is Moonshot AI’s open-source, natively multimodal agentic model, built through continual pretraining on approximately 15 trillion mixed vision and text tokens. Built on a 1T-parameter Mixture-of-Experts (MoE) architecture with 32B activated parameters, it combines advanced language reasoning with strong visual understanding. K2.5 supports both “Thinking” and “Instant” modes, offering either deep step-by-step reasoning or low-latency responses depending on the task. Designed for agentic workflows, it features an Agent Swarm mechanism that decomposes complex problems into subtasks executed in parallel by coordinated sub-agents. With a 256K-token context window and the MoonViT vision encoder, the model performs strongly across reasoning, coding, long-context, image, and video benchmarks. Kimi K2.5 is available through Moonshot’s OpenAI- and Anthropic-compatible API and can be deployed with vLLM, SGLang, and KTransformers.
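Because the API is OpenAI-compatible, the standard `openai` Python client can be pointed at Moonshot’s base URL. The sketch below is illustrative only: the model identifier `kimi-k2.5` and the `thinking` request flag are assumptions, so consult Moonshot’s API reference for the exact model ids and how mode selection is exposed.

```python
# Minimal sketch: calling Kimi K2.5 through Moonshot's OpenAI-compatible API.
# The model id "kimi-k2.5" and the "thinking" flag are illustrative
# assumptions; check the official API reference for the exact names.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MOONSHOT_API_KEY",         # issued by the Moonshot platform
    base_url="https://api.moonshot.ai/v1",   # OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="kimi-k2.5",  # hypothetical model id
    messages=[
        {
            "role": "user",
            "content": [
                # Mixed text + image input, using the standard OpenAI
                # multimodal content-part format (assumed supported here).
                {"type": "text", "text": "Summarize what this screenshot shows."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/screenshot.png"},
                },
            ],
        }
    ],
    # Hypothetical flag toggling Thinking vs Instant mode; passed through
    # extra_body since it is not a standard OpenAI parameter.
    extra_body={"thinking": True},
)
print(response.choices[0].message.content)
```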
## Features
- 🧠 Native Multimodality: Seamlessly processes text, images, and video with strong cross-modal reasoning and visual grounding.
- ⚡ Thinking & Instant Modes: Switch between deep reasoning mode and low-latency response mode for flexible performance.
- 🐝 Agent Swarm Architecture: Decomposes complex tasks into subtasks handled in parallel by coordinated, domain-specific sub-agents.
- 💻 Vision-Enhanced Coding: Generates code from UI designs, visual specs, and video workflows with tool orchestration support.
- 📏 256K Long Context Window: Handles extensive documents and multi-step reasoning across large inputs.
- 🚀 Flexible Deployment & Quantization: Supports native INT4 quantization and runs on vLLM, SGLang, or KTransformers with OpenAI-compatible APIs (see the deployment sketch after this list).
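As a rough illustration of local deployment, here is a minimal sketch using vLLM’s offline Python API. The Hugging Face repo id `moonshotai/Kimi-K2.5`, the parallelism setting, and the sampling values are assumptions for illustration; a 1T-parameter MoE requires multi-GPU tensor parallelism sized to your hardware, so consult the official deployment guide for the exact checkpoint names, quantized variants, and recommended launch flags.

```python
# Minimal sketch: running the open weights locally with vLLM.
# The repo id "moonshotai/Kimi-K2.5" is a hypothetical placeholder;
# adjust tensor_parallel_size to the GPUs actually available.
from vllm import LLM, SamplingParams

llm = LLM(
    model="moonshotai/Kimi-K2.5",  # hypothetical Hugging Face repo id
    trust_remote_code=True,        # custom MoE / MoonViT code ships with the repo
    tensor_parallel_size=8,        # shard the model across 8 GPUs (adjust)
)

params = SamplingParams(temperature=0.6, max_tokens=512)
outputs = llm.generate(
    ["Explain the Agent Swarm mechanism in one paragraph."],
    params,
)
print(outputs[0].outputs[0].text)
```

For serving instead of offline inference, the same model argument can be passed to vLLM’s OpenAI-compatible server, after which the API example above works against your local endpoint.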