CSM-1B (Conversational Speech Model) is a text-to-speech model developed by Sesame that generates natural-sounding audio from text and audio prompts. It pairs a Llama backbone with a smaller audio decoder and produces residual vector quantization (RVQ) audio codes in the Mimi codec format, which are then decoded to a waveform. It supports both single-sentence generation and full conversational modeling, where earlier turns are supplied as paired text and audio context. The released checkpoint is a base model: it has not been fine-tuned on any specific voice, but it can produce a wide range of synthetic speaker identities. It runs natively in Hugging Face Transformers (v4.52.1+) and supports batched inference, full-graph compilation with CUDA graphs, and fine-tuning with the standard Transformers Trainer. It is trained primarily on English and has only limited capability in other languages, which comes from incidental non-English material in the training data. CSM-1B is released under the Apache-2.0 license, and its usage guidelines prohibit impersonation, misinformation, and other forms of misuse.

Features

  • Text-to-speech generation using RVQ audio code output
  • Llama backbone with a smaller audio decoder that produces Mimi audio codes
  • Supports full conversational input with contextual audio
  • Batched inference and CUDA graph support for efficiency
  • Fine-tuning available via Transformers’ Trainer API
  • Native support in Hugging Face Transformers (v4.52.1+)
  • Open-ended voice generation without predefined speakers
  • Ethical use policy to prevent impersonation and misuse
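
The conversational input mentioned above can be sketched with the Transformers integration. This is a minimal illustration, not an official recipe: it assumes the checkpoint is published on the Hugging Face Hub as "sesame/csm-1b", that Transformers v4.52.1 or later and PyTorch are installed, and that speaker identities are passed as numeric role strings. The helper names build_conversation and synthesize are hypothetical and exist only for this example.

```python
def build_conversation(turns):
    """Build a chat-template conversation for CSM.

    `turns` is a list of (speaker_id, text, audio) tuples. `audio` is a
    waveform array for context turns, or None for a text-only turn such
    as the final prompt that the model should speak.
    """
    conversation = []
    for speaker_id, text, audio in turns:
        content = [{"type": "text", "text": text}]
        if audio is not None:
            # Context audio is attached next to the text of the same turn.
            content.append({"type": "audio", "path": audio})
        # The role is the speaker id as a string, e.g. "0" or "1".
        conversation.append({"role": str(speaker_id), "content": content})
    return conversation


def synthesize(turns, out_path="out.wav", model_id="sesame/csm-1b"):
    """Generate speech for the last turn and save it as a WAV file.

    `model_id` is an assumed Hub identifier; adjust it if the checkpoint
    lives elsewhere.
    """
    # Heavy imports stay local so build_conversation works without them.
    import torch
    from transformers import AutoProcessor, CsmForConditionalGeneration

    device = "cuda" if torch.cuda.is_available() else "cpu"
    processor = AutoProcessor.from_pretrained(model_id)
    model = CsmForConditionalGeneration.from_pretrained(model_id, device_map=device)

    # Tokenize text and encode any context audio in one step.
    inputs = processor.apply_chat_template(
        build_conversation(turns), tokenize=True, return_dict=True
    ).to(device)

    # output_audio=True asks for decoded audio instead of raw RVQ codes.
    audio = model.generate(**inputs, output_audio=True)
    processor.save_audio(audio, out_path)
    return out_path
```

For single-sentence generation, a call such as synthesize([(0, "Hello from CSM.", None)]) passes one text-only turn. For conversational generation, earlier turns carry both text and audio, and the final turn carries only the text to be spoken.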

Categories

AI Models

Additional Project Details

Registered

2025-06-27