We introduce a language modeling approach for text-to-speech (TTS) synthesis. Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and treat TTS as a conditional language modeling task rather than as continuous signal regression, as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech, which is hundreds of times larger than the data used by existing systems. VALL-E exhibits emergent in-context learning capabilities and can synthesize high-quality personalized speech using only a 3-second enrolled recording of an unseen speaker as an acoustic prompt. Experimental results show that VALL-E significantly outperforms the state-of-the-art zero-shot TTS system in terms of speech naturalness and speaker similarity. In addition, we find that VALL-E can preserve the speaker's emotion and the acoustic environment of the acoustic prompt in the synthesized speech.

Features

  • The pipeline of VALL-E is phoneme → discrete code → waveform
  • VALL-E generates the discrete audio codec codes based on phoneme and acoustic code prompts
  • VALL-E directly enables various speech synthesis applications, such as zero-shot TTS, speech editing, and content creation
  • VALL-E can be combined with other generative AI models, such as GPT-3
  • VALL-E can synthesize personalized speech while maintaining the acoustic environment of the speaker prompt
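
The phoneme → discrete code → waveform flow listed above can be illustrated with a toy sketch. This is not VALL-E's implementation: every function name, the codebook size, and the frame counts below are hypothetical placeholders chosen for illustration. The "language model" here samples codes at random, and the "decoder" is a trivial mapping. Only the shape of the data flow follows the feature list: phonemes plus acoustic prompt codes go in, new codec codes come out, and a decoder turns those codes into samples.

```python
import random
from typing import List

CODEBOOK_SIZE = 1024      # assumed value, for illustration only
FRAMES_PER_PHONEME = 3    # assumed value, for illustration only


def text_to_phonemes(text: str) -> List[str]:
    """Hypothetical stand-in for a phonemizer: one symbol per letter."""
    return [ch for ch in text.lower() if ch.isalpha()]


def generate_codes(phonemes: List[str], prompt_codes: List[int],
                   rng: random.Random) -> List[int]:
    """Toy stand-in for the codec language model.

    Extends the acoustic prompt one code at a time. A real model would
    predict each code from the phonemes and all previous codes; this
    placeholder just samples uniformly from the codebook.
    """
    codes = list(prompt_codes)
    target_len = len(prompt_codes) + FRAMES_PER_PHONEME * len(phonemes)
    while len(codes) < target_len:
        codes.append(rng.randrange(CODEBOOK_SIZE))
    # Return only the newly generated codes, not the prompt itself.
    return codes[len(prompt_codes):]


def decode_to_waveform(codes: List[int], samples_per_frame: int = 4) -> List[float]:
    """Toy stand-in for the codec decoder: each code becomes a few samples in [-1, 1]."""
    waveform: List[float] = []
    for code in codes:
        value = 2.0 * code / (CODEBOOK_SIZE - 1) - 1.0
        waveform.extend([value] * samples_per_frame)
    return waveform


def synthesize(text: str, prompt_codes: List[int], seed: int = 0) -> List[float]:
    """Run the three stages in order: phonemes, discrete codes, waveform."""
    rng = random.Random(seed)
    phonemes = text_to_phonemes(text)
    codes = generate_codes(phonemes, prompt_codes, rng)
    return decode_to_waveform(codes)
```

Calling `synthesize("hi there", [1, 2, 3])` returns a list of floats whose length depends only on the number of letters in the text and the two assumed constants. In the real system, the prompt codes would come from encoding the 3-second enrolled recording with the audio codec.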

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Large Language Models (LLM), Python Generative AI, Python AI Models

Registered

2023-03-22