Hume AI Open Sources TADA Text-to-Speech Model with Zero Hallucinations


Hume AI has released TADA (Text Audio Dual Alignment), its first open source text-to-speech model. The key innovation is a tokenization architecture that aligns text tokens directly with audio tokens, eliminating the content hallucinations (skipped, repeated, or invented words) that plague other TTS systems.
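To make the dual-alignment idea concrete, here is a toy sketch (not Hume's actual implementation, and the token values are invented): each text token is paired with the span of audio tokens it produces, so every piece of generated audio is anchored to a specific piece of input text and nothing can be skipped or invented.

```python
# Toy illustration of text-audio dual alignment. Assumption: text tokens map
# 1:1 and in order onto spans of audio tokens; this is a sketch of the
# general idea, not Hume AI's architecture.
def align(text_tokens, audio_spans):
    """Pair each text token with its audio-token span, in order."""
    if len(text_tokens) != len(audio_spans):
        raise ValueError("alignment must cover every text token exactly once")
    return list(zip(text_tokens, audio_spans))

# Hypothetical tokens: "hello" split into two text tokens, each tied to audio.
pairs = align(["hel", "lo"], [[101, 102], [103]])
```

Because the alignment is enforced structurally rather than learned implicitly, a mismatch between text and audio is a hard error instead of a silent hallucination.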

Performance

TADA achieves a real-time factor of 0.09 (it needs 0.09 seconds of compute per second of audio produced), which Hume says is roughly 5x faster than comparable LLM-based TTS systems. It supports up to 700 seconds of audio context and can run on mobile devices.

Models Available

Hume released two variants: tada-1b (1 billion parameters) and tada-3b-ml (3 billion parameters, multilingual). Both models, along with the code and a research paper, are fully open source on GitHub and Hugging Face.

Why It Matters

Open source TTS models have lagged behind proprietary options in quality and reliability. TADA's zero-hallucination approach and fast inference speed make it a practical option for developers building voice applications without relying on paid APIs.
