Open-source real-time conversational AI by Kyutai — a full-duplex voice assistant that can listen and speak simultaneously with low latency.
Moshi is an open-source, real-time conversational AI developed by Kyutai, a French AI research lab. Moshi represents a significant advancement in voice AI: it's a full-duplex speech model that can listen and speak simultaneously, enabling natural back-and-forth conversation without the awkward turn-taking delays typical of current voice assistants.
The technical achievement is notable. Traditional voice AI follows a pipeline of speech-to-text, then language model processing, then text-to-speech — each step adding latency. Moshi uses a single end-to-end model that processes audio directly, reducing response latency to approximately 200 milliseconds. This near-instantaneous response creates conversations that feel genuinely natural, with the ability to interrupt, overlap, and react in real time just like human conversation.
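As a rough illustration of why a cascaded pipeline accumulates delay, the sketch below sums per-stage latencies for a three-stage pipeline and compares the total with the roughly 200 ms figure quoted above. The per-stage numbers and the function names are placeholder assumptions chosen for illustration, not measurements of Moshi or of any real system.

```python
from typing import Mapping

# Illustrative latency budget: cascaded pipeline vs. a single end-to-end model.
# All per-stage values below are hypothetical placeholders, not measurements.
CASCADED_STAGES_MS = {
    "speech_to_text": 300,   # assumed value
    "language_model": 500,   # assumed value
    "text_to_speech": 200,   # assumed value
}

# Approximate end-to-end response latency quoted in the text for Moshi.
END_TO_END_MS = 200


def total_latency_ms(stages: Mapping[str, int]) -> int:
    """Stages run one after another, so their latencies add up."""
    return sum(stages.values())


def latency_ratio(cascaded_ms: int, end_to_end_ms: int) -> float:
    """How many times longer the cascaded pipeline takes to respond."""
    return cascaded_ms / end_to_end_ms


if __name__ == "__main__":
    cascaded = total_latency_ms(CASCADED_STAGES_MS)
    print(f"cascaded: {cascaded} ms, end-to-end: {END_TO_END_MS} ms")
    print(f"ratio: {latency_ratio(cascaded, END_TO_END_MS):.1f}x")
```

The point is structural rather than numerical: in a cascade the user waits for every stage in sequence, so the only ways to cut the total are to speed up each stage or to remove stage boundaries altogether, which is what a single end-to-end model does.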
Moshi supports expressive speech with emotional nuance, different speaking styles, and natural prosody. It can adjust its tone based on context — being empathetic when appropriate, enthusiastic when relevant, or calm and measured for technical discussions. This emotional intelligence in voice interaction is a step beyond the flat, monotone output of most voice assistants.
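Because Moshi is openly released, it can be self-hosted and customized. Licensing is split by component: according to Kyutai's repository, the Python code is under the MIT license, the Rust backend is under Apache 2.0, and the model weights are under CC-BY 4.0. Developers can fine-tune the model for specific use cases, run it on their own infrastructure for privacy, and integrate it into applications. Kyutai also provides a web demo for trying Moshi without any setup.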
The model is built on Helium, Kyutai's 7-billion-parameter text language model, and was trained on a diverse corpus of conversational audio data including over 100,000 synthetic dialogues. It supports multiple languages and can be extended to additional languages through fine-tuning. For developers building voice-first applications (customer service bots, voice assistants, interactive characters, or accessibility tools), Moshi provides an open, high-quality foundation that rivals proprietary alternatives.