Industrial-grade open-source speech recognition toolkit from Alibaba — 170x realtime, 50+ languages, OpenAI-compatible API.
FunASR is the open-source speech toolkit from Alibaba's ModelScope team and, as of 2026, one of the most production-credible alternatives to OpenAI Whisper. It bundles a family of in-house models (Paraformer for non-autoregressive ASR, SenseVoice for multilingual recognition with emotion and audio-event detection, CAM++ for speaker verification, and FSMN-VAD for voice activity detection) into a single toolkit with a unified Python API and a self-hostable HTTP server.

The headline numbers are aggressive: 170x realtime decoding on a modern GPU, 50+ languages, and robust performance on Chinese and other Asian languages where Whisper has historically struggled. Speaker diarisation, timestamping, punctuation restoration, and streaming are built in. The server speaks an OpenAI-compatible transcription API, so teams can swap it in behind existing Whisper integrations without changing client code beyond pointing it at the new endpoint.

FunASR has become the default ASR backbone for many Chinese-language voice agent stacks and is increasingly used worldwide by teams who want on-prem speech recognition without paying per-minute cloud rates. The code is MIT-licensed, pre-built Docker images are available for CPU and GPU inference, and the toolkit integrates cleanly with the ModelScope hub for newer model releases.
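To make the "OpenAI-compatible" point concrete, here is a minimal sketch of how a client might build a transcription request using only the Python standard library. It follows the OpenAI convention of a multipart POST to `/audio/transcriptions` with `file` and `model` fields. The base URL, port, file name, and model identifier are illustrative assumptions, not values documented on this page, so check the FunASR server documentation for the exact route and accepted model names before relying on it.

```python
import uuid
import urllib.request


def build_transcription_request(base_url, audio_bytes,
                                filename="sample.wav",
                                model="paraformer-zh"):
    """Build (but do not send) an OpenAI-style transcription request.

    Assumptions: `base_url` points at a self-hosted server that mirrors
    OpenAI's `/audio/transcriptions` route, and `model` is a name that
    server accepts. Both are hypothetical placeholders here.
    """
    boundary = uuid.uuid4().hex

    # Plain text form field carrying the model name.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode("utf-8")

    # Binary form field carrying the audio payload.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")

    closing = f"--{boundary}--\r\n".encode("utf-8")
    body = model_part + file_header + audio_bytes + b"\r\n" + closing

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/audio/transcriptions",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Example usage against a hypothetical local deployment:
#   with open("sample.wav", "rb") as f:
#       req = build_transcription_request("http://localhost:8000/v1", f.read())
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```

Because the request shape matches what OpenAI-style clients already send, switching an existing Whisper integration over should in principle come down to changing the base URL, assuming the server accepts the same fields.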