What is Kaldi?
It is a state-of-the-art automatic speech recognition toolkit. It is intended for use by speech recognition researchers and professionals.
Kaldi is a tool in the Speech Recognition Tools category of a tech stack.
Kaldi is an open source tool with 12.3K GitHub stars and 5.1K GitHub forks. Its source code is hosted in an open source repository on GitHub.
Who uses Kaldi?
6 companies reportedly use Kaldi in their tech stacks, including Labs, Voicebridge, and Uhlive.
13 developers on StackShare have stated that they use Kaldi.
Kaldi Alternatives & Comparisons
What are some alternatives to Kaldi?
Botium Speech Processing
It is a unified, developer-friendly API to the best available Speech-To-Text and Text-To-Speech services.
It is an open source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
It is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.
It can be used to complement any regular touch user interface with a real-time voice user interface. It offers real-time feedback for a faster and more intuitive experience, letting end users recover from possible errors quickly and without interruption.
wav2letter++ is a fast open source speech processing toolkit from the Speech Team at Facebook AI Research. It is written entirely in C++ and uses the ArrayFire tensor library and the flashlight machine learning library for maximum efficiency. The approach is detailed in an arXiv paper.