It is a unified, developer-friendly API to the best available Speech-To-Text and Text-To-Speech services. | Convert text to high-quality AI voice in seconds. Perfect for content creators, businesses, educators and video makers. Fast, affordable and studio-grade output with multiple accents and languages. |
Build voice-enabled chatbot services (for example, IVR systems); classify audio file transcriptions; run automated tests of voice services with Botium | Indian regional voices, multilingual TTS (Hindi, Marathi, Tamil, Telugu, Kannada, and more), natural studio-quality speech, API for developers, voice cloning (on demand), commercial usage rights, fast cloud rendering, pay-as-you-go pricing, dashboard usage analytics, 24/7 customer support, cost-effective |
Statistics | |
GitHub Stars 943 | GitHub Stars - |
GitHub Forks 58 | GitHub Forks - |
Stacks 7 | Stacks 2 |
Followers 21 | Followers 3 |
Votes 0 | Votes 3 |
Integrations | |
| No integrations available | |

It complements any regular touch user interface with a real-time voice user interface. Real-time feedback makes the experience faster and more intuitive and lets end users recover from errors quickly and without interruption.

Plan, write, and publish books, PDF guides, workbooks, and audiobooks with AI workflows. Customize branding and export instantly.

Transform text into natural speech. Clear Speak uses advanced AI to generate human-like voices from text. Experience 27 unique voices with customizable pronunciation.

Transform boring PDFs and text into viral TikTok-style brainrot study videos. Free online tool with AI voices, speed control, and Minecraft backgrounds. 3 free videos daily!

Create viral AI-powered short videos, reels, TikToks, YouTube Shorts, and music videos with voiceovers, auto scripts, subtitles, and AI images. Perfect for creators, educators, and marketers.

It is a cloud-based voice service and the brain behind tens of millions of devices, including the Echo family of devices, Fire TV, Fire Tablet, and third-party devices. You can build voice experiences, or skills, that make everyday tasks faster, easier, and more delightful for customers.

Amazon Polly is a service that turns text into lifelike speech. Polly lets you create applications that talk, enabling you to build entirely new categories of speech-enabled products. Polly is an Amazon AI service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice.

Google Cloud Text-to-Speech enables developers to synthesize natural-sounding speech with 30 voices, available in multiple languages and variants. It applies DeepMind’s groundbreaking research in WaveNet and Google’s powerful neural networks to deliver the highest fidelity possible.

It is a state-of-the-art automatic speech recognition toolkit. It is intended for use by speech recognition researchers and professionals.

It is an open-source, embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.