Alexa, Google Gemini, Amazon Polly, Google Cloud Speech API, and Aerosolve are the most popular tools in the category “Voice & Audio Models”.
Google's family of large language models
Use generative AI to create music and sound effects
High-quality multilingual text-to-speech library
Foundational model for human-like, expressive TTS
A large language model for zero-shot video generation (By Google)
Your personalized ChatGPT, auto-trained on your website and private data
Real-time transcription tool with AI-powered suggestions
Online AI voice cloning software (By ElevenLabs)
AI notetaker to transcribe, summarize, and analyze meetings
On-device speech-to-text engine
A deep learning toolkit for Text-to-Speech, battle-tested in research and production
Open-source embedded speech-to-text engine
A data augmentations library for audio, image, text, and video (By Facebook)