It is Stability AI’s first product for music and sound effect generation. Users can create original audio by entering a text prompt and a duration, generating audio in high-quality, 44.1 kHz stereo.
Stable Audio is a tool in the Voice & Audio Models category of a tech stack.
No pros listed yet.
No cons listed yet.
What are some alternatives to Stable Audio?
It is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.
It is a 1.2B parameter base model trained on 100K hours of speech for TTS (text-to-speech). It empowers developers and businesses to better connect with their audiences at scale.
It is a transformer-based text-to-audio model. It can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects.