Audio & Speech
AI tools for text-to-speech, music generation, speech recognition, and audio processing.
7 open-source tools in this category
Whisper (OpenAI)
78k · General-purpose speech recognition. Transcribes audio to text in nearly 100 languages.
TTS (Coqui)
45.3k · Deep learning text-to-speech toolkit. 1100+ languages, multiple voices, fine-tunable models.
Whisper.cpp
40.5k · High-performance speech-to-text in C/C++. Runs Whisper models efficiently on a local CPU.
Bark (Suno)
39.1k · Text-prompted generative audio model. Generates speech, music, and sound effects from text.
AudioCraft
23.3k · Meta's audio generation library. Includes MusicGen for music and AudioGen for sound effects.
Riffusion
3.9k · Stable Diffusion fine-tuned for real-time music generation. Generates spectrogram images from text prompts and converts them to audio.
Scaper
1.3k · Soundscape generation and synthesis. Mixes sound events to create realistic audio scenes.
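
For a rough sense of what "mixing sound events" into an audio scene means, here is a minimal standard-library sketch. It is not Scaper's API: the names `sine_event` and `mix_scene`, the 8 kHz sample rate, and the use of sine tones as stand-in events are all assumptions made for illustration.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed value for this sketch)


def sine_event(freq_hz, duration_s, amplitude=0.5):
    """Hypothetical stand-in for a recorded sound event: a plain sine tone."""
    n = int(duration_s * SAMPLE_RATE)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n)
    ]


def mix_scene(scene_duration_s, events):
    """Sum (onset_s, samples) events into an initially silent scene."""
    scene = [0.0] * int(scene_duration_s * SAMPLE_RATE)
    for onset_s, samples in events:
        start = int(onset_s * SAMPLE_RATE)
        for i, sample in enumerate(samples):
            if start + i >= len(scene):
                break  # drop the part of an event that runs past the scene end
            scene[start + i] += sample
    # Clip to [-1, 1] so overlapping events cannot exceed full scale.
    return [max(-1.0, min(1.0, s)) for s in scene]


# Two overlapping one-second tones placed in a two-second scene.
scene = mix_scene(
    2.0,
    [
        (0.0, sine_event(220, 1.0)),
        (0.5, sine_event(440, 1.0)),
    ],
)
```

A real soundscape tool would draw events from labeled recordings and control details such as loudness relative to a background track; this sketch only shows the core step of placing events on a timeline and summing them.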