Dataset
wxl_amh (hosted on Hugging Face)
A focused Amharic speech dataset built around audio and transcription pairs, useful for ASR training, transcription workflows, and speech evaluation.
- 3k rows
- 988 audio files in repo tree
- Speech + transcription
Documentation
- Best public proof point for the voice-data side of the work.
- Useful for ASR training, transcription workflows, and speech evaluation in Amharic.
- Connects directly to the speech-model work in the Shook line.
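
For the speech-evaluation use named above, here is a minimal sketch of word error rate (WER), a common ASR metric, computed between a reference transcription and a model's output. The function names and sample strings are illustrative assumptions; they are not part of the dataset or of any tooling shipped with it.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcription is empty")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


# Placeholder strings standing in for a transcription and a model hypothesis.
score = word_error_rate("a b c d", "a x c")
```

Passing the raw strings to `edit_distance` instead of word lists gives a character-level distance, which may be a useful complement when whitespace tokenization is a poor fit for the text.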