Download 736 740 Zip -
Clotho is an audio dataset used for intermodal translation (audio-to-text) tasks. It is widely used in the DCASE (Detection and Classification of Acoustic Scenes and Events) challenges.

📂 Key Data Components
Thousands of sound samples ranging from 15 to 30 seconds.
Categorized into development, validation, and evaluation sets for training and testing machine learning models.

📥 How to Download
You can also download specific evaluation (1.2 GB) or analysis (14.4 GB) subsets.

🛠️ Producing a Write-up
Explain that the goal is "Automated Audio Captioning" (AAC): predicting a textual description from an audio signal.
Reference the original paper: Drossos, K., Lipping, S., & Virtanen, T. (2020). "Clotho: an Audio Captioning Dataset." Proc. IEEE ICASSP, pp. 736-740.
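
After unpacking an archive, it can be useful to confirm that the clips really fall in the 15 to 30 second range stated for the sound samples. The following is a minimal sketch using only Python's standard `wave` module. It assumes the clips are uncompressed PCM WAV files; the helper names `clip_duration_seconds` and `in_expected_range` are hypothetical and not part of any Clotho tooling.

```python
import wave


def clip_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds.

    Assumption: the file is an uncompressed WAV readable by the
    standard-library `wave` module.
    """
    with wave.open(str(path), "rb") as wav:
        # Duration = number of audio frames / frames per second.
        return wav.getnframes() / wav.getframerate()


def in_expected_range(duration, low=15.0, high=30.0):
    """Check a duration against the 15 to 30 second range quoted for the samples."""
    return low <= duration <= high
```

A typical use would be to loop over the extracted `.wav` files and report any clip for which `in_expected_range(clip_duration_seconds(path))` is false, which often indicates a truncated download or an incomplete extraction.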
