We’re glad to announce the release of FSD50K, a new open dataset of human-labeled sound events. FSD50K contains over 51k Freesound audio clips, totalling over 100 hours of audio manually labeled using 200 classes drawn from the AudioSet Ontology. To our knowledge, this is the largest fully-open dataset of human-labeled sound events, and the second largest overall after AudioSet.



FSD50K’s most important characteristics:

  • FSD50K contains 51,197 audio clips from Freesound, totalling 108.3 hours of multi-labeled audio
  • The dataset encompasses 200 sound classes hierarchically organized following a subset of the AudioSet Ontology, allowing the development and evaluation of large-vocabulary machine listening methods
  • The audio content is composed mainly of sound events produced by physical sound sources, including human sounds, sounds of things, animals, natural sounds, musical instruments and more
  • The acoustic material has been manually labeled using the Freesound Annotator platform
  • Clips are of variable length (0.3 to 30 s), and ground truth labels are provided at the clip level (i.e., weak labels)
  • All clips are provided as uncompressed, 16-bit, 44.1 kHz, mono PCM audio files
  • Audio clips are split into a development set (41k clips / 80h, in turn split into train and validation) and an evaluation set (10k clips / 28h); see the loading sketch after this list
  • Beyond the audio clips and ground truth, additional metadata is made available (including raw annotations, sound predominance ratings, Freesound metadata, and more), enabling a variety of sound event research tasks
  • All these resources are licensed under Creative Commons licenses, which allow sharing and reuse
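To give an idea of how the released files fit together, here is a minimal sketch in Python for loading clips and their weak labels. It assumes the dataset has been unpacked locally with the development ground truth in a CSV (columns assumed here: fname, labels, split) and the development audio in a folder of WAV files; adjust the paths and column names to match the actual release layout.

```python
# Minimal sketch: loading FSD50K clips and their weak (clip-level) labels.
# Paths and CSV column names below are assumptions for illustration; check
# them against the files shipped with the dataset.
import csv
from pathlib import Path

import soundfile as sf  # pip install soundfile

DATASET_ROOT = Path("FSD50K")  # adjust to your local path


def load_dev_ground_truth(root: Path):
    """Return a list of (filename, [labels], split) tuples from dev.csv."""
    rows = []
    with open(root / "FSD50K.ground_truth" / "dev.csv", newline="") as f:
        for row in csv.DictReader(f):
            labels = row["labels"].split(",")  # multi-label, comma-separated
            rows.append((row["fname"], labels, row["split"]))
    return rows


clips = load_dev_ground_truth(DATASET_ROOT)
fname, labels, split = clips[0]

# Clips are uncompressed 16-bit 44.1 kHz mono PCM, variable length (0.3-30 s).
audio, sr = sf.read(DATASET_ROOT / "FSD50K.dev_audio" / f"{fname}.wav")
print(f"{fname}.wav ({split}): {len(audio) / sr:.1f} s, labels: {labels}")
```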


Interested? Go ahead and check out all the resources we’ve just released.

Also, we will soon publish a follow-up blog post. Stay up to date about FSD50K by subscribing to the freesound-annotator Google Group. We hope all these resources are useful for the community!

FSD50K has been created at the Music Technology Group of Universitat Pompeu Fabra. This effort was kindly sponsored by two Google Faculty Research Awards (2017 and 2018).