

This is yet another attempt at maintaining a list of datasets directly related to music information retrieval (MIR). Other lists that I have found are the ISMIR page, this web page, and this web page. If you are interested in speech processing, you can find a table of speech datasets on this page. If you are interested in multitracks, the Open Multitrack Testbed should be a good starting point. UPF also has an excellent page with datasets for world music, including Indian art music, Turkish Makam music, and Beijing Opera. A curated list of MIDI sources can be found here. Two additional general resources are for MIDI files and for audio files.

If you know of other datasets that should be included in this list, please create an issue or pull request, or just send me a note.

get the book

Book Cover Image: An Introduction to Audio Content Analysis @ IEEE


get in touch

alexander lerch
