This is yet another attempt at maintaining a list of datasets directly related to MIR. Other lists that I have found are the ISMIR page, this web page, and this web page. If you are interested in speech processing, you can find a table of speech datasets on this page. If you are interested in multi-tracks, the Open Multitrack Testbed should be a good starting point. UPF also has an excellent page with datasets for world music, including Indian art music, Turkish Makam music, and Beijing Opera. A curated list of MIDI sources can be found here. Two additional general resources are for MIDI files and for audio files.

If you know of other datasets that should be included in this list, please create an issue or pull request, or just send me a note.

get the book

An Introduction to Audio Content Analysis (@ IEEE, @ Wiley)

get in touch

alexander lerch
