The MetaBrainz core mission is to curate and maintain public datasets that anyone can download and use. We ask commercial users to support us financially, which helps fund the creation and maintenance of these datasets. Personal use of our datasets will always be free. We appreciate end-user donations as well!
Our datasets fall into two main categories: main project data dumps, which contain the entirety of the data for a given project, and derived data dumps, which are based on the data in our main project databases and serve a more specific purpose.
This data dump includes all public portions of the data from our MusicBrainz project: all artists, releases, recordings, and labels, the relationships between them, and much more, including everything needed to run your own copy of MusicBrainz.
Derived dumps take data from one project and transform it into a new dataset that solves a different problem.
These dumps contain canonical MusicBrainz data, which provides a single record for each musical recording and release in the database and so makes it easier to reason about the core metadata in MusicBrainz. This dataset is useful for matching data to or from MusicBrainz.
The Music Listening Histories Dataset collects a large number of music listening events assembled from more than 27 billion time-stamped logs extracted from Last.fm. Using the MusicBrainz canonical data, we cleaned up errors in the data in order to provide an improved version of this dataset.