Building Best of 2011

Monday, 16 January 2012
by omar
filed under Announcements and Trends and Data

Earlier this week we released our Best of 2011 charts. 2011 saw you spend over 71 thousand years listening to music and scrobble more than 11 billion tracks. We’ve been churning through all of this data to find out what truly defined 2011.

New for this year is the discoveries chart. We went back to the beginning of time (well, to 2003) and checked every one of your 61 billion scrobbles to work out which artists were first scrobbled in 2011.

We’ve also broken these charts down by country and tag. Whether you’re interested in experimental music in Mexico, the latest innovations in Finnish pop, or just what’s Big in Japan, you now have a way to browse it.

Following on from last year we are providing you with a data download. Musicbrainz IDs are now included in this data (where we have them) as part of our continued collaboration with Musicbrainz.

Producing the ‘Best of’ Charts is a very different process to our usual weekly charts. What follows is an overview of the process. In particular I’ll explain how we determined the new albums and discoveries of 2011, and how we turned these into the charts you see on the site.

New Albums

Our top artists are calculated based on albums released in 2011. One issue with albums is that they are typically released many times in many locations. To get around this we used a new version of the Musicbrainz database to find track listings for albums that were first released in 2011.

Of course, that isn’t the end of the story. Our library doesn’t always match up with Musicbrainz, and those mismatches have to be handled when we align album information from Musicbrainz with our own scrobble data. It’s one of the reasons we’re improving our Musicbrainz ID coverage.
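To make the "first released in 2011" idea concrete, here is a minimal sketch of how one might pick out albums whose earliest known release falls in a given year. The tuple layout and the album keys are hypothetical, made up for illustration; they are not the Musicbrainz schema or our production code.

```python
def albums_first_released_in(releases, year):
    """Return the album keys whose earliest known release falls in `year`.

    `releases` is an iterable of (album_key, country, release_year) tuples.
    This shape is an assumption for the example, not a real schema.
    """
    earliest = {}
    for album_key, _country, release_year in releases:
        # Keep the smallest release year seen so far for each album.
        if album_key not in earliest or release_year < earliest[album_key]:
            earliest[album_key] = release_year
    return {key for key, first in earliest.items() if first == year}


# Hypothetical sample data: the same album released in several places.
releases = [
    ("album-a", "GB", 2011),
    ("album-a", "US", 2011),
    ("album-b", "JP", 2010),  # first released in 2010 ...
    ("album-b", "US", 2011),  # ... so the 2011 release does not make it new
    ("album-c", "DE", 2011),
]
new_albums = albums_first_released_in(releases, 2011)
```

Taking the minimum over all releases of an album is what stops a 2011 reissue of an older record from being counted as new.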

New Discoveries

We label an artist as a new discovery if they were first scrobbled in 2011. As I mentioned previously, this can only be decided by checking through all of the scrobbles we have ever received.

This task is complicated by misspelled artist names, collaborations, and remixes. A nice example is Britney Spears’ collaboration with Sabi. Britney is certainly not a new discovery, even though the incorrectly titled artist name for this collaboration was first scrobbled in 2011. We avoid this by mapping artist names to their correct versions before sorting through their scrobbles.
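As a rough illustration of that idea, the sketch below maps each scrobbled artist name to a corrected name and only then works out the year of the first scrobble. The corrections table, the exact artist strings, and the (name, year) scrobble layout are all assumptions for the example, not our real data or pipeline.

```python
def new_discoveries(scrobbles, corrections, year):
    """Return the artists whose first scrobble, after name correction, is in `year`.

    `scrobbles` is an iterable of (raw_artist_name, scrobble_year) pairs and
    `corrections` maps known-bad names to corrected ones. Both shapes are
    hypothetical.
    """
    first_year = {}
    for raw_name, scrobble_year in scrobbles:
        # Fall back to the raw name when no correction is known.
        name = corrections.get(raw_name, raw_name)
        if name not in first_year or scrobble_year < first_year[name]:
            first_year[name] = scrobble_year
    return {name for name, first in first_year.items() if first == year}


# Hypothetical sample data.
corrections = {"Britney Spears feat. Sabi": "Britney Spears"}
scrobbles = [
    ("Britney Spears", 2004),
    ("Britney Spears feat. Sabi", 2011),  # folded into Britney Spears
    ("Some New Band", 2011),
]
discoveries = new_discoveries(scrobbles, corrections, 2011)
```

Without the correction step, the collaboration string would look like a brand new artist first scrobbled in 2011; with it, those scrobbles are folded into an artist first heard years earlier.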

Our Human Computer

Our final step was to send the charts to our secret weapon: the music team. They pored over thousands of the top artists of 2011, matching them against their own databases, removing artists that were incorrect and adding ones that were missing.

Data Download

This year we have two data downloads: the first – like last year’s – contains the top artists and albums of 2011; the second contains only the top artists, because they do not all have associated albums. In the data you’ll find all of the artists and albums from Best of 2011, along with play and listener counts, top tags, and image links.

In both cases we have added Musicbrainz IDs to the data. You can use these with our own API, as well as with BBC Music and The Guardian. Use the data as you please; we look forward to seeing what you come up with!