We All Want Love

Thursday, 9 February 2012
by Graham Todman
filed under Trends and Data and Design
Comments: 1

“We don’t have the time for psychological romance –” Larry Blackmon, Cameo

As my missus will testify, I’m not very romantic and greetings cards make me nauseous. So I wasn’t looking forward to designing a feature for Valentine’s Day.

Then I realised it might be interesting to use music data to see if anyone else felt like me or if the world was full of hopeless romantics playing Somebody To Love by Jefferson Airplane back-to-back like saps. So I went to see Omar.

Omar the Oracle

I don’t pretend to understand what Omar does.

I like to think his job involves “running things through the computer”. Actually, he works for the Data team at Last.fm. He is always very patient with me, even when I ask stupid questions like: “Do you think David Hasselhoff’s audience was affected by the drunken cheeseburger vs floor-as-plate incident?” (The Hoff gained an extra 400 scrobbles that week).

Omar was more than happy to dig into the Valentine’s Day stats, especially when I said I wanted to compare “romance” with “sex” (he’s always running the word “sex” through the computer – it never takes long).

To get a clean set of Valentine’s data to analyse, Omar compared the listening behaviour on 14 Feb over a number of years to the behaviour on any other day of the year, thereby sifting out the tracks unique to Valentine’s Day. Then we went to work with the location and genre tags. In his own words:

I had a little look at our tags pages and selected two sets of tags to investigate:

‘Romantic’ Tags: love, love songs, love song, romance, romantic
‘Sexy’ Tags: sexy, sex, erotic

Each city was then given a score based on how many people listened to sexy or romantic tracks on Valentine’s Day, and how many people have tagged these tracks with sexy or romantic tags. This gave us a ‘sexy’ and ‘romantic’ score for every city. Balancing these scores (there was a global bias toward romance) allows us to compare them, and find out which way a city leans: is it more sexy, or more romantic?
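As a rough illustration of what such balancing could look like, here is a minimal sketch. The city names, the counts and the balancing rule itself (dividing each score by the global total for its tag set) are assumptions made for the example; the exact formula used for the feature isn't given here.

```python
# Hypothetical Valentine's Day listener counts per city and tag set.
raw_scores = {
    "City A": {"romantic": 900, "sexy": 300},
    "City B": {"romantic": 500, "sexy": 250},
    "City C": {"romantic": 600, "sexy": 50},
}

def balance(scores):
    """Divide each score by the global total for its tag set, so the
    worldwide bias toward romance no longer dominates the comparison."""
    totals = {"romantic": 0, "sexy": 0}
    for city_scores in scores.values():
        for kind, value in city_scores.items():
            totals[kind] += value
    return {
        city: {kind: value / totals[kind] for kind, value in city_scores.items()}
        for city, city_scores in scores.items()
    }

def leaning(balanced):
    """Label each city by whichever balanced score is larger."""
    return {
        city: "sexy" if s["sexy"] > s["romantic"] else "romantic"
        for city, s in balanced.items()
    }

city_leanings = leaning(balance(raw_scores))
```

On the raw counts every city in this toy data looks romantic; only after balancing do City A and City B come out as leaning sexy.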

Infographic show which cities play the most

Male vs Female Valentine’s Tracks

Usually, if you run a chart for a given day of the year, the same answers keep emerging; Adele, Lady Gaga, Coldplay, or Radiohead. This time Omar tried to find something a little different: how do listening behaviours change on Valentine’s Day? I’ll let him explain again…

To do this I found out how females and males usually listen to tracks, on an average day. This involves counting daily listeners for every track listened to since the start of 2006.

Then I ask exactly the same question, but for Valentine’s days only.

So, our Valentine’s charts show you the tracks which see the largest, most consistent increases in listeners on Valentine’s days. These are the tracks that ladies and gentlemen turn to on Valentine’s Day.
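A small sketch of how "largest, most consistent increases" might be scored. The track names and numbers are invented, and taking the smallest yearly ratio as the measure of consistency is an assumption for illustration, not necessarily what was done for the charts.

```python
# Hypothetical data: listeners on an average day, and listeners on
# 14 Feb in each of several years.
usual_daily = {"Track X": 100.0, "Track Y": 2000.0, "Track Z": 50.0}
valentines = {
    "Track X": [300, 280, 320],
    "Track Y": [2100, 1900, 2000],
    "Track Z": [400, 40, 45],
}

def valentine_uplift(track):
    """Smallest yearly ratio of Valentine's listeners to usual listeners.
    Using the minimum rewards increases that show up every year."""
    return min(count / usual_daily[track] for count in valentines[track])

# Tracks ordered from the most to the least consistent uplift.
chart = sorted(usual_daily, key=valentine_uplift, reverse=True)
```

Track Z has one huge Valentine's spike but drops below its usual level in other years, so it ranks last despite the spike.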

You can see who topped those charts yourself!
If anyone needs me, I’ll be in Fresno.

Building Best of 2011

Monday, 16 January 2012
by Omar Ali
filed under Announcements and Trends and Data
Comments: 6

Earlier this week we released our Best of 2011 charts. 2011 saw you spend over 71 thousand years listening to music and scrobble more than 11 billion tracks. We’ve been churning through all of this data to find out what truly defined 2011.

New for this year is the discoveries chart. We went back to the beginning of time (well, to 2003) and checked every one of your 61 billion scrobbles to work out which artists were first scrobbled in 2011.

We’ve also broken these charts down by country and tag. Whatever you’re interested in, from experimental music in Mexico, the latest innovations in Finnish pop, or just what’s Big in Japan, you now have a means to browse them.

Following on from last year we are providing you with a data download. Musicbrainz IDs are now included in this data (where we have them) as part of our continued collaboration with Musicbrainz.

Producing the ‘Best of’ Charts is a very different process to our usual weekly charts. What follows is an overview of the process. In particular I’ll explain how we determined the new albums and discoveries of 2011, and how we turned these into the charts you see on the site.

New Albums

Our top artists are calculated based on albums released in 2011. One issue with albums is that they are typically released many times in many locations. To get around this we used a new version of the Musicbrainz database to find track listings for albums that were first released in 2011.

Of course, that isn’t the end of the story. Our library doesn’t always match up with Musicbrainz. Such issues need to be handled when we align album information from Musicbrainz with our own scrobble data. It’s one of the reasons we’re improving our Musicbrainz ID coverage.

New Discoveries

We label an artist as a new discovery if they were first scrobbled in 2011. As I mentioned previously, this can only be decided by checking through all of the scrobbles we have ever received.

This task is complicated by misspelled artist names, collaborations, and remixes. A nice example is Britney Spears’ collaboration with Sabi. Britney is certainly not a new discovery, even though this incorrectly-titled artist was first scrobbled in 2011. We avoid this by mapping artist names to their correct versions, before sorting through their scrobbles.
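The idea can be sketched in a few lines. The scrobble log and the correction table below are made up for illustration, and the real name-mapping step is certainly more involved than a dictionary lookup.

```python
# Hypothetical scrobble log: (artist name as scrobbled, year of scrobble).
scrobbles = [
    ("Britney Spears", 2004),
    ("Britney Spears feat. Sabi", 2011),
    ("Youth Lagoon", 2011),
    ("Youth Lagoon", 2011),
    ("Radiohead", 2003),
]

# Hypothetical correction table mapping variant names to the canonical artist.
corrections = {"Britney Spears feat. Sabi": "Britney Spears"}

def new_discoveries(scrobbles, corrections, year):
    """Artists whose earliest scrobble, after name correction, falls in `year`."""
    first_seen = {}
    for name, scrobble_year in scrobbles:
        artist = corrections.get(name, name)
        if artist not in first_seen or scrobble_year < first_seen[artist]:
            first_seen[artist] = scrobble_year
    return sorted(a for a, y in first_seen.items() if y == year)

discoveries_2011 = new_discoveries(scrobbles, corrections, 2011)
```

Without the correction table, the collaboration credit would wrongly show up as a 2011 discovery.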

Our Human Computer

Our final step was to send the charts to our secret weapon: the music team. They pored through thousands of the top artists of 2011, matching them against their own databases and removing/adding artists that were incorrect or missing.

Data Download

This year we have two data downloads: the first – like last year’s – contains the top artists and albums of 2011; the second contains only the top artists, because they do not all have associated albums. In the data you’ll find all of the artists and albums from Best of 2011, along with play and listener counts, top tags, and image links.

In both cases we have added Musicbrainz IDs to the data. You can use these on our own API, BBC Music, and The Guardian. Use the data as you please; we look forward to seeing what you come up with!

2011's New Discoveries

Friday, 13 January 2012
by Matt Sheret
filed under Announcements and Trends and Data
Comments: 3

Every year when Best of rolls around, we look at the chart to see if our data could have predicted who’d make it big. While there are a few in there we saw coming (*cough* Adele *cough*), the reality is that every year things get harder and harder to foresee.

That’s one of the reasons we launched our New Discoveries chart: to show off just how diverse your year in music really is.

Sure, it’s full of credible indie acts: Purity Ring, Death Grips and Work Drugs all did fairly well, while Wugazi – an album of mash-ups between Wu Tang Clan and Fugazi – made it to 13th place after getting huge buzz over the summer.

Someone we might have expected big things from was former Oasis frontman Noel Gallagher. He made it to number three on the New Discoveries chart, but only to 69 on the overall UK chart. That’s not quite as high as we might have expected. Similarly, Gaslight Anthem side project The Horrible Crowes made it to number 12 on the New Discoveries chart, largely off the back of Gaslight Anthem fans trying it out.

Further down the list GLaDOS makes an appearance. The Aperture Science Psychoacoustics Laboratory made it to number 7 on the chart after Valve released several albums’ worth of material from Portal 2. Soundtracks often jump to the top of the Hype Chart after hardcore fans flock to new releases, and while none of the artists on Drive were eligible for the New Discoveries chart, they all got a huge boost when that came out.

Up until the last minute it looked as if the New Discoveries chart would be topped by none other than Rebecca Black. The “Friday” singer was number one on the chart right up until December, but while her video has collected some 17.5 million views on YouTube, Last.fm’s music community only played the song 320,000 times between them.

Our first New Discoveries list is actually topped by Youth Lagoon, the project of Boise, ID native Trevor Powers. His dream-like album shot up the Hype Chart in autumn, and appeared to become a fixture throughout the winter for many listeners. He also creeps into the US overall top chart at 100.

For a taster of what these artists have to offer, listen to our New Discoveries playlist on the recently launched Discover app.

In case you missed it yesterday, our design team played with an early cut of our New Discoveries chart to create this neat little poster as a bit of a bonus. Don’t forget that you can also filter the chart to find the New Discoveries that best reflect your tastes using the Country and Tag options.

Here’s to another unpredictable year in music!

Best of 2011 is here!

Thursday, 12 January 2012
by Sarah Ransome
filed under Announcements and Trends and Data
Comments: 10

Best of 2011 is a reflection of the year in music, highlighting the most popular and hottest new artists all based on the tracks you’ve been scrobbling.

This year’s ‘Top Artists’ chart was compiled by looking at scrobbles for albums released between 1st January and 31st December 2011. As in previous years, we aren’t counting live albums, greatest hits collections, EPs and singles. You might not be all that surprised when you see who’s sat at number one, but dig a little deeper using our lovely new Country and Tag filtering options to find the No. 1 which suits you!

Another new feature for 2011 we’re really excited about is our ‘Top New Discoveries’ chart. This was compiled by looking at the number of listeners for artists who had their first scrobble between 1st December 2010 and 31st December 2011. Discovering new music is core to the Last.fm experience; so we wanted to highlight the artists who caught your attention this year and who you should keep an eye on during 2012. Again, use the filtering options to personalise your view.

Additionally, we took a look at the Year In Music to see what our data had to say about 2011. We hope you’re as fascinated as we were by the impact of music news on your scrobbles.

For developers, we have provided the chart data as TSV and XML files. Download and start hacking; we’d love to hear what you come up with.

Finally, as a little easter egg, we’ve created a commemorative poster of this year’s New Discoveries chart. The eagle-eyed amongst you will notice that it’s slightly different to what you see online; we made this before taking all of December’s data into account. You can download the poster here.

New Charts section!

Wednesday, 14 December 2011
by Sarah Ransome
filed under Announcements and Trends and Data
Comments: 18

Our Charts section has been a bit neglected of late. We’d all got a bit fed up of seeing Coldplay, Radiohead and Adele lead the ‘Top Artists’ chart week after week, especially when we could see our Hype Charts and internal data were telling a far more compelling story. So we decided to do something about it.

This week we have launched the first in a series of improvements to our charts section to make them more relevant, giving you a more dynamic picture of what is popular from week to week.

What’s live now?


The most important set of charts is now our Hype Charts. The Hype Charts are core to what we do at Last.fm – drawing attention to upcoming artists – so it was an easy decision to make these more prominent.

We’re also emphasising how much things change in our weekly charts by making it easy to go back and view them via a weekly pull-down menu.

Each chart now has its own page, and we’ve added buttons to each entry so you can quickly add artists to your library, love them, buy their music or add tags.

What’s next?

Every year at this time, most music sites give you a run down on the best acts of the year. We’re also going to have a Best of 2011 feature, but we have pushed it back to January this year in order to include a full year of data. While everyone else’s lists are pretty similar, we think you’ll be surprised by the story that Last.fm’s data is telling about 2011.

We hope you enjoy these changes and we look forward to hearing your feedback.

London Music Hack Day — our audio API put us in the driver's seat

Tuesday, 6 December 2011
by Matthias Mauch
filed under Trends and Data
Comments: 11

The Music Hack Day didn't just see us arrive with a lot of enthusiasm, but also with a brand new API extension that exposes audio features, similar song playlists and Spotify URIs. And we won prizes!


Photos by Thomas Bonte

All awesomeness hype aside, the Hack Day really was a nice experience, and even the 3 hour marathon that was Sunday's demo session was a joy to watch because of the great quality of the hacks. It was my first hack day, and I was truly impressed (see Wired's and Insider's take on it). So what did we do?

My oh my, an API!

You may have noticed from my previous blog posts (Anatomy of the UK Charts, Parts 1, 2, 3, 4 and 5) that we have put quite a lot of effort into finding a mix of well-tested and newly developed audio features that capture distinct attributes of audio recordings, such as energy, harmonic creativity and smoothness. Just to be totally clear: no Last.fm tags and no Last.fm scrobble magic are involved, only pure audio features, retrieved directly from the original recordings.

We calculated 21 of these features on 2 million of our most scrobbled recordings and Mark built a neat, very fast service to host them. Since Friday this service has been publicly accessible through our outward-facing Last.fm API, thanks to Duncan's API magic. You can either ask for certain feature ranges and retrieve a list of songs that satisfy them, or you can retrieve the audio features themselves by providing the track's artist and title. Of course, bringing even the shiniest of APIs doesn't qualify as a hack...

Driver's Seat — steer your music playlisting!

Since I'd been very impressed with Spotify's new app integration I persuaded Sven to help me build a hack that nicely exposes how good our new API is at audio feature playlisting. And because it puts you in control of steering your music we called it Driver's Seat (screenshot). Below you see a video of the resulting Driver's Seat Spotify app in action.

According to your preferences you select a preset, or adjust feature sliders and hit "Go get playlist!", and the app will fire an HTTP request to the Last.fm API that looks like this:

http://ws.audioscrobbler.com/2.0/?method=track.findbyaudiofeatures&filter[]=bpm:80:91...

The result is a list of tracks. We then look up the Spotify URI of each one using another brand new API of ours, which loves requests such as this:

http://ws.audioscrobbler.com/2.0/?method=track.getPlaylinks&artist[]=radiohead&track[]=creep...
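For anyone who wants to assemble such requests in code, here is a minimal sketch that only builds the URLs and does not send them. The method names and the `filter[]`, `artist[]` and `track[]` parameters are taken from the two examples above; the `api_key` parameter and the `build_request` helper are assumptions added for illustration, and any parameters hidden behind the "..." in the examples are not reconstructed.

```python
from urllib.parse import urlencode

API_ROOT = "http://ws.audioscrobbler.com/2.0/"

def build_request(method, params, api_key="YOUR_API_KEY"):
    """Return a request URL for the given method. `params` is a list of
    (name, value) pairs so repeated names such as filter[] are preserved.
    The api_key parameter is an assumption, not shown in the examples."""
    query = [("method", method)] + list(params) + [("api_key", api_key)]
    # Keep brackets and colons readable instead of percent-encoding them.
    return API_ROOT + "?" + urlencode(query, safe="[]:")

features_url = build_request(
    "track.findbyaudiofeatures", [("filter[]", "bpm:80:91")]
)
playlinks_url = build_request(
    "track.getPlaylinks", [("artist[]", "radiohead"), ("track[]", "creep")]
)
```

Passing the parameters as a list of pairs rather than a dictionary makes it possible to send several `filter[]` ranges or several artist and track pairs in one request.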

We really liked our hack because it allows music discovery to be uninhibited by artist genre or history — it just gives you the kind of music you request. The Spotify team liked it so much that they gave us their hack prize, which we share with a hack called CTRL — two of the 18 Spotify hacks.

PitchFork Effect

Sven and I weren't the only ones hacking away though. Alex produced some intriguing visualisations of how PitchFork reviews influence Last.fm listening stats... and received one of the two prizes from MusicMetric. Marek also made a cute little virtual album store as an antidote to the all too modern iTunes and Amazon stores. And Coffey re-worked a previous hack of his to scrobble tracks at gigs you go to: it uses the set lists available through setlist.fm's API (find the hack here).

Mo Data Mo Problems

Tuesday, 8 November 2011
by Stefan Sperber
filed under Trends and Data
Comments: 8

For the past six months, I’ve been interning with Last.fm’s data team and conducting research for my bachelor’s dissertation. My interest lay in working with Big Data: learning more about data analysis, processing massive amounts of data and solving problems that arise from working with data sets that are too large to be stored on one computer. And what better way to do this than to leave the cosy academic world of my university, move abroad and write my final assignment on a new field in a foreign city, right?

Let me give you a personal account of the work involved for my research on Efficient Record Linkage with MapReduce in Very Large Data Sets.

Y U NO START WITH EXAMPLE?

Say for some reason you are given two sheets of paper with customer data. The first one contains the customers’ names and addresses while the second one lists their names and the results of a recent survey on their favourite colour. Your task is to connect all customers with their favourite colour.

You start with the first line on the one sheet, look at the name and try to find what this customer answered as their favourite colour on the other sheet. You continue doing this until you reach the end of the list.

I came for the view and the free biscuits.

While doing this customer record matching you notice a few things. Some customers have listed their favourite colour more than once; sometimes it is the same colour but sometimes it is not. The address data is also far from perfect; there are customers with similar names that all live at the same address. Conversely, one customer with a very unusual name seems to own several houses in the same street. These problem cases demand that you decide whether the same person is meant or not. If so, you pick one address and colour and write down each connection just once.

Now imagine this matching thing becomes a regular task that you have to complete at the end of each week. Oh, and the size of your customer data has also magically increased; it does not fit on two sheets of paper anymore but suddenly takes up billions of them. In this case it is time to invite your friends over to help you with your matching task.

You will have to agree on how you handle duplicate customer records and also distribute the work equally. Then, how do you even determine all the matches in these vast amounts of paper? I mean, hanging out with friends is excellent but going through many pages looking for all occurrences of just one customer quickly becomes dull, as most comparisons will be made unnecessarily. You will have to think of a way to minimise work.

“Unnecessary” quotes

This example illustrates some of the issues we face daily at Last.fm. We often have to integrate data that we received from our partners into our own music catalogue.

For example, in order to provide those sweet links to Spotify, Hype Machine, Amazon and iTunes on track pages we have to find corresponding entries that relate to similar artists, tracks, or albums in two or more data sets. Generally, this task is known as record linkage, which is a very active research field.

Links, links, links.

The specific question posed for my dissertation was: “What approaches are there that we can use to improve our data matching tasks at Last.fm?” My findings and conclusions will be used in the future to do this.

I first compiled a list of promising and interesting techniques and evaluated them on a small scale. These included approaches for pre-grouping entries that share a certain similarity in an efficient manner to later minimise the number of comparisons that need to be made (for example, an inverted index and a spatial index) and several metrics that can state how similar two entries are (for instance, metrics introduced by Levenshtein and Dice, but also approaches that first map strings to vectors and then measure the enclosed angle). This allowed me to come up with three combinations of techniques that performed best for our kind of data.
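To make these building blocks a little more concrete, here is a small sketch of two of the metrics and of pre-grouping with an inverted index. Using character bigrams as index keys, and the example names, are assumptions for illustration; this is not the code from the dissertation, and the vector-angle approach is left out.

```python
from collections import defaultdict

def levenshtein(a, b):
    """Edit distance: minimum number of single-character insertions,
    deletions and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]

def bigrams(text):
    """Set of overlapping two-character substrings."""
    return {text[i:i + 2] for i in range(len(text) - 1)}

def dice(a, b):
    """Dice coefficient on character bigram sets, between 0 and 1."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))

def candidate_pairs(left, right):
    """Inverted index from bigram to right-hand entries; only entries
    sharing at least one bigram with a left-hand entry become candidates."""
    index = defaultdict(set)
    for entry in right:
        for gram in bigrams(entry.lower()):
            index[gram].add(entry)
    pairs = set()
    for entry in left:
        for gram in bigrams(entry.lower()):
            for other in index.get(gram, ()):
                pairs.add((entry, other))
    return pairs

ours = ["Radiohead", "Coldplay"]
theirs = ["radiohead", "Cold Play", "Adele"]
pairs = candidate_pairs(ours, theirs)
```

In this toy data the index cuts the six possible comparisons down to three. One of the survivors, Radiohead against Adele, shares only the bigram "ad", so a similarity metric is still needed to reject it afterwards.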

Still, the problem of scale remained, as working with large files and data sets introduces another layer of complexity to the initial problem of matching data. In recent years, MapReduce has become the number one choice for working with Big Data. One reason for its success is that MapReduce makes it very convenient to distribute data processing over a number of computers. Instead of having one computer doing all computations one after the other, many computers can work on small tasks at the same time, and the combined efforts generate a final result. The most commonly used implementation of MapReduce is Hadoop.
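The programming model itself is easy to imitate on one machine. The sketch below is not Hadoop code, just a single-process imitation of the map, shuffle and reduce steps applied to the matching problem; the blocking key (the first three letters, lower-cased, with spaces removed) is an assumption chosen to keep the example short.

```python
from collections import defaultdict
from itertools import combinations

def map_phase(record):
    """Emit (key, record) pairs. The key is a crude, hypothetical blocking key."""
    yield record.lower().replace(" ", "")[:3], record

def reduce_phase(key, records):
    """Compare only records that ended up under the same key."""
    for a, b in combinations(sorted(records), 2):
        yield a, b

def run(records):
    groups = defaultdict(list)
    for record in records:               # map
        for key, value in map_phase(record):
            groups[key].append(value)    # shuffle: group values by key
    results = []
    for key, values in groups.items():   # reduce
        results.extend(reduce_phase(key, values))
    return results

matches = run(["Coldplay", "Cold Play", "Radiohead", "radiohead", "Adele"])
```

On a real cluster the map and reduce calls run on many machines at once, and the grouping step in the middle is where records are sorted and moved between them.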

Valient Thorr!!!

MapReduce removes a lot of work for the programmer (for example, writing code that distributes work, collects results and reacts to failures); however, it also demands that a problem be expressible within the constraints that MapReduce introduces. These make it necessary to investigate whether MapReduce really is the right tool for a given task. For example, techniques that worked well with small amounts of data might suddenly not perform as before once the data grows beyond a certain size.

For the adaptations of the previously identified combinations to MapReduce, I switched to the Cascading framework. As mentioned, when you develop a program for MapReduce you will have to “think” in its programming model, which can sometimes be a painful and slow process. Cascading, however, abstracts the underlying MapReduce model using workflows and allows one to write very complex distributed programs in shorter time. We have been using Cascading extensively in the data team and we love it.

In brief, my findings were that you shouldn’t rely on MapReduce alone for data matching, as the record linkage process is difficult to map in whole to the MapReduce model. For example, the biggest performance bottleneck was the sorting and distributing of entries prior to making the comparisons — the step that is supposed to speed up the matching by pre-grouping entries with a certain similarity. I concluded that it is better to introduce another system for storing intermediary results (for instance, a distributed key-value store like memcached) or to evaluate other approaches that I didn’t have time to cover in depth.

Gain experience points, spend them all on coffee.

Last.fm is a dedicated bunch of people and it was great to learn from them how to tackle a problem properly. This environment quickly drew me in and motivated me. I remember sitting at my desk during my first afternoon and thinking: “This is exactly what I have been looking for.” I was trusted with steering my research in the right direction. I was in total control of my decisions, and could freely experiment and make mistakes (on one occasion one of my experiments on the Hadoop cluster went berserk and managed to hog terabytes of storage space), and everyone on my team supported me as much as possible; they were always approachable, no matter how busy they were.

Not sure if everyone shaking or drank too much coffee.

I enjoyed every day although it was, of course, much more work than I had expected; I was fine-tuning my dissertation and polishing my paragraphs right up until delivering it to the printers. Then, on Friday about one month ago I finally submitted it to my university.

What else do I take with me from six exciting months in London? I went to lots of great gigs and consumed vast quantities of excellent coffee. I was introduced to many people and ideas, had countless interesting conversations and got a good introduction to the local tech scene.

I must also have made a valuable contribution to my team. That is why my stay in the data team has been extended for a couple of months (“at least”, to quote a colleague). After all, there’s just so much more to learn.

Anatomy of the UK Charts. Part 5 — King of Gear Shifts

Friday, 26 August 2011
by Matthias Mauch
filed under Trends and Data
Comments: 12

In a recent interview with the Guardian the young British pianist Benjamin Grosvenor voiced his frustration about his brother’s taste in music: “Those modulations at the end of the songs! They've sung it all already, and then to create a greater emotional effect, they put it up a tone.”

By “they” he’s referring to his brother’s favourite band, Westlife, and the modulations he describes are an old trick you can find in many a songwriter’s toolbox: the gear shift!

Hall of Shame

So is Grosvenor an intellectual snob, an arrogant piano kid? Well, he seems to be quite a nice guy, and he certainly isn’t alone in his disdain of gear shifts. There’s even a website which features a Hall of Shame of supposedly abhorrent examples of this phenomenon, eight of which are by Westlife. The book author Wayne Chase, too, dedicates a section of his songwriting manual How Music Really Works to what he calls “Shift Modulation”, and the section’s heading warns the eager reader in large letters: “Don’t Do This!”.

So let’s have a look at the symptoms. The video below demonstrates the 1-semitone gear shift in Westlife's song “I'm Already There”; see if you notice when it happens.

The chroma visualisation at the bottom of the video shows you which notes (from A to G#) are present in the piece at a certain time. You can easily spot the point where all notes move one semitone up. Can you also hear it? The video also suggests that there may in fact be a few good reasons why some musicians should find gear shifts hard to bear.

Firstly, gear shifts are easy to compose. If you have a song in the key of Eb major (as in the above example), then all you need to do is play/sing everything one semitone higher from a certain point; in this video, the song shifts from Eb major to E major, and that’s pretty much it. This makes gear shifts a relatively superficial means of creating complexity in a song, much easier to accomplish than, say, a whole new part of a song, or indeed a more complicated key shift. The reasoning is then: if you need such a simple means of making the song more interesting, it can’t have been interesting in the first place.

Secondly, gear shifts actually sound very cheesy. They have a predictably uplifting feel, so they tend to be used in sentimental songs such as “I’m Already There”, or crowd-pleasers like Bon Jovi’s “Living On A Prayer” (check the gear change at 3:24 in this video).

And there’s a third reason, which we will find out about soon with the help of our music processing methods.

Gear Shift Police

Since the last blog post, we’ve filled a few holes in our collection of UK charts recordings, and I have looked a bit more into harmonic descriptors of audio. One of the outcomes is a gear shift detector.

Like our measure of harmonic complexity in an earlier blog post, the gear shift detector is based on the chroma feature that you saw in the video above. By matching the chroma feature of a song section to profiles of musical keys it is possible to estimate which key fits best. The gear shift detector makes use of this technique: it goes through all song positions and matches two kinds of key profile pairs to the data: those that model a gear shift (the key after the song position is one or two semitones higher) and those that don’t (the key stays the same). If the best gear shift model fits better than the respective model without a key change, we have good evidence for a gear shift.

In addition, to filter out tracks which fit both models badly, we use a feature which is sensitive to any large scale modulation, but will remain low if there is no such large scale modulation. We ran the gear shift detector on the whole charts database, and found a strong trend that would please our gear shift haters!
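To give a feel for the model comparison, here is a toy version in plain Python. It is a sketch under several assumptions: the key profiles are simple binary major-scale templates, the fit is a sum of dot products, the test song is synthetic, and the extra modulation feature used for filtering is left out. None of these choices is claimed to match the real detector.

```python
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)

def key_profile(key):
    """Binary 12-bin template for a major key; bin 0 is A, bin 11 is G#."""
    profile = [0.0] * 12
    for step in MAJOR_SCALE:
        profile[(key + step) % 12] = 1.0
    return profile

PROFILES = [key_profile(k) for k in range(12)]

def fit(frames, key):
    """How well a run of chroma frames matches a key (sum of dot products)."""
    return sum(
        sum(c * p for c, p in zip(frame, PROFILES[key])) for frame in frames
    )

def detect_gear_shift(frames):
    """Return (position, evidence) for the best candidate gear shift.
    Evidence is the best shift-model fit minus the best same-key fit."""
    best = (None, float("-inf"))
    for t in range(1, len(frames)):
        before, after = frames[:t], frames[t:]
        stay = max(fit(before, k) + fit(after, k) for k in range(12))
        shift = max(
            fit(before, k) + fit(after, (k + s) % 12)
            for k in range(12) for s in (1, 2)
        )
        if shift - stay > best[1]:
            best = (t, shift - stay)
    return best

# Synthetic "song": eight frames in Eb major, then eight in E major.
E_FLAT, E = 6, 7
song = [key_profile(E_FLAT)] * 8 + [key_profile(E)] * 8
position, evidence = detect_gear_shift(song)
```

For this synthetic input the evidence peaks at frame 8, exactly where the key moves up a semitone, while a song that stays in Eb throughout yields negative evidence at every position.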

The proportion of songs with gear shifts is substantially declining over the history of the charts, from a staggering 15% in and around 1960 to consistently lower than 4% in the first decade of the current century.

The high frequency of gear-changing songs before 1970 is our best guess for the third reason musicians dislike them: there are just too many of them. Perhaps gear shifts were overused? For the moment it’s mere speculation to attribute the decline of the ratio of gearshifting songs itself to their high frequency in the early days of the charts, but it is quite easy to imagine that they just ceased to be special (if they ever were).

Merry Gear Shift Everyone

I wonder at which point the gear shift turned from a relative novelty to an established songwriting tool, rendering anyone who uses it less ‘cool’? Even The Rolling Stones, definitely one of the coolest bands of their time, could get away with 'shifty' songs, as can be heard in this excerpt:

Rolling Stones - Come On

However, later on, gear shifts seem to have become irreconcilable with artists who consider themselves to be cool. For example, our detector does not find a single U2 hit with a gear shift. And it’s conceivable that consumers, too, started to consider themselves cool and to shun gear shifts. However, there is a time of year when songwriters seem to catch music buyers off guard: Christmas, as the figure below impressively illustrates.

The graph shows the percentage of gear-shift songs in the months they hit their highest position. It is substantially higher in December than in all other months, and more than twice as high as in September. It doesn’t come as a surprise, then, that tracks featuring the word “Christmas” in their title reach a gear shift ratio of 31%.

I’m Going To Make A Change For Once In My Song

I personally think we shouldn’t be so harsh as to condemn all gear shifts in the charts (though if you’re interested in doing so here’s the list) — there are some true gems.

Some of you might have recognised the section heading as a (slightly punned-up) line from the song “Man In The Mirror”, famously performed by Michael Jackson. While Jackson was never in danger of out-gearshifting Westlife, he certainly came up with some juicy specimens. “Man In The Mirror” does in fact contain one, nicely placed on the lyric “change!” (see video, gear change at 2:52), but really that’s just a warm-up exercise. Other examples include “Rock With You” (video, 2:31), “Earth Song” (video, 3:46), and the more recent “Cry” (video, 3:11).

During the last decades of his career there was also a tendency to increase the number of gear shifts per song. The songs “You Are Not Alone” (video, 3:31, 4:10) and “Heal The World” (video, 4:33, 4:58) include two each, and “Will You Be There” takes the prize with three (video, 2:06, 2:30, 2:53), making the King of Pop the true King of Gear Shifts. The figure below shows the chroma representation of the first gear shift in “Will You Be There”; click to see a longer excerpt.

And why not? To be sure, a song has to offer a lot of other goodness to justify gear shifts, but maybe I could even convince Benjamin Grosvenor that without them pop music would be poorer. I quite like the effect it produces, and as long as you don’t overdose...

But tell us what you think! Would you like to be able to exclude gear shift songs from your Last.fm radio? Or even seek them out?

Anatomy of the UK Charts series so far

Percussiveness and the Disco Diva - on the rise of disco in the mid 70s

Clash of Attitudes - on automatically telling punk from art rock

The Curse of the Drum Machine - on how 120 bpm dominated the 80s

Survival of the Flattest - on the Loudness War and decline of dynamic range

King of Gear Shifts - this post

Further info

If you want to visualise chroma for your songs, check out the free Sonic Visualiser, and get the free NNLS Chroma Vamp plugin.

Anatomy of the UK Charts. Part 4 - Survival of the Flattest

Friday, 15 July 2011
by Matthias Mauch
filed under Trends and Data
Comments: 15

This week Matthias has let loose his signal processing tools to track the history of loudness in the UK singles charts. He shows in detail how pop music has become louder and flatter, and explains why loud doesn't have to be noisy.

There's no doubt that music keeps changing. The music psychologist Carol Krumhansl recently conducted an experiment in which she played 400 millisecond snippets of pop music to individual participants. Surprisingly, from this tiny amount of information her participants could often predict the decade the music had been created in - even if they did not recognise the track itself.

We assume that changes in musical style are motivated by fashion or social factors, and not due to developments in the recording or post-production process. For example, the rise of punk music (as traced in my second blog post) appears to have been triggered mostly by social factors. Old-school disco bloomed and faded quickly, like a fashionable jeans cut. But when it comes to the 80s things are less clear: we suspect that the introduction of new technology, specifically drum machines, substantially changed dance music styles, and find some evidence for it (third blog post).

So, are music fashions caused by people wanting to listen to some music more than other music? They like it, buy it, and it gets into the charts? Well, probably, but only until record producers found out that they could make people listen to their music simply by making it louder, so that it stands out from the crowd. And that's exactly what they did, spurred by the advent of the CD in the 1980s. Just as trees evolve to grow higher so they can outgrow other trees in the race for sunlight, music grew louder and louder in the race for attention. At least that's what people have found examining many examples of popular music. There's an excellent Wikipedia article on this phenomenon, dubbed the Loudness War. Is the phenomenon really as widespread as we think it is? Can we really find increased loudness in the charts, and can we track down loudness's evil brother, dynamic range compression?

The Higher Level

An audio engineer's measure of loudness is decibels relative to full scale (dBFS), and it's really just a measure of audio level. Essentially, dBFS is the logarithm of the energy in an audio waveform, minus the value of the loudest sine wave that you can fit on the recording medium (the “full scale”). So a full scale sine wave will have 0 dBFS, which is very loud, and most other sounds will have negative values.
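
As a rough illustration of the definition above, here is a minimal Python sketch. It assumes samples are floats scaled to the range -1.0 to 1.0 and uses the RMS level of a full-scale sine wave as the 0 dBFS reference, as described in the text. The function names are made up for this example and are not the code used for the analysis.

```python
import math

# RMS level of a sine wave that just fits the medium (amplitude 1.0).
FULL_SCALE_SINE_RMS = 1.0 / math.sqrt(2.0)

def rms(samples):
    """Root-mean-square amplitude of samples scaled to [-1.0, 1.0]."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def dbfs(samples):
    """Level in dB relative to a full-scale sine wave (which gives 0 dBFS)."""
    level = rms(samples)
    if level == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(level / FULL_SCALE_SINE_RMS)

def sine(amplitude, n=48000, cycles=100):
    """Test tone: `cycles` whole periods of a sine spread over `n` samples."""
    return [amplitude * math.sin(2.0 * math.pi * cycles * i / n) for i in range(n)]

full = dbfs(sine(1.0))  # full-scale sine: about 0 dBFS
half = dbfs(sine(0.5))  # half the amplitude: about 6 dB lower
```

Halving the amplitude costs about 6 dB, which is why most real-world material ends up with clearly negative dBFS values.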

If a music track is well-engineered, then the loudest samples of the waveform are already nearly at maximum level, so you can't make it much louder without changing the shape of the waveform, i.e. without making it sound different. If you still want the track to sound louder, you have to shape the waveform so that more parts of the signal approach maximum level. The route many mastering engineers have therefore taken is to squeeze the waveform peaks down and use the resulting room to blow the whole signal up again - this is dynamic range compression, and it means that the signal gets louder on average.
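
The squeeze-then-blow-up idea can be sketched in a few lines of Python. This is a deliberately crude toy that works on individual samples with an assumed threshold and ratio. Real mastering compressors follow the signal envelope with attack and release times, so treat the names and numbers here as illustrative assumptions only.

```python
import math

def compress(samples, threshold=0.5, ratio=4.0):
    """Squeeze the part of each sample's magnitude above `threshold` by `ratio`."""
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(math.copysign(mag, s))
    return out

def normalise_peak(samples, target_peak):
    """Make-up gain: scale everything so the loudest sample hits `target_peak`."""
    peak = max(abs(s) for s in samples)
    return [s * target_peak / peak for s in samples]

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# Toy signal: a quiet tone followed by a short loud burst (think "drum hit").
quiet = [0.2 * math.sin(2 * math.pi * 5 * i / 1000) for i in range(1000)]
burst = [0.9 * math.sin(2 * math.pi * 5 * i / 100) for i in range(100)]
signal = quiet + burst

original_peak = max(abs(s) for s in signal)
squeezed = normalise_peak(compress(signal), original_peak)
```

The peak level of `squeezed` is the same as that of `signal`, but its RMS level is higher: the quiet part has been pulled up towards the burst, which is exactly the loss of contrast discussed later in the post.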

Below we have plotted the average dBFS value for tracks from the UK singles charts from 1964 to 2009, as a cloud of very light grey dots (the highest 5% and lowest 5% are hidden for better visualisation of the rest). In order not to get fooled by more or less normalised tracks, we subtracted every track's maximum dBFS value. The red curve we plotted on top is a local regression curve, complete with the dashed simultaneous confidence band (at 99% confidence, see Further Reading, below). The tight confidence band shows us that the real underlying curve is unlikely to be far from the one we estimated.

The average dBFS value for a typical 80s charts song was around -18 dBFS, but things have changed since then. Our data shows that the loudness war definitely happened, and it started shortly after the introduction of the CD in 1982. We marked Dire Straits' Brothers in Arms (1985), the first CD album that sold a million copies, in the plot. From there, it just goes up and up, to an average level of about -15 dBFS in 2009, or 3 dB higher than in the mid-80s. You can see that in the 70s, too, the dBFS values were relatively high. We believe that recordings of that time simply never had much dynamic range, due to limits of recording technology. Not everyone will agree with using dBFS as a measure of loudness, though, because it does not take into account the way humans perceive loudness at different frequencies. Do we have a measure for that, too?

It's Getting Louder all the Time

There are indeed computer algorithms that imitate how humans perceive loudness (see Further Reading). We applied such a measure of loudness to our singles charts tracks. Again, in order to make this measure independent from the maximum value used in the track we corrected for maximum dBFS value (this time by using linear prediction and plotting only the residuals). Measuring loudness this way shows a slightly different curve, an almost uninterrupted rise of loudness from the seventies through to the first decade of the 21st century (find the figure here). A more intuitive way of thinking about it is this: we take a loudness value that's very high at the beginning of our year range (in 1964), the 90% quantile. That means that only 10% of the charts in 1964 are louder than this value. We've then plotted the percentage of the tracks that were louder than the 1964 value for every year of the charts in the figure below.
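
The quantile comparison described above might look like this in Python. The loudness numbers are invented purely to show the mechanics (they are not chart data), and the choice of the "inclusive" quantile method is an assumption.

```python
import statistics

def quantile_90(values):
    """90% quantile: roughly 10% of the values lie above it."""
    return statistics.quantiles(values, n=10, method="inclusive")[-1]

def share_above(values, threshold):
    """Fraction of values strictly louder than the threshold."""
    return sum(v > threshold for v in values) / len(values)

# Hypothetical per-track loudness values, keyed by chart year.
loudness_by_year = {
    1964: [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    2009: [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
}

# Fix the threshold once, using the first year, then reuse it for every year.
threshold = quantile_90(loudness_by_year[1964])
shares = {year: share_above(values, threshold)
          for year, values in loudness_by_year.items()}
```

With these made-up numbers the share of "loud" tracks is 10% in the reference year by construction and 60% in the later year.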

The percentage of loud tracks has increased from 10% in 1964 (by definition) to over 40% in recent years. So music has got louder. Well, isn't that in the spirit of Rock'n'Roll? Sadly, it isn't, because the increase in loudness has led to worse sound quality. Granted, it's louder, but boy is it flat!

The Death of Dynamic Range

If you fight the loudness war, whoever your competitors are, your victim will be dynamic range - an important part of sound quality. Here's why: I've already described to you the process of making a track sound louder by compressing the peaks, then blowing it up again. And the problem is just that: the peaks will sound compressed relative to the average recording. Drums have less punch, song sections intended to sound really loud will be at the same level as softer sections. This is all described very well in the Wikipedia article on the Loudness War. So we wanted to look at the development of dynamic range in the charts. We have actually measured the dynamic range of the music by a measure called crest. On every one-second block of a song, the crest is the difference in dB between the maximum value and the mean dBFS value. As in the figure above, we have plotted all tracks as a cloud of grey points, with the local regression and 99% confidence bands overlaid.
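
For readers who want to try this at home, here is a small Python sketch of a crest measurement. It assumes that "maximum value" means the peak sample level and "mean" means the RMS level of the block, both on the same reference, and that a track is summarised by averaging over its one-second blocks. Those choices and the function names are assumptions, not a description of the libxtract implementation.

```python
import math

def block_crest_db(block):
    """Peak-to-RMS ratio of one block in dB, or None for a silent block."""
    peak = max(abs(s) for s in block)
    rms = math.sqrt(sum(s * s for s in block) / len(block))
    if rms == 0.0:
        return None
    return 20.0 * math.log10(peak / rms)

def track_crest_db(samples, sample_rate):
    """Average crest over consecutive, non-overlapping one-second blocks."""
    crests = []
    for start in range(0, len(samples) - sample_rate + 1, sample_rate):
        crest = block_crest_db(samples[start:start + sample_rate])
        if crest is not None:
            crests.append(crest)
    return sum(crests) / len(crests) if crests else None

rate = 8000
sine = [math.sin(2 * math.pi * 440 * i / rate) for i in range(2 * rate)]
square = [1.0 if s >= 0 else -1.0 for s in sine]  # the "flattest" possible wave

sine_crest = track_crest_db(sine, rate)      # about 3 dB
square_crest = track_crest_db(square, rate)  # 0 dB: peak equals average
```

A square wave is the extreme case of a flattened signal: every sample sits at the peak, so there is no room at all between peak and average level.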

... and what we found exceeded our worst expectations. The charts tracks lost around 2 dB of dynamic range in the 20 years from 1985 to 2005 despite ever-improving technology; in amplitude terms, that's roughly 20% of dynamic range lost. The picture is even more depressing when you look at individual artists. Madonna's tracks from the 80s have crest ranges of around 13.5 dB, with many tracks exceeding the average. But she goes with the loudness trend and gradually kills off 3 dB of dynamic range, with her later recordings scoring around 10.5 dB. It's hard to blame Madonna, though: she's not alone! U2 seem to have resisted the trend for a long time, but then they fully embraced it, with their latest hits among the least dynamic of all. Oasis never had much dynamic range to begin with, so they've just kept on doing their thing, one might argue. Take That disbanded in 1996, then re-formed in 2005, but it looks as if they'd been going on all along: they're almost exactly on the loudness trend. Quite refreshingly, Robbie Williams and Moby do not seem to have followed the trend: while not topping the dynamic range chart, their tracks have actually grown more dynamic over time. And Beyoncé must be the queen of dynamics: most of her songs have well above the average dynamic range.

One of the arguments designed to convince artists not to make their music as loud as possible (besides the quality argument) is that modern software music players (including Last.fm's) adjust the volume so that loudness differences are less of an issue than they used to be. And some artists seem to have learned their lesson. The Red Hot Chili Peppers, for example, were criticised for their incredibly loud, and hence poor-in-dynamics, album Californication. Consider the very low dots with only approximately 9.5 dB dynamic range around the year 2000 in the figure above (with Red Hot Chili Peppers selected), way below even the average pop dynamic range. It's nice to see, however, that they steered back and their later recordings are more dynamic again.

compressed vs. dynamic
 Red Hot Chili Peppers - Give It Away, 1991, very dynamic (15.0 dB crest)
 Red Hot Chili Peppers - Californication, 2000, very compressed (9.2 dB crest)
 Red Hot Chili Peppers - Dani California, 2006, quite dynamic again (11.2 dB crest)

“No, no, I mean, are they LOUD?”

You might have noticed that artists whose tracks have a low dynamic range are not necessarily renowned for being loud. Is Madonna louder these days than Beyoncé? The “loud” that we think of when talking about bands is concerned with the volumes involved while making the music: however loud you mix a Take That track, Megadeth will sound louder. Cues for a band being really loud are distortion (distorted guitars in particular) and prominent cymbal sounds. The spectra of such sounds have the characteristic that spectral peaks are not very prominent, and that there's an emphasis on high-frequency content. As it happens, we can measure these, too. Our inharmonicity metric measures the prominence of broadband noise relative to peaks in the spectrum, and high-frequency content is detected by a low first MFCC. Normalising these metrics and taking their geometric mean gives us a measure of noisiness. The picture below was compiled by positioning artist names by noisiness versus loudness. The font size and opacity of the names correspond to the number of singles in the UK charts.
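
Combining the two cues could be sketched as follows. The post does not say which normalisation was used, so this example assumes simple min-max scaling to the range 0 to 1, and flips the MFCC value so that a low coefficient gives a high score. The input numbers and function names are invented for illustration.

```python
import math

def min_max(values):
    """Scale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def noisiness(inharmonicity, first_mfcc):
    """Geometric mean of two normalised cues, one score per track."""
    noise_score = min_max(inharmonicity)
    # A low first MFCC signals high-frequency content, so invert the scale.
    high_freq_score = [1.0 - v for v in min_max(first_mfcc)]
    return [math.sqrt(a * b) for a, b in zip(noise_score, high_freq_score)]

# Hypothetical feature values for three tracks.
scores = noisiness(inharmonicity=[0.1, 0.3, 0.5], first_mfcc=[4.0, 2.0, 0.0])
```

A geometric mean only gives a high score when both cues are high, so a track needs broadband noise and plenty of treble to count as noisy in this sketch.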

So maybe the combination of loudness and noisiness gives you a better indication of the kind of “loud” that you like. Does it?

While today's post has been quite technical, next time we're going to look at the lighter side of things, with a special on songwriting.

Anatomy of the UK Charts series so far

Percussiveness and the Disco Diva - on the rise of disco in the mid 70s

Clash of Attitudes - on automatically telling punk from art rock

The Curse of the Drum Machine - on how 120 bpm dominated the 80s

Survival of the Flattest - this post

Further reading

The Wikipedia article on the loudness war.

The Echo Nest's Paul Lamere has also written a nice article showing evidence for the loudness war with many great examples.

Carol Krumhansl's article on the musical memory and the 400ms pop snippets: Plink: "Thin Slices" of Music, Carol L. Krumhansl, Music Perception: An Interdisciplinary Journal, Vol. 27, No. 5 (June 2010), pp. 337-354.

If you want to calculate loudness, inharmonicity and crest from a music file, try the libxtract Vamp plugin.

Local regression curves and confidence bands can be calculated using the locfit library, most easily using the interface to the R programming language.

Edit: The technical arm of the European Broadcasting Union EBU has a website with recommendations on the measurement and normalisation of loudness.

Anatomy of the UK Charts. Part 3 - The Curse of the Drum Machine

Friday, 1 July 2011
by Matthias Mauch
filed under Trends and Data
Comments: 13

This is the third part of the series, in which our Research Fellow Matthias analyses thousands of recordings from the UK singles charts using audio signal processing techniques. You can read Part 1 here and Part 2 here.

How does music change over time? Does it get faster, more complex, more diverse? In the course of the last few weeks we have learned that many aspects of music don't seem to follow that kind of simple trend. Rather, changes happen in less predictable ways. One of the most surprising examples of that emerged when we looked at the development of rhythm regularity in the UK charts...

The 80s Regularity Hump

The figure below visualises data that we've already used in the first part of this series to look for disco songs: rhythmic steadiness, or "rhythm regularity", as we will call it here. To measure rhythm regularity we have written some audio analysis code that outputs a high value if the rhythm in an MP3 file often changes between consecutive 16-second sections, and a low value if consecutive sections don't change much. Our regularity extractor is based on a mathematical description of the rhythm in every second of a track known as Fluctuation Patterns (see Further Reading, below). When we plotted rhythm regularity over the years, to our surprise we didn't find a simple trend in either direction...

Instead, we found a regularity hump, right at the beginning of the 80s: the dark blue area shows the proportion of the charts in every year that is taken up by the most rhythmically regular tracks. Around the start of the 80s, the proportion of the most regular tracks increases noticeably, while the proportion of the most irregular ones diminishes. What could have fuelled this trend? One obvious suspect is the introduction of new technology. We have marked the release date of the first widely used drum machine, the Roland CR-78 in '78, and the first influential digital sampler, the Linn LM-1 in 1980 in the figure. We can't prove the connection but it's certainly a striking coincidence. Why does the proportion of highly regular tracks wane again in the mid 90s? Maybe people were fed up with very rhythmically regular music, or maybe drum machines simply got better at producing more diverse rhythms.

In either case, if this hump is really related to drum machines and samplers, we'd expect to see the trend in other kinds of data as well.

The 120 bpm Tempo Crunch

As the 80s seems to have been an unusual time for rhythm, it would be no surprise if we also saw striking things in the tempo of 80s music. We ran our tempo tracking software on the whole charts collection to measure the tempo of each section of each track in beats per minute (bpm). We then picked a single bpm value for each song by choosing the one attached to the most seconds of audio. At a first glance a plot of the average tempo per year didn't look very interesting (you can see it here). But then we noticed something unexpected. Have a look at the image below.
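
Picking one tempo per song, as described above, could be sketched like this. The input format (a list of tempo and duration pairs per track) and the rounding to whole bpm are assumptions made for the example, not details of our tempo tracker.

```python
from collections import defaultdict

def dominant_tempo(sections):
    """Return the bpm value that covers the most seconds of audio.

    `sections` is a list of (bpm, duration_in_seconds) pairs.
    """
    seconds_by_bpm = defaultdict(float)
    for bpm, duration in sections:
        # Assumption: tempos within the same whole bpm count as one tempo.
        seconds_by_bpm[round(bpm)] += duration
    return max(seconds_by_bpm, key=seconds_by_bpm.get)

# Hypothetical track: mostly around 120 bpm, with half-time passages at 60 bpm.
sections = [(120.2, 45.0), (60.0, 30.0), (119.8, 80.0), (60.0, 20.0)]
tempo = dominant_tempo(sections)
```

Here 125 seconds are attached to 120 bpm and only 50 seconds to 60 bpm, so the track is filed under 120 bpm.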

The heat map above shows you which tempos (in bpm) were unexpectedly popular in which years. For example, see the dark orange dot around tempo 100 in 1969? That means that tempos around 100 bpm were more popular than usual that year (up by more than 2 percentage points). Admittedly, what's more striking is that there seems to be a downward pattern, as if some kind of music was getting slower and slower over the years. There's also a strange cluster of tracks around 120 bpm from the early 80s to the early 90s - the big red blob in the middle.

Some tracks from the red blob:
 Prince - 1999 (120 bpm)
 Frankie Goes To Hollywood - Relax (117 bpm)
 808 State - In Yer Face (120 bpm)

Before we ask what this cluster means, let's make sure that this trend is really there, i.e. that the proportion of tracks around 120 bpm is really higher from 1982 to 1991 than in the rest of the time. It may seem obvious from the figure below, which plots the proportion of tracks between 116 and 124 bpm over time. To be really confident that this is not due to chance we use a statistical test, a "2-sample test", which can check whether the difference is significant... and it is: with 95% confidence we can say that the difference is between 5.5 and 8.1 percentage points (details).
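
One common way to get such an interval is a normal-approximation confidence interval for the difference between two proportions. The sketch below shows that approach in Python. It is an assumed stand-in rather than the exact test we ran, it applies no continuity correction, and the track counts are invented, so it will not reproduce the interval quoted above.

```python
import math
from statistics import NormalDist

def proportion_diff_ci(hits_a, n_a, hits_b, n_b, confidence=0.95):
    """Confidence interval for p_a - p_b using the normal approximation."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    std_err = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    diff = p_a - p_b
    return diff - z * std_err, diff + z * std_err

# Hypothetical counts: tracks at 116-124 bpm inside and outside 1982-1991.
low, high = proportion_diff_ci(hits_a=300, n_a=2000, hits_b=640, n_b=8000)
```

If the whole interval lies above zero, as it does for these made-up counts, the difference is unlikely to be a fluke at the chosen confidence level.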

Our first thought was that songwriters in the 80s must have turned on their drum machines, loved what they heard and wrote a song to that beat - without changing the default tempo setting of 120 bpm. I would love this to be correct, but I have a hunch that it's not, especially after having found this highly interesting manual for writing a hit single written by The KLF in 1988. They say that "the different styles in modern club records are usually clustered around certain BPM’s: 120 is the classic BPM for House music and its various variants, although it is beginning to creep up", and also, "no song with a BPM over 135 will ever have a chance of getting to Number One" because "the vast majority of regular club goers will not be able to dance to it and still look cool".

In this track the KLF follow their own advice.
 The KLF - 3 a.m. eternal (120 bpm)

It seems that the KLF have a point. We wanted to know, and compared the tempo estimates to our Last.fm tags, especially the "Dance" tag. The figure below suggests that the KLF were really right. We took all tracks within ranges of 10 bpm (75-85, 85-95, ... , 145-155) and plotted the proportion of them that were tagged "Dance". It turns out that the KLF had quite a good grasp of what was true from 1982 to 1991. The proportion of tracks tagged as Dance is clearly highest around 120 bpm...
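
The binning behind that figure can be sketched as follows. The track list is invented, and the case-insensitive tag match and the exact bin edges (75 up to 155 in steps of 10) are assumptions based on the description above.

```python
from collections import Counter

def dance_share_by_bin(tracks, lowest=75, width=10, bins=8):
    """Proportion of tracks tagged "dance" in each tempo bin.

    `tracks` is a list of (bpm, tags) pairs. Bins start at `lowest` and are
    `width` bpm wide; tracks outside the covered range are ignored.
    """
    totals, dance = Counter(), Counter()
    for bpm, tags in tracks:
        index = int((bpm - lowest) // width)
        if not 0 <= index < bins:
            continue
        start = lowest + index * width
        totals[start] += 1
        if any(tag.lower() == "dance" for tag in tags):
            dance[start] += 1
    return {start: dance[start] / totals[start] for start in sorted(totals)}

# Hypothetical tracks: (tempo in bpm, set of tags).
tracks = [
    (120, {"dance", "80s"}),
    (118, {"pop"}),
    (122, {"Dance"}),
    (80, {"rock"}),
    (160, {"dance"}),  # outside 75-155, so not counted
]
shares = dance_share_by_bin(tracks)
```

Each key in `shares` is the lower edge of a bin, so `shares[115]` is the proportion of Dance-tagged tracks between 115 and 125 bpm.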

So was the drum machine a curse? It's been a blessing for many dancers, and it has undeniably led to the emergence of whole new genres of music, often dance music. And dancers have taken it from the club to the charts. Incidentally, somewhere along the way they must have learned to look cool even at higher bpm rates... what we see for the years 1992 to 2001 suggests a rather radical change to quicker dance music quite different from the KLF's suggestion that the tempo is "creeping up". Check the figure below.

Dance music in the 90s has clearly moved on, even the songs that did get into the charts, and it's got faster. However, there seems to be another current of music that counter-balances this, as our tempo heatmap further up suggests (though it doesn't prove it).

As a reward for trying to understand all our data visualisations today we leave you with two of our new multi-tag radio stations.

Last.fm 80s + Dance Radio

Last.fm 90s + Dance Radio

Next week, bring some earplugs, as we will try to find out where all the racket comes from in The White Noise Boys [Edit: title changed to Survival of the Flattest].

Anatomy of the UK Charts series so far

Percussiveness and the Disco Diva - on the rise of disco in the mid 70s

Clash of Attitudes - on automatically telling punk from art rock

The Curse of the Drum Machine - this post

Further Reading

The original article on the Fluctuation Patterns feature is by Elias Pampalk, Simon Dixon, and Gerhard Widmer: On the evaluation of perceptual similarity measures for music. In Proceedings of the Sixth International Conference on Digital Audio Effects (DAFx-03), pages 7–12, 2003.

If you want to detect the tempo of a track automatically, try Matthew Davies's tempo tracker in the Queen Mary Vamp plugin library or Simon Dixon's BeatRoot.