Do you have Robot Ears?
That’s the question we asked a few weeks ago. As we explained at the time, we’ve been training an army of music-listening robots (or “audio analysis algorithms” if you want to get technical!) to try to better understand the music you scrobble.
The idea is that by automatically analysing tracks we’ll be able to add helpful tags, improve recommendations, and provide novel ways to explore your collection and discover new music.
We asked for your help to evaluate our robots. We thought they were doing a pretty good job in most cases, but there was definite room for improvement, and like any good scientists we wanted some large-scale evidence (i.e. lots of feedback from real people) rather than just going on our own impressions. So we built the Robot Ears app, which asks humans to classify tracks and then compares their answers with what our “robots” said about the same tracks.
Now, six weeks later, we’ve gathered over 30,000 judgements on 600 or so music tracks and we’re ready to share some initial results.
*Spoiler alert*
The robots did pretty well – but we’re not satisfied yet!
We’re kicking off another round of experiments, to learn even more about a wider variety of music tracks. The more people we can get to take part the better, so whether you’ve tried it already or not, please visit Robot Ears – and help the robots to keep improving!
Want to know what we (and the robots) have learned so far? Read on for the details…
The results so far
We were aiming to answer two different questions with this experiment:
- Are the labels we’re trying to apply to tracks meaningful?
- Do our robots reliably apply the right labels to a track?
The first question is the more fundamental – if we’re using labels that don’t mean anything to humans, it doesn’t much matter what our robots say! To answer this question we looked at the average agreement between humans for each track. If humans reliably agree with each other we can conclude the label has a clear meaning, and it’s worth trying to get our robots to replicate those judgements.
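As a rough illustration of what “average agreement between humans” could mean, here is a minimal sketch. It is not the code behind Robot Ears: the agreement measure (the share of judgements that match the most popular answer for a track), the function names and the sample judgements are all assumptions made up for this example.

```python
from collections import Counter


def track_agreement(judgements):
    """Fraction of judgements that match the most common answer for one track.

    This is an assumed, simple agreement measure, not necessarily the one
    used in the experiment.
    """
    if not judgements:
        raise ValueError("need at least one judgement")
    counts = Counter(judgements)
    top_count = counts.most_common(1)[0][1]
    return top_count / len(judgements)


def mean_agreement(judgements_by_track):
    """Average the per-track agreement over all tracks for one feature."""
    scores = [track_agreement(j) for j in judgements_by_track.values()]
    return sum(scores) / len(scores)


# Hypothetical judgements for one feature; track names are invented.
speed_judgements = {
    "track_a": ["fast", "fast", "fast", "midtempo"],
    "track_b": ["slow", "midtempo", "fast", "slow"],
}

print(track_agreement(speed_judgements["track_a"]))  # 3 of 4 listeners agree
print(mean_agreement(speed_judgements))
```

A feature whose tracks mostly score close to 1.0 would be one where listeners share a clear idea of what the label means; a feature whose scores hover near chance would suggest the wording needs work.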
We were looking at 15 different audio “features”. Each feature describes a particular aspect of music, such as:
- Speed
- Rhythmic Regularity
- Noisiness
- “Punchiness”
etc.
The features have a number of categories; for example, “Speed” can be fast, slow or midtempo. Each time a human used the Robot Ears app, they were asked to sort tracks into the appropriate categories for a particular feature. Meanwhile our robots were asked to do the same. At the end of a turn, we showed you how your answers matched up with the robots’:
After we’d gathered about 16,000 human judgements we took a look at the results so far. There were a few interesting learning points about which features were doing better or worse. Based on this we adjusted some of the labels, threw some out completely, tweaked our robot algorithms and started a new experiment. Another 14,000 judgements later we reached the following results:
We can see that the levels of human agreement vary quite a lot across the features, with activity, percussiveness, smoothness and energy seeming to be the most reliable. By the end of the second round of experiments there were just a handful of features (rhythmic regularity, sound creativity and harmonic creativity) we still weren’t convinced by. We aren’t giving up on these, but it seems like we don’t quite have the right words to describe them yet!
Speaking of which – we had a side question in each test: “would you change any of these labels?”
We got some interesting suggestions. Some were helpful. Some… less so! For example:
- Noisiness: noisy → distorted
- Energy: soft → calm
- Energy: energetic → powerful, emotional, EXTREME HIGH
- Danceability: dance → strong beat, rhythmic
- Danceability: atmospheric → ambient, spacious
- Harmonic Creativity: little harmonic interest → boring
- Tempo: steady → great workout stuff
- Punchiness: punchy → wide dynamic range
- Sound Creativity: consistent → simple
- Sound Creativity: varied → upfront texture
- Smoothness: uneven → turbulent
One user also suggested renaming the Not Sure box “I’m an idiot”!
So what about the second question: “How did our robots do?” Well, again, there was quite a range of performance across the different types of feature:
As you can see there are a few features our robots are particularly good at, and a few where their ears definitely need to be cleaned out!
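One simple way to score a robot against the humans is sketched below. It is only an assumption about how such a comparison might work, not the actual evaluation: it treats the most popular human answer for each track as the reference and counts how often the robot’s label matches it. The function names and data are hypothetical.

```python
from collections import Counter


def majority_label(judgements):
    """Most common human answer for a track.

    Ties go to whichever label was seen first, which is how
    Counter.most_common orders equal counts.
    """
    return Counter(judgements).most_common(1)[0][0]


def robot_accuracy(human_judgements, robot_labels):
    """Share of tracks where the robot's label equals the human majority.

    Only tracks present in both inputs are scored.
    """
    tracks = [t for t in human_judgements if t in robot_labels]
    if not tracks:
        raise ValueError("no tracks in common")
    hits = sum(
        1 for t in tracks
        if robot_labels[t] == majority_label(human_judgements[t])
    )
    return hits / len(tracks)


# Hypothetical data for the "Speed" feature; track names are invented.
humans = {
    "track_a": ["fast", "fast", "midtempo"],
    "track_b": ["slow", "slow", "slow"],
    "track_c": ["midtempo", "fast", "midtempo"],
    "track_d": ["slow", "slow", "fast"],
}
robots = {
    "track_a": "fast",
    "track_b": "slow",
    "track_c": "fast",
    "track_d": "slow",
}

print(robot_accuracy(humans, robots))  # robot matches on 3 of 4 tracks
```

A score like this is only meaningful for features where the humans agree with each other in the first place, which is why the first question has to be answered before the second.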
What’s next?
Doing these first two experiments allowed us to refine our terminology and the way our robots classify tracks. Naturally, being built in London, our robots are currently very excited about the Olympics. In that spirit we’re going to award them a bronze medal for progress so far:
We’ve already started to work on some new functionality based on the more reliable features. Here’s a sneak preview of what Mark and Graham came up with at a recent internal hack day:
There’s a lot more work still to do though, and so we’re kicking off a third round of experiments. The key difference is we’ll be using a lot more music tracks this time, and hopefully getting a lot more user feedback.
Whether you’ve taken part already or not, we’d love it if you’d come visit Robot Ears – and help our robots go for the Gold!
Comments
Matthias
25 July, 09:04
Very interesting! Well done chaps!
The generally high agreement of humans is encouraging, and I’m glad that the robots got at least some bits right!
Ruthann
6 August, 21:08
Umm sounds interesting :)
weazel74
6 August, 22:37
I’ve just tried to use the application, yet I get re-directed to an error page which essentially states there is no update or application.
PB
6 August, 23:36
why are you advertising this at the top of last.fm if it’s no longer running? sigh, there’s no end in sight for your team’s incompetence…
madnessxd
7 August, 19:44
It would be nice if the link would work.
Liz
8 August, 08:54
You always share the most valuable information. I really appreciate it.
Learnt something new about “Advanced Robotics”
Zombiproof
8 August, 17:43
Deffo wanna get involved in this think I have something to bring to the table :)
Merp the Derperor
10 August, 04:27
Advanced Robotics… way to glorify being a Radiohead fan, huh?
beefsoup
10 August, 14:11
Way cool! Last.fm has the best algorithms. :)
Mateus Souza
10 August, 22:12
Too bad I missed that, hope to catch the next evaluation to help you guys improve the algorithms you are developing
SkinLikeSand
10 August, 23:51
I missed that too, unfortunately. Can you give some more technical information about the classification algorithm?
pdrizzly
11 August, 05:05
I lost interest in reading this about halfway through, but I’m commenting cuz I’m insanely bored. Sounds kinda cool from what I read though
Keef
12 August, 04:24
Don’t forget to make an anti-algorithm. I don’t want to listen to stuff that sounds the same. I want as different as possible.
Pancade
12 August, 11:18
Hey guys,
You’re promoting the post (links through all last.fm pages) so I suppose many people visit that page.
But the link to Robot Ears http://playground.last.fm/demo/evaluator/ate is broken and says “Sorry, no new trials available right now!”
Raihan
12 August, 15:42
In music, a song is a composition for voice or voices, performed by singing. A song may be accompanied by musical instruments, or it may be unaccompanied, as in the case of a cappella songs. The lyrics (words) of songs are typically of a poetic, rhyming nature, though they may be religious verses or free prose.
A song may be for a solo singer, a duet, trio, or larger ensemble involving more voices. Songs with more than one voice to a part are considered choral works. Songs can be broadly divided into many different forms, depending on the criteria used. One division is between “art songs”, “pop songs”, and “folk songs”. Other common methods of classification are by purpose (sacred vs secular), by style (dance, ballad, Lied, etc.), or by time of origin (Renaissance, Contemporary, etc.).
shortbutfast
12 August, 23:48
“Error
Sorry, no new trials available right now!”
typical for lastfm of late.
why do you keep the link on top if it’s not working??
Spartacus
13 August, 06:24
Totally Experimental, dude!
eoguy
13 August, 14:51
Was looking forward to trying this. Too bad it’s all talk up to a broken link. :(
Jinx
13 August, 19:25
gotta love that they’re saying “we’re kicking off a new round of experiments” with a link that ends up going to a page that says “sorry no new trials”.
Mark Levy
14 August, 14:59
Many apologies to anyone who got the “no new trials” message, this is fixed now.
The problem was caused by us getting a lot more visits than we expected in such a short time, for which thank you! Unfortunately the person responsible (me) was on holiday for the last ten days, otherwise it would have been fixed sooner.
sur
16 August, 15:33
still not fixed for me…
David Brown
17 August, 18:35
Would be more informative to see the charts as a scatter-plot of human-robot conformity on one axis and human-human agreement on the other.
Arcticbird
18 August, 06:27
Who cares what robots or humans have to say about the way music sounds? Everyone is going to think differently about these specific aspects and we all know that robots always get errors. Maybe I’m just too stupid to see the importance in this project but what tools are already available (like the wonderful last fm radio) on last fm to help people find new music they will love similar to what they like is already aplenty and very efficient. Please excuse my narrow-minded opinion and don’t take it too seriously. Thank you :)
Szymanski99
18 August, 17:17
I can’t see the experiment relating at all to classical music. Tags (if correct) seem a far better way of identifying suitable recommendations in this case. After three weeks building up a library most recommendations which get through are appropriate.
paulajlairdband
19 August, 08:25
Have you tried matching algorithms, not sure that would be feasible, but it is plausible, beat for beat with the waveforms separated exactly, and the height exactly..
Automatically finding its mirror image, even though they’re not the same song.. I think you’re happy’s and your sads, and your fast.. Amongst thousands of songs.. Would automatically start categorizing their selfs..
I think as far as naming these algorithms, it will soon be obvious, the tags still won’t change.
Rock , fast rock , rockabilly, and rock ‘n roll.. Etc.
I hope this helps, if it doesn’t it was real fun talking to you anyway
Giovanni Andrew Roverso
19 August, 14:59
excuse me,
it says:
Error
Sorry, no new trials available right now!
:(
Mark Levy
20 August, 14:50
Many thanks for all the feedback.
@Szymanski99 I agree this probably won’t do anything very useful for classical music unfortunately.
@paulajlairdband yes it is possible to propagate tags applied by humans to one track to others that sound similar as an alternative to what we’re trying here, but there are a couple of catches. The first problem is that people haven’t applied tags describing mood or other musical properties to that many tracks, so this still wouldn’t help us tag that much music. The second issue is that finding tracks that sound similar is almost as hard a problem as trying to tag tracks automatically!
@Giovanni Andrew Roverso We’ve released another batch of trials now.
Russ
20 August, 23:20
Ugh I really wish Last.fm wasn’t doing this – one reason I use Last.fm instead of Pandora is the stupidity of Pandora thinking if I like one “punchy song with subtle folk influences” then obviously I’ll like all “punchy songs with subtle folk influences.” This reduces music to nothing more than little cookie-cutter variables (though, in fairness, I guess that would work for a lot of pop music).
Michael
21 August, 08:38
Very interesting!
Do you plan to make the robot data available in some form (hint: the API) for us developers?
//Michael
Ouri
23 August, 18:22
OMG OMG
DID LAST.FM JUST MAKE A GRANDMA’S BOY REFERENCE??
When J.P was listening to Aphex Twin- Windowlicker and he asks this asian guy that comes in “Do you like my music” or something like that and he says “No, I don’t really like techno”
“You would if you had robot ears”
DJ ISLAM I-G
28 August, 02:09
My name is Islam Gantery I-G ,Called DJ I-G, DJ & Producer ..ready for work anytime ,I love music ,Start from hurghada 2006 ,and i will stop when i will die ..look..00201068716554 About me Ask me !! ready for help anyone as i can ..https://www.facebook.com/DJ1IG
..http://vk.com/id60321784
..soneloveh@yahoo.com
..http://www.myspace.com/590274959
http://thedjlist.com/djs/ISLAMIG/
Ed
2 September, 11:15
Stupid