All Our Tools Are Belong To You!

Tuesday, 19 February 2013
by marcus
filed under Code and Announcements
Comments: 14

Welcome to the second part of our Open Source series. Today we’re releasing moost, a C++ library with all the nice little tools and utilities our MIR team has developed over the past five years. If you’re a C++ developer yourself, you might notice that moost sounds quite similar to boost, and that’s on purpose. moost is the MIR team’s boost: there is hardly a project in our codebase that doesn’t depend on one or more parts of moost.

There are a lot of different things in moost. Some are really simple, yet very helpful in day-to-day work, like the which template that allows you to use pairs (and containers storing pairs) more easily with standard algorithms; or stringify, a function template that turns complex objects into strings. Other parts are slightly more sophisticated: for example, moost contains the framework that is shared by all our backend services, and that allows you to write a daemonisable service with logging, a set of standard options and even a service shell that multiple users can connect to when the service is running, all in a few lines of code.

As our backend services are inherently multi-threaded, there’s also a bit of threading support in moost. For example, the safe_shared_ptr template is immensely useful for resources that are shared between threads and need to be updated atomically.

If you’re working with large, static datasets, you’ll probably find the memory mapped dataset classes interesting. They allow you to build large datasets (like gigabytes of data) of vectors, multimaps or dense hash maps that can be simply mapped into memory — and thus shared between different processes — and accessed very much like a constant standard container.
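To give a rough feel for the idea, here is a small Python sketch using the standard mmap and struct modules. It is not moost's C++ API: the file layout and the class name MappedInt32Vector are made up for illustration. A flat file of fixed-width integers is written once, then mapped read-only and indexed like a constant sequence:

```python
import mmap
import os
import struct
import tempfile


class MappedInt32Vector:
    """Read-only view of a file holding little-endian 32-bit integers.

    Hypothetical illustration only; moost's dataset classes are C++ and
    have their own on-disk format.
    """

    ITEM = struct.Struct("<i")

    def __init__(self, path):
        self._file = open(path, "rb")
        # Map the whole file read-only; pages are loaded lazily by the OS
        # and can be shared with other processes mapping the same file.
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def __len__(self):
        return len(self._map) // self.ITEM.size

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        return self.ITEM.unpack_from(self._map, index * self.ITEM.size)[0]

    def close(self):
        self._map.close()
        self._file.close()


def build_dataset(path, values):
    """Write the values once, up front; the file is static afterwards."""
    with open(path, "wb") as out:
        for value in values:
            out.write(MappedInt32Vector.ITEM.pack(value))


path = os.path.join(tempfile.mkdtemp(), "dataset.bin")
build_dataset(path, [10, 20, 30, 40])
vector = MappedInt32Vector(path)
third = vector[2]
size = len(vector)
vector.close()
```

Nothing is copied into the process up front: the operating system pages data in on access, which is what makes this approach attractive for datasets much larger than the toy file used here.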

moost also contains an abstraction for loading shared objects and instantiating objects defined inside these shared objects. It will take care of all the magic involved to avoid resource leaks.

There are more bits and bobs in there, like a simple client for the STOMP protocol, hashing and message digest functions, wrappers for key-value stores, template metaprogramming helpers and even a complete logging framework. So check it out, play with it and if you’ve got some nice tool to add, please contribute!

There’ll be more code coming up later this week that makes use of moost, so if you’re looking for some hands-on examples, stay tuned!

To be continued…

Build me, please!

Monday, 18 February 2013
by
filed under Announcements and Code
Comments: 8

Here at Last.fm we love Open Source. Most of the time we’re just using a lot of Open Source Software, sometimes we’re contributing changes or fixes back to existing projects, and sometimes we release our own software to the public. This week, we’ll be releasing some exciting projects to the C++ community. The first of these projects is a build system we’ve conceived for our C++ codebase and which has helped us a lot — and it might be useful for you, too!

Last.fm’s MIR team is responsible for maintaining more than a hundred libraries, tools and backend services, most of which are written in C++, although some projects are in Python, Perl or Java. Back in 2011, all these projects had to be built from one giant Subversion repository. They contained hard-coded relative paths to other projects they depended on, yet as developers we still had to know all the dependencies and build them in the correct order to actually build the project we were interested in. Also, every project contained a lot of boilerplate code, and because this code changed over time, it could be substantially different between any two projects. All of this made it quite painful to build projects or set them up for continuous integration, let alone distribute them to our production servers.

As we were thinking about migrating our codebase to Git, we wondered whether there was an easier way to build our projects. Our ideal solution at that point would have been a tool that allowed us to build, test, install and package every project, regardless of the language it’s written in, with exactly the same command. We couldn’t find anything like that and so we decided to write our own tool, which we called mirbuild (for hopefully obvious reasons).

mirbuild is a meta-build-system, which means it’s basically delegating the actual build process to other build systems, but hides this behind a common interface. It is just a set of Python libraries, so the actual build scripts are written in Python. For a simple project, such a build script (usually called build.py) looks like this:

  #!/usr/bin/env python
  from mirbuild import *
  project = CMakeProject('libcyclone')
  project.depends('libmoost')
  project.find('boost')
  project.find('log4cxx')
  project.version('include/cyclone/version.h',
             namespace = ['cyclone', 'version'])
  project.run()

As you can guess from the class name, this project uses CMake under the hood. But if you just want to build the project, you don’t have to care. You just run

  ./build.py build

and it “just works”. But mirbuild does a bit more than just forwarding commands to CMake. For example, it will create a file that controls compile flags and include and link paths of project dependencies. It will also create a version header for your project if you ask it to do so.
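The delegation idea can be sketched in a few lines of Python. This is not mirbuild's actual implementation: the class names and the exact command lines below are assumptions chosen for illustration, and the sketch only prints what it would run instead of invoking the tools.

```python
import sys


class Project:
    """Common interface: every project type answers the same commands."""

    def __init__(self, name):
        self.name = name

    def commands_for(self, action):
        raise NotImplementedError

    def run(self, argv):
        # Look up the backend-specific command lines for the requested action.
        action = argv[1] if len(argv) > 1 else "build"
        commands = self.commands_for(action)
        for command in commands:
            print(" ".join(command))  # a real tool would execute these
        return commands


class CMakeLikeProject(Project):
    # Hypothetical mapping; real command lines would carry more options.
    def commands_for(self, action):
        table = {
            "build": [["cmake", "."], ["make"]],
            "test": [["ctest"]],
            "install": [["make", "install"]],
        }
        return table[action]


class PythonLikeProject(Project):
    # Hypothetical mapping for a setup.py based project.
    def commands_for(self, action):
        table = {
            "build": [[sys.executable, "setup.py", "build"]],
            "test": [[sys.executable, "-m", "unittest", "discover"]],
            "install": [[sys.executable, "setup.py", "install"]],
        }
        return table[action]


cpp_commands = CMakeLikeProject("libcyclone").run(["build.py", "build"])
py_commands = PythonLikeProject("some-python-tool").run(["build.py", "test"])
```

Whoever runs the script only ever types build, test or install; which underlying tool gets called is a detail of the project class.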

Here are some of mirbuild’s features:

  • supports CMake (C/C++), Python, Thrift (C++/Python) and “simple” projects
  • can build, test, install and clean up projects
  • can resolve dependencies between projects
  • can create Debian packages
  • can build different configurations (release, debug, coverage) of a project
  • can run code coverage analysis tools

Over the last one and a half years, mirbuild has saved us from a lot of grief and it has made building projects a lot of fun. Thanks to mirbuild, we’ve also simplified our continuous integration framework and have now got all our production packages built on disposable virtual machines (but that’s a different story). If you’re maintaining lots of C++ code and aren’t happy with how you’re building it, check it out: it’s on GitHub.

To be continued…

Last.fm Desktop Scrobbler Released!

Thursday, 31 January 2013
by michaelc
filed under Announcements and Code
Comments: 26

Hello, scrobble fans! Were you wondering where your desktop app updates had gone? Well wonder no longer! With the last major version released back in 2007 (those were the days, eh?) you’d be forgiven for thinking there weren’t any more coming, but we’ve actually been hard at work on an update to bring us crashing into 2008, a little late.

We released this new desktop scrobbler as a beta a little under a year ago and have been spending the time since getting it ready for launch. A couple of weeks ago (15th Jan) that launch day finally arrived and we pushed it out to everyone on Windows, Mac, and Linux! If you’ve not already got it you can head over to our download page for a fresh copy.

Here’s a Youtube.com video of us reaching 200,000 authenticated users on the new app: https://www.youtube.com/watch?v=vy_VwcGazE4. Just look at how much fun we’re having!

The app comes with a new design and some features we hope you’ll really love. There’s a now playing tab where information about your currently scrobbling track will show up, including related artists, tags, biography, and scrobble statistics. Tracks played from radio stations will also show you a little context as to why the track is being played. There’s a scrobbles tab where you can see a history of what you’ve been scrobbling and find out more about those tracks, a profile tab where you can see your scrobble charts, and a friends tab where you can see what your friends are listening to and start their library radios. There’s also a radio tab where you can start all your usual Last.fm radio stations, including a history of your recent ones.

We’re looking at the app as a baseline that we can add to and improve upon. There have been a few ideas bubbling away that we can’t wait to add, but for now the focus is stability. With a large change such as this there are bound to be teething troubles, so we’ve been taking your feedback on the client support forum, making sure we address problems and implementing anything we might have missed that you loved in the old app.

A reminder that, like our iOS and Android apps, the desktop Scrobbler is open source and hosted on our Last.fm GitHub page (both the liblastfm and lastfm-desktop repositories make up the desktop app) where you’ll also be able to find other things Last.fm have open sourced. If you’d like to get involved with development then head over there and fork us!

It’s been a long road getting to this point and I’d like to thank all the client team members, contributors, and believers past and present for making it happen. You know who you are and you’re all very wonderful!

Last.fm Scrobbler for Linux

We at Last.fm love Linux. Not only does it power almost all of the server machines that bring Last.fm to you, it is also the operating system of choice of many of our developers at Last.HQ. For our desktop application Last.fm Scrobbler, Linux is a first class supported operating system. The source code is available on GitHub if you want to have a go at building it yourself, but we also provide ready built packages for those of you who are using Debian or Ubuntu. Just go to http://apt.last.fm and find out how to install them. Today we release an updated set of packages featuring the latest version of Last.fm scrobbler (2.1.33).

We are also proud to release official packages of Last.fm Scrobbler for the Raspberry Pi today. If you have not heard about Raspberry Pi, it is an ambitious project to bring better teaching of programming and the technology behind computers to children. The Raspberry Pi Foundation is a charity that has designed and developed a mini computer that costs less than £40 and allows not only children to dive into the world of computer programming. Being so cheap, the Raspberry Pi has also attracted many hackers to make new things based on this mini computer. Media centre solutions are already very popular, which is not surprising because the Raspberry Pi has a network interface and video and audio outputs. We now contribute our Last.fm client application to the Raspberry Pi universe. If you have a Raspberry Pi and are running the Raspbian operating system on it, then head over to http://apt.last.fm quickly and install Last.fm Scrobbler for Raspberry Pi!

last.json

Wednesday, 15 August 2012
by sven
filed under Code and Announcements
Comments: 3

Our latest offering of open source software from the Last.fm headquarters is last.json, a JSON library for C++, that you can now find on GitHub. If you are coding in C++, need to work with JSON data and haven’t found a library that you like, do check it out.

We at Last.fm benefit a lot from open source software. Almost all our servers run Linux, the main database system runs PostgreSQL, and our big data framework for data analysis is based on Hadoop, just to name a few examples. Of course, not the entirety of all software needed to run Last.fm is freely available. We have had to write lots of code ourselves. When a building block is missing in the open source software universe that we have to carve ourselves, and we think our solution is good and is general enough to be useful for other people, we like to contribute back to the community and release it as free and open source.

JSON has become hugely popular as a format for data exchange in the past few years. The name JSON stands for “JavaScript Object Notation”, and it is really just the subset of the programming language JavaScript’s syntax that is needed to describe data. A valid bit of JSON is either a number (say, 12 or -5.3), a truth value (true or false), a string literal (“hello world!”), the special value null (a placeholder for missing or unassigned data), or one of the following two: lists of JSON values and mappings of property names to JSON values. These last two data types allow you to express almost any data using JSON. A list could be [1,2,3] or [99, “bottles of beer”]. It is literally a list of data elements, which can be of identical type (like the all-numbers list in the first example), or different types (like a number and some text in the second example). You can add structure to your data using mappings: { “object”: “bottle of beer”, “quantity”: 99 }. A mapping is basically a set of key-value pairs, where the key is a bit of text (“object” and “quantity” in the example) and the value can have the form of any of the JSON data types.

Now you know all the rules of JSON data. The reason why it is so ultimately versatile is that you can nest those data types. Any element of a list or any value in a mapping can be a list or a mapping itself. Or any of the other primitive data types. This is perfectly valid JSON:

{
  "artist": "White Denim",
  "similar artists": ["Field Music", "Unknown Mortal Orchestra", "Pond"],
  "toptracks": [
    { "title": "Street Joy", "scrobbles_last_week": 739 },
    { "title": "It's Him!", "scrobbles_last_week": 473 },
    { "title": "Darlene", "scrobbles_last_week": 386 }
  ]
}

You can imagine how this can be used to describe virtually any data structure. It is much simpler than XML and many other data formats. And the good thing is that not only computers are able to read JSON, humans are, too! As you can see in the example, not only can you read the data, you understand immediately what it is about. More often than not, JSON data is self-explanatory.

So, as I said before, JSON has become very popular for data exchange. It is a breeze to use in JavaScript (which is not surprising, because any JSON is also valid JavaScript) and many other programming languages like Python, Perl or Ruby. If you are familiar with any of these languages, you probably see that these languages have data types very similar to the JSON types, and it is therefore easy to represent and work with JSON data in those languages.
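For comparison, here is what that looks like in Python with the standard json module, using the White Denim example from above. JSON mappings become dicts, lists become lists, and the primitive values map onto Python's own str, int, float, bool and None:

```python
import json

json_data = """
{
  "artist": "White Denim",
  "similar artists": ["Field Music", "Unknown Mortal Orchestra", "Pond"],
  "toptracks": [
    { "title": "Street Joy", "scrobbles_last_week": 739 },
    { "title": "It's Him!", "scrobbles_last_week": 473 },
    { "title": "Darlene", "scrobbles_last_week": 386 }
  ]
}
"""

# One call turns the text into nested dicts and lists.
data = json.loads(json_data)

artist = data["artist"]
second_similar = data["similar artists"][1]
top_track = data["toptracks"][0]
total_scrobbles = sum(t["scrobbles_last_week"] for t in data["toptracks"])

print(artist, "/", second_similar, "/", top_track["title"], "/", total_scrobbles)
```

A list holding both numbers and strings needs no special treatment here, because a Python list can hold values of any type.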

Unfortunately, less so in C++. C++ is statically typed, which means that you always declare a variable with a specific type. It can be a number or a text string if you want, but you have to decide which one it shall be at the time you are writing your programme code. There are standard types for lists and mappings, too, but those require their data members to be of identical type. So you can have a list of numbers, or a list of strings, but not a list of items that could individually be a number or a string.

We use C++ for many of our backend data services, because it is fast and not resource hungry. If you have a good level of understanding, you can do great things in C++, and we love to use it for certain tasks. When we first wanted to use JSON for data exchange in our C++ programmes, we looked for a good library that makes it easy to juggle with JSON data, but we couldn’t find one that really satisfied our needs. So we spent some time writing our own library. And because we think it’s not too bad, and other people might have the same needs, we have now open sourced it under the MIT license, which basically means that you can use it freely in your own projects, but we refuse any liability for bugs or whatever could go wrong with it.

So, how do you work with JSON using last.json? The library defines a datatype lastjson::value which can hold any JSON data. You can check at runtime what data type it actually holds, and then convert it (or parts of it) to standard C++ types. The best practice, however, is to use it much like you would in those scripting languages I mentioned earlier: you just access elements of lists or mappings as the data types you expect them to be. If the JSON data does not have the structure you assumed, the last.json library will throw an exception that you can catch. Imagine you have a variable std::string json_data that contains the JSON fragment from the example above (the one about White Denim):

lastjson::value data = lastjson::parse(json_data);

This parses the json string into the lastjson::value representation. And these are a few things you can do with the parsed JSON data:

try
{
  std::cout
    << "Artist name: "
    << data["artist"].get_string()
    << std::endl
    << "Second similar artist: "
    << data["similar artists"][1].get_string()
    << std::endl
    << "Top track last week: "
    << data["toptracks"][0]["title"].get_string()
    << std::endl
    << "... with "
    << data["toptracks"][0]["scrobbles_last_week"].get_int()
    << " scrobbles."
    << std::endl;
}
catch (lastjson::json_error const & e)
{
  std::cerr
    << "Error processing JSON data: "
    << e.what()
    << std::endl;
}

last.json tries to make working with the JSON data as easy as in scripting languages. This was just an example, and last.json has many more cool features. So if C++ is your language of choice, go and check it out now.

balance.fm

Friday, 2 March 2012
by Marcus
filed under Code and Announcements
Comments: 6

The open source tool balance is an essential part of the service infrastructure here at Last.fm. Multiple instances of balance are running on each and every web server node, on the various production back end servers, and also on our development machines. So at any given time there are probably thousands of instances running simultaneously on our machines.

What does it do?

balance is a so-called load balancer. It is generally used as a proxy to distribute a large number of incoming requests to a group of servers. In other words it is responsible for balancing the load between all the servers in a group. Quite often, load balancers are dedicated hardware products. However, balance is a software load balancer, which means it can just run as an additional program on any server.

In addition to load balancing, balance also supports a scheme called failover. This means you can define a second group of servers and balance will route requests to the second group if all servers in the first group fail. This failover scheme is used by most of our backend services at Last.fm. We usually have a main server and a backup server that kicks in once the main server fails.

End of story?

Certainly not! There are some subtleties in the use of balance that have given us headaches in the past. By far the biggest problem is that there are cases when failover just doesn’t work right in our environment. So here’s a real example…

One day we had to take down the main server for one of our backend services to replace a hard drive. The backup server was running fine and we relied on balance to take care of routing all requests through to the backup box. Unfortunately, shortly after the main server went down, we noticed that most requests to the service failed.

What had happened? balance has a configurable connect timeout, i.e. it tries to connect to a service and then waits for a certain amount of time until it figures out that it can’t connect. If the server machine is running, the connect will fail almost instantly if the service itself is unavailable. However, if the server is down, it’ll wait until the connect timeout has elapsed. So in our case, balance was trying to connect to the main server (which was down) and then waiting for 5 seconds before attempting to connect to the backup server. In the meantime, the client had already given up (it was using a much smaller timeout). balance would only notice that the client had given up by the time it had established the connection to the backup server. The next time the client tried to connect, the same thing would happen all over again.
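The timing problem is easy to reproduce on paper. The sketch below is a toy model, not balance's code. It assumes that a connect attempt to a powered-off machine costs the full connect timeout and that a refused connection fails almost instantly, and it uses a 1-second client timeout as a stand-in for the "much smaller" one mentioned above.

```python
CONNECT_TIMEOUT = 5.0   # seconds, as in the incident described above
CLIENT_TIMEOUT = 1.0    # assumed value for the client's shorter timeout


def time_to_connect(backends, connect_timeout=CONNECT_TIMEOUT):
    """Return seconds until the first backend accepts, or None if none does.

    Each backend is one of: "up" (accepts immediately), "refused" (machine
    is up, service is not: fails immediately) or "down" (machine is off:
    the attempt hangs until the connect timeout elapses).
    """
    elapsed = 0.0
    for state in backends:
        if state == "up":
            return elapsed
        if state == "down":
            elapsed += connect_timeout
        # "refused" adds (almost) nothing
    return None


def request_succeeds(backends, client_timeout=CLIENT_TIMEOUT):
    """True if a backend is reached before the client gives up."""
    elapsed = time_to_connect(backends)
    return elapsed is not None and elapsed < client_timeout


service_stopped = request_succeeds(["refused", "up"])  # main box up, service off
machine_off = request_succeeds(["down", "up"])         # main box powered off
print(service_stopped, machine_off)
```

With the main machine merely refusing connections, failover is effectively free. With the machine switched off, every single request pays the 5 seconds before the backup is even tried, which is longer than the client is willing to wait.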

But someone else would certainly have had the same problem before?

I’m quite sure of that. And I guess that’s what caused the autodisable feature to be added to balance. When this feature is being used, balance will automatically disable servers that it fails to connect to. The downside, though, is that there’s no way to automatically enable servers again. And manually enabling them isn’t really an option given the number of instances of balance we’re running and given that it could cause all servers to be permanently disabled in case of, for example, temporary network failure.

So what now?

We had to face the fact that in theory we had a really nice redundancy scheme, but it could fail quite miserably in practice. So I began to look around for alternatives to balance and found a couple of other open source load balancers. Sadly, all of them had either been abandoned by their authors, failed to build out of the box or just didn’t fulfill our requirements.

balance was actually just what we needed. The only thing it was missing was support for monitoring all back end connections and dynamically disabling and enabling them as they fail or pass the monitoring checks.

So eventually I started looking into adding exactly that functionality to balance.
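In outline, such monitoring boils down to a periodic check per backend plus a routing decision that only considers backends currently marked healthy. The following sketch shows the idea under stated assumptions: the class and method names are hypothetical, the health check is a plain callable passed in by the caller, and none of this is taken from the balance.fm sources.

```python
class MonitoredBalancer:
    """Toy failover balancer with health monitoring (illustrative only)."""

    def __init__(self, groups, check):
        self.groups = groups          # list of groups, in failover order
        self.check = check            # callable: backend -> bool
        self.healthy = {b: True for group in groups for b in group}
        self._next = 0

    def run_checks(self):
        # Called periodically; disables failing backends and, unlike a
        # one-way autodisable, re-enables them once they pass again.
        for backend in self.healthy:
            self.healthy[backend] = bool(self.check(backend))

    def pick(self):
        # Use the first group that has at least one healthy backend and
        # rotate through its healthy members.
        for group in self.groups:
            alive = [b for b in group if self.healthy[b]]
            if alive:
                backend = alive[self._next % len(alive)]
                self._next += 1
                return backend
        return None


down = {"main:9000"}
balancer = MonitoredBalancer([["main:9000"], ["backup:9000"]],
                             check=lambda backend: backend not in down)

balancer.run_checks()
during_outage = balancer.pick()

down.clear()              # the main server comes back
balancer.run_checks()
after_recovery = balancer.pick()
```

Because the dead server is already marked as disabled when a request arrives, no request has to sit through a connect timeout, and the server rejoins the pool on its own once the check passes again.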

balance.fm

Implementing monitoring for balance was relatively straightforward, even though it made me aware of how much I had gotten used to developing software in C++. With balance being written in pure C, I was really missing exception handling and the C++ standard library.

The amount of code changes was massive considering the rather small code base of balance. As of now, more than a thousand lines of code have changed and another thousand lines have been added. So we decided to fork the original project and rebrand it as balance.fm.

It took about a week to refactor the existing code and finally add the monitoring feature. Along the way of adding monitoring, quite a few bugs have been fixed as well (for details, just have a look at the commit log if you’re interested) and I hope these fixes make up for all the bugs that I’ve undoubtedly introduced by adding loads of new code.

The balance.fm code has since been reviewed by the MIR team here at Last.fm and is available from github.com/lastfm/balance.fm.

If you have an application for balance.fm, please give it a try and let us know what you think and like or dislike about it!

Music Hack Day Berlin

Friday, 3 June 2011
by
filed under Code and Stuff Other People Made
Comments: 5

Last weekend, Russ Hall and I travelled to Berlin for the latest in the series of Music Hack Day events. Having been to previous ones in Stockholm and London, I was looking forward to the usual mix of creativity and collaboration that these events are famous for.

For those that don’t know, Music Hack Day is a chance for programmers, designers, artists, etc, to get together and create new and exciting music hacks based on the latest APIs from top music tech companies (or they can just use a soldering iron and some knitting needles). It’s always amazing to see what a room full of talented people can come up with in just 24 hours and this one was no different.

APIs

First up at a Music Hack Day are the API presentations. This is where music tech companies pitch their APIs and it’s also an opportunity to announce new features for devs to get stuck into straight away. We were there to present our API and the addition of our beta realtime API.

Hacking

After this, everyone fuelled up on Club-Mate and set about hacking. I played with a few side projects while Russ joined up with Tim Bormans of Soundcloud and Jens Nikolaus (designer of the much sought after Music Hack Day Berlin tote bag with Kristina Schneider) to create Sleev.in, an album obsession sharing site.

Here’s Tim and Russ mid-hack.

How much sleep you get on the Saturday night depends on how well your hack goes and/or how ambitious you were. At around 2pm on the Sunday, roughly 24 hours after you started, it’s time to put down your laptop and present what you’ve done.

Hack demos

The hack demos are always fun to attend and see what everyone else has been making.

Some notable Last.fm related hacks were Lastcred.fm which would, to make your Last.fm profile look trendier, scrobble several random tracks for you based on a tag or artist of your choice (not something we really approve of, but it amused us nonetheless), Tractor which pulled in Last.fm data to help with displaying artist info for mentions of an artist on a web page, and RealTimeSentiTweetGagasm which used our realtime API.

Other favourites of mine were Heavy Shoes which used an Xbox Kinect sensor to detect foot stomps and then play drum noises (something I want to be doing at home very soon), and Eigendrums which also triggered drum sounds, but this time by detecting the sound of you clicking, slapping your leg, and thumping your chest. Both were impressive, crowd-pleasing demos.

You can find the complete list of hacks here.

If any of this interested you then why not think about attending a Music Hack Day? The next one is in Barcelona in a few weeks, but they crop up fairly regularly.

And finally a big thanks to Roel van der Ven and Johan Uhle of Soundcloud for organising such a fantastic event.

Last.fm: now supporting tea breaks

Wednesday, 13 April 2011
by Dane
filed under Announcements and Code
Comments: 57

Hey, look! It’s a pause button! I know right?!

Pause is a feature that users and staff alike have been requesting for quite some time, so one dreary Thursday evening I decided ‘Enough is enough! We NEED pause and we need it now!’

That’s almost true anyway. It’s been a big challenge to implement, and we’ve spent a little while testing this feature — writing some supporting infrastructure and making sure the feature works well across our different players.

That’s right, the Android and iPhone clients now come with pause too. We’re working on a new version of the desktop client, and that will come with pause as well.

I’ve been getting a ton of use out of it and I hope you guys do too! The specs are now available for partner players (like XBox and Windows Phone 7) to support the feature, and we’ll be updating the FAQ as and when they’ve implemented it.

Drum Roll – It’s time for the main feature!

But… but but but, we’re also releasing something new and exciting into the wild today.

Whenever you tune into a radio station on Last.fm we build a playlist of tracks based on various criteria: for Recommended Radio we’re looking at music that you might like based on what you’ve been listening to recently; for Friends Radio we’re looking at what your friends have listened to recently… and so on and so forth.

Up until now we haven’t surfaced why a particular song is being played to you, but that’s about to change with a little feature that puts some info text in the top left of the player.

When you’re listening to Similar Artist Radio or your Library Radio we’ll show you some information about the track being played (the song selection is kinda obvious — it’s in your library, or it’s similar to the artist you typed in).

Things start to get a little more interesting when you’re tuned to Friends, Neighbours, Recommended or Mix radio. You’ll see information about which artists or users fed into the song selection. If you click the “more” link you’ll scroll down to where there’s a little more detailed information; maybe it’s a few of your friends or a few artists that inspired the selection.

(By the way, if you’re using the Festive cheer or Bah! Humbug! radio settings then you’ll get a reduced amount of information. If you want to experience the magic you’ll have to turn them off for now, sorry!)

Hope you enjoy them! Remember, you can always offer feedback about features like these on the forums, and if you want to join the team who made them just head to the Jobs page.

Artist Artist

Wednesday, 13 October 2010
by cms
filed under Found On Last.fm and Code
Comments: 32

Hello people. I’m cms, and my job here at Last.fm is looking after the databases. Much of the time I’m involved with operational running of database servers, designing and optimising SQL queries, and scaling work on our relational database clusters. Every now and then though, I do get an opportunity to poke around in the Last.fm dataset and explore some of the interesting relations.

I recently re-discovered the seminal album ‘Spirit Of Eden’ by ‘Talk Talk’ (haven’t tried it? You really should, it’s magical), and I’d been giving it quite heavy rotation. This prompted a comment on my profile by one of our lovely users, who suggested making a playlist from artists whose names consisted of repeating word patterns. This idea appealed to me, but off the top of my head I could only come up with a paltry half-dozen candidates. Surely there were many, many more. If only there was some kind of database nearby I could query…

We keep our main catalogue data in a PostgreSQL database. PostgreSQL has a nice set of extended string operators, including quite comprehensive regular expressions support, which would be useful for an ad-hoc query like this.

Here’s what I came up with initially, off the top of my head:

select name from artist where name ~* E'^(\\w+\\M)\\s+\\y\\1$' ;

This uses the case-insensitive regular expression match operator ~* and matches against a string that begins with a sequence of word characters leading up to the end of a word (which I’m capturing as a group), then a sequence of whitespace, then a word boundary followed by the original captured match.
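For readers without a PostgreSQL instance to hand, the same pattern can be tried in Python. Python's re module has no \M or \y, so this sketch substitutes the generic word boundary \b for both. That substitution is an assumption of the sketch rather than part of the original query, and the sample names are just test inputs.

```python
import re

# PostgreSQL's \M (end of word) and \y (word boundary) are both written
# as \b here, since Python's re module only has the generic boundary.
REPEATED_NAME = re.compile(r"^(\w+\b)\s+\b\1$", re.IGNORECASE)


def is_repeated_name(name):
    """True if the name is one word, whitespace, and the same word again."""
    return REPEATED_NAME.match(name) is not None


samples = ["Duran Duran", "Talk Talk", "The The", "Talking Heads", "Talk Talks"]
matches = [name for name in samples if is_repeated_name(name)]
print(matches)
```

The backreference \1 is what does the work: it only matches if the second word repeats exactly what the first group captured (ignoring case), and the trailing $ rules out anything after it.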

This query worked really well at defining the pattern for repeating names. I was matching well over 10,000 distinct strings. The problem was that we store all the submitted data for artists, and this includes data from a broad range of unverifiable sources. I was getting lots of great artist names in my set, but many of them were bogus; typos, mis-taggings, spelling corrections, and that was just the obvious mistakes.

I needed to come up with a way of filtering the set further. My first iteration was to use track information. Incorrect artist attributions seemed unlikely to have relations over tracks in the catalogue, and I could extend my query relatively easily to take account of prolificness like so.

select count(1), a.name from artist a, track t where a.name ~* E'^(\\w+\\M)\\s+\\y\\1$' and t.artist = a.id group by 2 order by 1 desc;

This got me a shorter set of artists (8000 odd), with some ordering. I could see that recognisable artist names (hello Duran Duran !) were sorting towards the top. However, ordering by catalogue volume still wasn’t quite right. Ideally I needed some kind of popularity weighting. Unfortunately we don’t store any scrobble data in the PostgreSQL catalogue schemas.

However we do store scrobbles, alongside exported catalogue information in our Hadoop cluster. Although I have been known to write Java code in the past, I’m mildly allergic to it. Luckily for me we have a Hive interface to Hadoop. Hive offers an interactive query language over Hadoop that is closely modelled on SQL. The only stumbling block remaining was porting my regular expression over to use Java syntax.

Here’s what I ended up with as a hive query:

select meta_artist.name, overallplayreach_artist.reach from meta_artist join overallplayreach_artist on meta_artist.id = overallplayreach_artist.id where meta_artist.name RLIKE '^(.+?\\b)\\s+\\b\\1$' and meta_artist.correctid IS NULL and overallplayreach_artist.reach > 50 order by overallplayreach_artist.reach desc ;

The join against some “playreach” data gives a weighting according to rough popularity. My original SQL query took 17 minutes to run, on a fairly beefy database server. The hive query took less than 100 seconds to return, running across the entire Hadoop cluster. Awesome.
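The shape of that query (regular expression filter, join against a popularity table, threshold, sort) can be tried out locally with Python's sqlite3 module standing in for Hive. This is only a stand-in: the table and column names are copied from the query above, but the rows are invented placeholders rather than Last.fm data, and SQLite needs a user-defined regexp function to support the REGEXP operator.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite evaluates "X REGEXP Y" by calling a user function regexp(Y, X).
conn.create_function(
    "regexp", 2,
    lambda pattern, value: re.search(pattern, value) is not None,
)
conn.executescript("""
    create table meta_artist (id integer, name text, correctid integer);
    create table overallplayreach_artist (id integer, reach integer);
    -- Placeholder rows, invented for this sketch.
    insert into meta_artist values
        (1, 'Foo Foo', null),
        (2, 'Bar Bar', null),
        (3, 'Foo  Foo', 1),      -- misspelling that points at artist 1
        (4, 'Baz Qux', null),
        (5, 'Zed Zed', null);
    insert into overallplayreach_artist values
        (1, 900), (2, 1200), (3, 70), (4, 5000), (5, 12);
""")

rows = conn.execute(r"""
    select a.name, r.reach
    from meta_artist a
    join overallplayreach_artist r on a.id = r.id
    where a.name regexp '^(.+?\b)\s+\b\1$'
      and a.correctid is null
      and r.reach > 50
    order by r.reach desc
""").fetchall()
print(rows)
```

The correctid filter drops entries that are known corrections of another artist, and the reach threshold removes names that hardly anyone has listened to, which is how most of the bogus strings get filtered out.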

Without any further ado, here are the top 10 results, roughly ordered by artist popularity.

Artists with repeating name patterns
Duran Duran
Frou Frou
Gus Gus
Talk Talk
Xiu Xiu
The The
Man Man
Cash Cash
Danger Danger
Gudda Gudda

I’ve created a tag artistartist, and tagged some of the entries already.

The full list is available here. There might well still be some rough data in there, I haven’t particularly sanity checked it by eye.

If you too would like the chance to play with Last.fm’s vast amounts of data and join our team, check out our job openings.

But does it scrobble?

Friday, 19 March 2010
by adrian
filed under Code and Announcements
Comments: 33

It feels like just the other week that I posted this on the Last.fm developer forum to get feedback and ideas on a new version of our scrobbling API that we were mulling over.

For those less technically-inclined, the scrobbling API defines how data gets transmitted to Last.fm every time you listen to a song. Scrobbles are incredibly important to us. They’re the building blocks of your music profile, and put together, they power basically everything that Last.fm knows about music.

The current API does a decent enough job, but we’ve had many developers complain about it being inconsistent with the other ways of accessing Last.fm data (via our web services) and its rather shoddy feedback on errors.

We also wanted to make the API more extensible so that we could define certain information which must always be submitted (like track and artist name) while allowing us to provide extra functionality in future via optional fields that wouldn’t break existing scrobblers.
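To make the "required fields plus optional extras" idea concrete, here is a rough Python sketch of how a server might treat an incoming scrobble. It is purely illustrative and not how our scrobbling servers are written: only the artist and track fields come from the description above, while the optional field names and the parse_scrobble function are hypothetical.

```python
# Fields that must always be present (track and artist name, as above).
REQUIRED_FIELDS = {"artist", "track"}

# Hypothetical optional fields; new ones could be added here later
# without breaking scrobblers that never send them.
KNOWN_OPTIONAL_FIELDS = {"album", "duration"}


def parse_scrobble(params):
    """Validate a dict of submitted fields and return the ones we understand."""
    missing = sorted(f for f in REQUIRED_FIELDS if not params.get(f))
    if missing:
        raise ValueError("missing required field(s): " + ", ".join(missing))
    accepted = REQUIRED_FIELDS | KNOWN_OPTIONAL_FIELDS
    # Unknown fields are dropped rather than rejected, so older servers
    # and newer clients can still talk to each other.
    return {k: v for k, v in params.items() if k in accepted}
```

The design choice that matters is the last line: ignoring fields the server does not recognise, instead of failing on them, is what lets optional fields be introduced later.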

600 ways to scrobble

Our scrobbling servers get a lot of traffic – at certain times of the day we have nearly 800 people telling us what they are listening to every second, and we are nearing our 40 billionth scrobble! There are also many different ways to scrobble the music you’re hearing, some developed by us (such as our official Last.fm, Android, and iPhone apps) as well as applications developed by third parties and music-loving geeks from all over the world.

Scrobbles-per-second monitor in the Last.fm operations room, powered by CactiView.

All told we have more than 600 scrobblers created by people other than us, covering popular online services like Spotify and The Hype Machine, hardware devices like the Onkyo TX-NR807 and the Logitech Squeezebox, as well as online storage services like Bitspace, extensions for browsers like Chrome (via Chrome Scrobbler), Opera (via Seesu) and Firefox (via FoxyTunes), and finally, for the real geeks, plugins for Gnome’s totem player and a promising-looking fork of Amarok called Clementine, to name just a few. A fair share of all existing scrobblers is listed on build.last.fm – browse around if you’re curious!

Preparing a new version of the scrobbling API

Given the heavy use of the current scrobbling API, releasing a new version of it is not something we take lightly – which is why it’s taken more than a year to get to where we are today. My post back in January 2009 generated pages of suggestions and plenty of e-mail conversations with developers, and led to many hours of internal discussions and arguments involving nearly everyone in the company in some way.

We are finally able to unveil our first draft of what the new API might look like. Please bear in mind that this is not complete or final; we’re releasing it as a “request for comment” from the developer and user community. All the technical details can be read on our forum here and we’d like to keep detailed discussion there. We’ll be monitoring the post and taking feedback on board.

Here’s a summary of just some of the highlights planned for the new API:

  • The scrobbling API will become a fully-fledged member of the Last.fm Web Services under a new “Scrobble” package, joining its friends Track.love and Track.ban instead of being all sad and lonely on the sidelines. This should simplify things for developers by having one unified authentication, request and response mechanism. We also hope that this will encourage applications which currently just scrobble to use the rest of our API, and vice versa, with the end result being cooler apps with more features for everyone.
  • Migrating to the web services will improve our ability to track the use of scrobble applications, so we can do groovy things like charts of the most popular scrobblers, and analyses of musical tastes across different scrobblers. Yes, we will finally be able to answer the burning question – “Do Amarok users have better taste than Xbox Live users?” We hope that our scrobbling partners and their users will be able to do cool things with this data.
  • Corrections information will be returned where relevant so users can be prompted to fix any incorrect metadata they may have.
  • Changes to Last.fm radio scrobbling will allow us to improve our recommendations. We’ll get more specific listener feedback because loves, bans and skips can be tied to a specific radio stream, not just to a particular track.
  • We’ll return more detailed error messages which should simplify the process of developing a scrobbler.
  • Third party developers will be able to upload their own icons which will show up on a Last.fm user’s profile when they are listening with a particular scrobbler. We currently provide this as a service for our most popular scrobblers but will extend this to all third party apps (this was our most requested feature after improved error logging!).
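For a flavour of what "one unified authentication mechanism" means in practice, here is a small Python sketch of request signing as I understand the existing web services to do it: sort the parameters by name, join each name to its value, append your shared secret and take the MD5 of the lot. Treat this as an assumption and check the web services documentation before relying on it. The key, secret and session values are placeholders, and the draft scrobble methods may well end up with different parameters.

```python
import hashlib


def sign_call(params, shared_secret):
    """Build a signature for a dict of request parameters.

    Assumed scheme: parameters sorted by name, concatenated as
    name + value, shared secret appended, then MD5 hex digest.
    """
    payload = "".join(name + str(params[name]) for name in sorted(params))
    return hashlib.md5((payload + shared_secret).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Placeholder credentials; track.love is used because it is an
    # existing authenticated call mentioned in the list above.
    call = {
        "method": "track.love",
        "artist": "Duran Duran",
        "track": "Rio",
        "api_key": "YOUR_API_KEY",
        "sk": "YOUR_SESSION_KEY",
    }
    call["api_sig"] = sign_call(call, "YOUR_SHARED_SECRET")
    print(call["api_sig"])
```

Because the parameters are sorted before hashing, a client can build its request in any order and still arrive at the same signature.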

There were a lot of great ideas which didn’t make the cut, but the new API should allow us to add new features more easily and we plan to expand on this release in the future. After a round of feedback from the community we hope to put a beta version of the API out for testing and will then work towards finalising it a month or two after that. Third party developers will then be able to start updating their existing applications (or writing new ones) and passing the benefits of the new features on to you, our faithful users.

We’re hoping that by making scrobbling development easier we will be taking more steps towards getting every musical device on the planet scrobbling. Let us know what you think.

Hacking in Stockholm

Wednesday, 3 February 2010
by flaneur
filed under Code and Stuff Other People Made
Comments: 8

Last weekend I was lucky enough to voyage to Stockholm with Jonty and Michael to represent Team Last.fm at Music Hack Day.

Music Hack Day’s premise is simple – find the best and brightest tech and music geeks, get them all together for a weekend, mix in APIs and workshops from every online music service worth its salt, and then spend 24 hours making… well, anything!

Started by Soundcloud’s Dave Haynes here in London last July, subsequent Music Hack Days in Berlin and Boston have cemented their reputation as the best tech events going, music or no. (Anthony from the Hype Machine did a nice write-up on some of the ingredients that make them great.)

So it was with some excitement that we boarded our plane on Friday and headed north. In addition to our standard hack day paraphernalia — laptops, check; giant headphones, check; world’s tiniest Guitar Hero, check — we also carted along some limited edition stickers, newspapers, and a short presentation on the venerable Last.fm API. (You can grab those slides here as a PDF download.)

Photo by Brian Whitman.

Stockholm certainly didn’t disappoint — the weekend was awesome! We came, we hacked, we even conquered.

We also learnt a lot. Some highlights included…

Swedish hospitality

“Hospitality” isn’t generally up there on the list of familiar Swedish traits (unlike, say, tasteful flat-pack furniture, or expensive booze). But our hosts — Henrik and Mattias — made everyone feel welcome and created an environment that let everyone just get on with creating cool stuff.

There were some uniquely Swedish touches too, like the delicious bread and cheese breakfasts and the snow-based beer fridge. Oh, and the Batmobile showed up. No, really.

APIs in the mirror

Though we’ve offered public APIs to developers since 2003, nothing makes you see them in a new light like face-to-face interaction with people trying to make clever and unusual things with them. We’ve come back to London with a long list of suggested improvements, things that could be clearer in the docs, and even a couple of bug fixes that were reported by intrepid Stockholm hackers. Thanks to everyone who spoke to us!

We also handed out free subscriptions to everyone who demo’d a hack that used the Last.fm API.

A few of our favourites:

  • My City vs. Your City Uses our new geo.getMetro* city charts API to compare top artists across hundreds of cities worldwide. Neat!
  • SimilarArtists A simple way to generate Spotify playlists of recommended music based on Last.fm similar artists.
  • Holodeck An attractive way to create an artist website that is based on content from SoundCloud, Last.fm, Songkick, and Tumblr.
  • Mashboard A dashboard for your Soundcloud tracks that pulls in rich audio metadata from Echo Nest. And it scrobbles!

We also managed to sneak in a few hacks of our own:

  • HacKey Ever wondered what your favourite key is? Thanks to the Last.fm and Echo Nest APIs, now you can find out.
  • ProximRadio and Blobble Jonty and Michael came up with a deadly trio of new tech that enables a long-standing dream: proximity-based multi-profile radio stations, complete with group scrobbling. Whoa.

A complete list of hacks is available here.

The online music ecosystem = crazy delicious

It was humbling to be in the presence of so many talented companies and developers, from the music mad scientists of The Echo Nest to the streaming wizards of Spotify, not to mention entire teams who travelled to Stockholm from Songkick, Soundcloud, and many others.

It’s pretty clear that 2010 is going to be an exciting year in music and tech. (And not just because people are building Playdar-enabled beatmatched collaborative Spotify playlist generators that scrobble via robot arm attachments…although that helps.) Team Last.fm will be in attendance at the next Music Hack Day and also at some events of our own, so stay tuned.

Until then, happy hacking!