As many of you noticed, a few weeks ago we launched Last.fm on Xbox LIVE in the US and UK. It probably goes without saying that this project was a big operation for us, taking up a large part of the team’s time over the last few months. Now that the dust has settled, we thought we’d write a short series of blog posts about how we prepared for the launch and some of the tech changes we made to ensure that it all went smoothly.
0 Hour: Monitoring.
First up, let me introduce myself. My name is Laurie and I’ve been a Sysadmin here at Last.fm for almost two and a half years now. As well as doing the usual sysadmin tasks (turning things off and on again) I also look after our monitoring systems, including a healthy helping of Cacti, a truck of Nagios and a bucket-load of Ganglia. Some say I see mountains in graphs. Others say my graphs are infact whales. But however you look at it, I’m a strong believer in “if it moves, graph it”.
To help with our day-to-day monitoring we use four overhead screens in our operations room, with a frontend for Cacti (CactiView) and Nagios (Naglite2) that I put together. This works great for our small room, but we wanted something altogether more impressive — and more importantly, useful — for the Xbox launch.
At Last.HQ we’re big fans of impressive launches. Not a week goes by without us watching some kind of launch, be it the Large Hadron Collider, or one of the numerous NASA space launches.
We put a plan into action late on Monday evening (the night before launch), and it quickly turned into a “How many monitors can you fit into a room” game. In the end though, being able to see as many metrics as possible became useful.
So, ladies and gentlemen…
Welcome to the war room
Every spare 24” monitor in the office, two projectors, a few PCs and an awesome projector clock for a true “war room” style display (and to indicate food time).
Put it together and this is what you get:
Coupled with a quickly thrown together Last.fm style Nasa logo (courtesy our favourite designer), we were done. And this is where we spent 22 hours on the day of the launch, staring at the graphs, maps, alerts, twitter feeds.. you name it, we had it.
It was pretty exciting to sit and watch the graphs climb higher and higher, and watch the twists and turns as entire areas of the world woke up, went to work, came back from work (or school) and went to sleep. We had conference calls with Microsoft to make sure everything was running smoothly and share the latest exciting stats. (Half a million new users signed up to Last.fm through their Xbox consoles in the first 24 hours!)
As well as the more conventional style graphs, we also had some fun putting together some live numbers to keep up to speed on things in a more real time fashion. This was a simple combination of a shell script full of wizardry to get the raw number, then piped through the unix tools “figlet” (which makes “bubble art” from standard text) and “cowsay” (produces an ASCII version of a cow with a speech bubble saying whatever you please).
Looking after Last.fm on a daily basis is a fun task with plenty of interesting challenges. But when you’ve spent weeks of 12-hour days and working all weekend, it really pays to sit back in a room with all your co-workers (and good friends!) and watch people enjoy it. Your feedback has been overwhelming, and where would we have been without Twitter to tell us what you thought in real time?
Coming Next Time
We had to make several architectural changes to our systems to support this launch, from improved caching layers to modifying the layout of our entire network. Watch this space soon for the story of how SSDs saved Xbox…