Russell Davies
A view from Russell Davies

Russell Davies: Big Data is about to change the way we measure the world

I bang on a lot about the importance of data. Not just the data we're used to - sample-sized stuff, polls, viewing habits deduced from diaries - but the Big Data we're soon going to have at our fingertips.

There are skills to be acquired here, from the engineering abilities to do something useful with the data, to the design expertise to make it comprehensible. It's hard to think about these things in the abstract, so I thought a couple of examples might be helpful. They are coming thick and fast at the moment - there's a lot of healthy number-crunching going on out there.

First example: The Münster University of Applied Sciences has apparently discovered that it can tell what television programme you're watching from the electricity usage data coming from your smart meter. This is only experimental right now, and there are all sorts of caveats about interference, but in principle the university reckons it can do it. The university is pointing this out to illustrate the importance of tightening up privacy regulations, but it also illustrates how measurable modern life can be and that, if you've got lots of data, you can infer all sorts of interesting stuff from it. Before long, you might be buying media data from EDF.

Second example: You can now download The Million Song Dataset: "A freely available collection of audio features and metadata for a million contemporary popular music tracks." It's designed, among other things, "to encourage research on algorithms that scale to commercial sizes". If you look at the data provided as an example (in perhaps the world's most oblique instance of "rickrolling"), you will learn that Never Gonna Give You Up by Rick Astley is 211.69587 seconds long, that it has a BPM of 113.359, that it has four beats to the bar and that the Echo Nest music intelligence service has a confidence measure of 0.634 that this last figure is right. And there's a load of really obscure data too, including connections to all sorts of social and commercial information. You can imagine the stuff music-data scientists might be working on as they try to develop "algorithms that scale to commercial sizes": connecting insights about types of music to types of success, providing recommendations or connections to music services, advising music businesses on whether something is going to be successful.
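
To make that a little less abstract, here is a minimal sketch of what working with one of those records might look like. The field names are made up for illustration and are not the dataset's real schema; the values are simply the ones quoted above. The bar count assumes the tempo holds steady for the whole track, which is a simplification.

```python
# Hypothetical record: the field names are illustrative, not the dataset's schema.
# The values are the ones quoted in the column for "Never Gonna Give You Up".
track = {
    "title": "Never Gonna Give You Up",
    "artist": "Rick Astley",
    "duration_seconds": 211.69587,
    "tempo_bpm": 113.359,
    "beats_per_bar": 4,
    "beats_per_bar_confidence": 0.634,
}


def format_duration(seconds):
    """Turn a length in seconds into a minutes:seconds string."""
    minutes, remainder = divmod(int(round(seconds)), 60)
    return f"{minutes}:{remainder:02d}"


def estimated_bars(record):
    """Rough bar count, assuming a constant tempo across the whole track."""
    beats = record["tempo_bpm"] * record["duration_seconds"] / 60
    return beats / record["beats_per_bar"]


print(
    f'{track["title"]} by {track["artist"]} runs '
    f'{format_duration(track["duration_seconds"])} '
    f"and has roughly {estimated_bars(track):.0f} bars "
    f'(confidence in the beats-per-bar figure: {track["beats_per_bar_confidence"]})'
)
```

Nothing clever is happening here, and that is rather the point: once the measurements exist, deriving new numbers from them is a few lines of arithmetic. The hard part is doing it sensibly across a million tracks.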

There's no particular reason this has to begin or end with music. Music has just been much measured, a little earlier than other measurable things. This level of analysis is possible in so many aspects of life, commerce and creativity. And, as we see from the first example, not just in obvious ways. Some of the ways we measure things right now are going to be overtaken very quickly. Is that scary or exciting? How good are your data people?