Some 2,500 sets of data have been officially released by the government on to the internet. But what's the point of giving access to all these figures? Michael Blastland offers just a taste of what this information can do.
You search Google for "earthquakes". You've asked it to trawl not the web, but government databases - in this case the US Geological Survey - and it produces the results as an archive, with instant maps, ordered by date, with magnitude.
Or try "unemployment rate" or "population" followed by a US state or county. You will see recent estimates, then be offered an interactive chart that lets you add or remove different geographical areas.
This is the Google Public Data Project, one of many ideas capitalising on the increasing availability online of public data, of which data.gov.uk, launched yesterday, is the latest.
So, what do you want to know? No, go on, anything. That's what it feels like as you browse the increasing range of open data sources. Take the US site that was data.gov.uk's inspiration. It started last year but promises soon to bring 100,000 data sets. How about a just-released measure of the quantity of toxic chemicals emitted into the air each year? Coming right up, measured in lbs:
What chemicals are they talking about? Yep, got that. There's a selection.
That's a near 80% cut in 20 years - or so it seems. I'd like to check a few facts and definitions, but at first sight it looks quite a story. Interesting?
Maybe you don't want to know about toxic chemicals in the US. The point is that public bodies have gone confessional. They are increasingly ready to share what they know. Well, not quite all of it, but the oceans of data that once sat in archives are becoming steadily more available to anyone, to use any way they like: to produce fancy visualisations, use the information as power to seek change, improve their own choices, or just to satisfy their curiosity.
For example, how about the top 50 communities in Massachusetts, by snowmobile registrations?
As I write, I'm checking Many Eyes, a visualisation site where people take any piece of this torrent of data that interests them - or any piece of data from anywhere they choose - graph it this way and that, and post it for anyone to see. Some are better than others. Snowmobiles in Massachusetts is one of the latest.
The snowmobile capital, if you really want to know, is the town of Otis, population 1,396, with 200, apparently. About one snowmobile for every seven people.
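For anyone who wants to check that ratio, the sum is short. Here is a minimal sketch in Python, using only the two figures quoted above; the function name is mine, purely for illustration, and the figures themselves are as unverified as they were a paragraph ago.

```python
def people_per_item(population: int, count: int) -> float:
    """Return how many residents there are for each registered item."""
    if count <= 0:
        raise ValueError("count must be positive")
    return population / count


# Figures quoted in the text for Otis, Massachusetts (not independently verified).
otis_population = 1396
otis_snowmobiles = 200

ratio = people_per_item(otis_population, otis_snowmobiles)
print(f"About one snowmobile for every {ratio:.1f} people")
```

The exact answer is 6.98, so "one for every seven" is a fair rounding.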
I can't vouch for this data, but the American data.gov site tells you if comparisons between years are safe, and so on. And it doesn't take long to find your way around.
The great advantage of this approach - where the data is free and easily accessible and left to others to exploit - is that whereas raw data looks unappealing, anyone can design tools to make it sing, or mash it up with other data sources, like maps, to add extra layers of information.
Ito Labs' 2008 bicycle counts through the London
One arresting example in the UK is the use of traffic data by Ito Labs to produce graphics that show counts of bikes through the day, based on the UK traffic census. On the real thing, you can slide a timer in the top left to see how this changes through the day.
Another application in the UK is School Guru. Put in your details and it calculates your child's chance of a place at your preferred school - last year, unfortunately. It's not a fortune-teller and doesn't guarantee a place this time, but it gives a steer.
The data on data.gov.uk has already been made public elsewhere. What's new is the attempt to put it in one place and draw attention to it. So it is not perhaps a sudden bout of disclosure so much as an acknowledgement of a trend now well under way towards more open data, encouraged by organisations like the Open Data Foundation or the Guardian's Free our Data campaign.
The list of initiatives grows by the week. The Mayor of London is announcing a data store, and Channel 4 has formed 4iP - the 4 Innovation for the Public fund - and has been inviting bids for grants from people with good ideas for exploiting public information on digital platforms, including iPhone or Facebook. In the next few months, postcode and Ordnance Survey data are expected to become more freely available so that public data can be more easily and excitingly mapped by anyone who cares to try.
Other established media will be thinking hard about how to catch up with all this. But perhaps the biggest effect will be on the public, once used to thinking that data was either unobtainable or served up by big media, now helped to see that it can be easily, colourfully obtained, with scope for interaction, eventually at a high level of detail and relevance. If so, there'll be impatience for more.
The biggest caveat is that almost all data, as Go Figure often has cause to demonstrate, needs handling with care. It is easy to use it to come to the wrong conclusions, to ignore its limitations, to be misled by samples, averages or outliers, to take but a few statistical snares. One of the most important tools for making good use of all this data will be statistical savvy.
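To see how easily an average can mislead, here is a small sketch in Python. The numbers are invented for illustration and come from no government data set: five hypothetical towns, one of them wildly out of line with the rest.

```python
from statistics import mean, median

# Invented figures for illustration only: registrations in five
# hypothetical towns, one of which is far out of line with the others.
registrations = [10, 12, 11, 9, 200]

average = mean(registrations)    # pulled upward by the single outlier
typical = median(registrations)  # the middle value, barely affected

print(f"mean: {average}, median: {typical}")
# prints "mean: 48.4, median: 11"
```

Report the mean alone and every town sounds like a snowmobile hotspot; the median says most have about 11. Neither figure is wrong, but each tells a different story, which is why it pays to ask which one you have been given.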
Infinite as it might already feel, all this is but a small beginning. So although my own sampling of what's out there doesn't convince me that anyone is anywhere near exploiting the potential of the data that's now available, or could be, the game is on.