I’ve been thinking a bit about data lately, partly because it’s all over the news, from the NSA to Facebook & co. to the “new” allegedly fact-based reporting of Vox.com and FiveThirtyEight.com, partly because I’ve been toying with some aspects of the subject in my Abnormality series of silly stories, and partly because it’s become a part of my day job. Some people seem to be concerned about the sheer amount of it, although a billion tons of dog shit is still basically dog shit. It takes some seriously sophisticated software to pull meaning out of the heaps, and it all depends on what the heaps contain in the first place. Some of these worries are just misplaced – massive amounts of data about oil changes are not going to cause trouble for anyone – and most of the uses of personal data can be broken down into three major categories: stealing your money outright, trying to get at your money through advertising, or screwing you over because of political affiliations. The first and third of these are genuine concerns, and it’s impossible to be completely secure from such threats, even if you never use a computer or a smartphone. We are all in the pool and “they” can find out pretty much anything about us if “they” really want to.
There are people trying to use big data to make predictions about stuff, but it seems clear that predicting the future is always fairly futile no matter what information you have. Aside from spying, theft and marketing, I wonder if there is much genuine use at all for the petabytes of data that are being generated and stored every day. I would like to think of something more fun than calculating what percentage of someone’s paintings feature one or more trees or what percentage of love songs actually contain the word “love”. I guess there’s a lot of “low-hanging fruit” like that. Data will be used for good and for evil, I suppose, but most of all it will be used merely for the hell of it.