A recent talk by Kenneth Cukier at TED highlights what’s good — and some of what’s bad — about big data.
What is it about “big data” that’s really new? Cukier makes three points. Firstly, we’ve gone from having a small amount of data and trying to understand the world from it, to having a huge amount of data, more than we ever thought possible, that needs new theories and tools to interpret it. Secondly, data has gone from being a store to being a flow: something that passes by in real time, often resistant to collection and offline analysis. Thirdly, and perhaps of most general importance, we’ve taken things that have always been informational and rendered them into information bases that are persistent and searchable. (Location is the prime example: something that conveys a huge amount of contextual information, now being stored and re-used at scale.)
Cukier highlights the importance of machine learning over big data. Rather than explaining the nature of a problem to a computer, with sufficient data and some care in re-framing the problem we can let computers work out the solution for themselves. We need to understand this approach better as it is applied to science and the humanities, as well as to engineering and commerce.
The talk is on YouTube and runs about 15 minutes.