Oct 27

Big Data

I attended a meetup last night hosted by Chris Dixon and led by Roger Ehrenberg on the topic of big data. There was a lot of talk about algorithms, machine learning, and key-value pairs, but as the evening wore on, I became more convinced that these are tools and the big wins still come from understanding humans more than understanding machines.

I pushed for an example of a consumer-facing web service where the consumer's experience was meaningfully improved through the use of “big data” techniques. The best answer was Google, but everyone quickly acknowledged that PageRank was people powered. Yes, it is possible to do citation analysis at scale because we now have the horsepower and data structures, but people provide the powerful insight. I also learned that the big wins in the Netflix algorithm challenge did not come from better algorithms; they came from better classification. The winners added “high brow” and “low brow” as categories of movies. Google language translation was another example, but apparently they used humans to train the algorithm.

Someone asked about Shazam, and Chris Wiggins of Columbia pointed out that it was a really “coarse” algorithm. In other words, they radically simplified the problem before turning the computers loose.

I came away thinking that the big breakthroughs will continue to be driven by human insight. Sophisticated data analysis will open up new opportunities for human insight, but we will still need to put our wet brains to use to cover the last mile.

