Date: Thursday, February 19, 2015
Location: San Jose Convention Center, 150 West San Carlos St, San Jose, CA 95113
Speakers: Daniel Eklund and Rick Stellwagen of Think Big, a Teradata Company
Schooling the Fish: Governing the Variety in Your Data Lake
The advent of Hadoop democratized batch data processing for anyone with the skills to deploy commodity servers, and virtualization and PaaS have extended this “democratization” to anyone with a credit card. While the parallel processing of MapReduce and newer computational models (on top of YARN) have challenged the shared-nothing architectures of established vendors, the real game changer is the notion of a Data Lake: an enterprise “warehouse” for data of any form, from relational archives to device data to truly unstructured content. This talk will discuss how variety and “schema on read” have become a fundamental axis in the world of enterprise data solutions. We will discuss the architectures and challenges encountered in our client engagements and review common elements that have proven successful in turning the “Data Lake” into a viable, game-changing paradigm.