
Showing posts from June, 2020

Data Lake : Swamp and DataOps

Data lakes are gaining popularity in consumer and enterprise data strategy because they offer a wide variety of ingestion, conformance, analytical, and visualization capabilities. As interest in and adoption of data lakes grow across multiple sectors, best practices, potential pitfalls, and operationalization techniques are being blended into solutions and products. More often than not, best practices are followed closely during the initial days and lose focus in the post-implementation phases. Although the term is now a decade old, "Data Lake" can be quickly understood through the ecology of a freshwater lake, where water (data) is collected from sources ranging from small streams (e.g. batch logs, weblogs) to large rivers (e.g. unstructured data, images, videos). Like the littoral zone in a freshwater lake, a data lake has staging zone(s) where specific types of data are analyzed, filtered, and consumed. The photic zone is within eyesight and can host batch, ETL/ELT processes within the data lake i...