Data Lake:
A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed.
A traditional hierarchical data warehouse stores data in files and folders, whereas a data lake uses a flat architecture to store the data. When a data element is stored in the data lake, it is assigned a unique identifier, and information about it is stored as metadata. This metadata can then be queried easily to find the data relevant to a given requirement.
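The flat layout described above can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions, not how any particular data lake product is implemented: the `FlatStore` class, its `put` and `find` methods, and the tag names `source` and `fmt` are all hypothetical names made up for this example.

```python
import uuid


class FlatStore:
    """Toy in-memory stand-in for a flat data lake store (hypothetical)."""

    def __init__(self):
        self.objects = {}   # unique identifier -> raw bytes, no folder hierarchy
        self.metadata = {}  # unique identifier -> dict of descriptive tags

    def put(self, raw, **tags):
        """Store raw data as-is and return the identifier assigned to it."""
        object_id = str(uuid.uuid4())
        self.objects[object_id] = raw
        self.metadata[object_id] = dict(tags)
        return object_id

    def find(self, **criteria):
        """Return identifiers whose metadata matches every given tag."""
        return [
            object_id
            for object_id, tags in self.metadata.items()
            if all(tags.get(key) == value for key, value in criteria.items())
        ]


store = FlatStore()
log_id = store.put(b"2015-01-01 GET /index.html 200", source="weblog", fmt="text")
csv_id = store.put(b"id,amount\n1,9.99\n", source="sales", fmt="csv")

# Query the metadata, then fetch the raw element by its identifier.
sales_ids = store.find(source="sales")
sales_raw = store.objects[sales_ids[0]]
```

The point of the sketch is that nothing is located by path: every element sits at the same level, and the metadata is what makes it findable.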
The underlying storage infrastructure is typically built on Hadoop technologies. Data is usually loaded into the data lake from different sources using ELT tools (Sqoop, the command line, scripts, Talend, Pentaho, etc.). The load is very fast because it runs in parallel and no schema check happens while the data is being loaded; the schema is checked only when the data is read.
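The "no schema check on load, schema check on read" behaviour can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the API of Hadoop or of any of the tools named above; `raw_zone`, `load`, and `read_with_schema` are hypothetical names, and treating the raw data as CSV text is an assumption made for the example.

```python
import csv
import io

raw_zone = []  # stands in for the lake's raw storage (hypothetical)


def load(raw_text):
    """Load step: append the data as-is, with no schema validation."""
    raw_zone.append(raw_text)


def read_with_schema(schema):
    """Read step: apply a schema (column name -> converter) at query time."""
    for raw_text in raw_zone:
        for row in csv.DictReader(io.StringIO(raw_text)):
            try:
                yield {column: cast(row[column]) for column, cast in schema.items()}
            except (KeyError, ValueError, TypeError):
                # The row does not fit this schema; skip it at read time.
                continue


# Both loads succeed, even though the second one is not tabular at all.
load("id,amount\n1,9.99\n2,oops\n")
load("free-form log line\n")

rows = list(read_with_schema({"id": int, "amount": float}))
```

Loading stays cheap because it is just an append. The cost of interpreting the data, and of discovering rows that do not fit, is paid by whoever reads it, and a different reader can apply a different schema to the same raw data.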
"Data lake" is a marketing term that is used to refer to a large set of data.
![Picture](/uploads/4/3/1/5/43157771/4450955_orig.jpg)
Splice Machine example: using Splice Machine as an operational DB (photo courtesy of Splice Machine)