If you’ve decided to implement a data lake, you might want to keep Gartner’s assessment in mind: roughly 80% of all data lake projects fail. Obviously, you want to be in the 20% that succeed. But how do you get there?

Diving into Data Lakes

Roughly five years ago, all data lakes were based on Hadoop. Architects’ first thought was how to stand up the Hadoop cluster, and their next was how best to load data into the lake. They never thought about how the data was actually going to be consumed.

As a result, only the most technically minded individuals could access and use the data in these Hadoop-based data lakes. Inside each lake were separate data silos, accessible only to the most sophisticated data analysts and data scientists. Unsurprisingly, this did not deliver an impressive return on investment: companies would lay down $5 million or $10 million to build out a data lake, only to discover that a mere 5% of their users could get any benefit from it.

More recently, data lakes have moved to the cloud, and organizations have been expanding their scope, so they are no longer just for the most technical users.

Read more at https://www.datavirtualizationblog.com. Originally published on March 11, 2021.
