OneLake to Rule Them All

James Wood
Switched On: The Bowdark Blog
Feb 28, 2024

--

Ever since Microsoft introduced the world to Fabric in early 2023, there’s been a ton of buzz about OneLake. Customers frequently ask what it is, how it relates to the rest of Fabric and the Microsoft Intelligent Data Platform, how it impacts long-term enterprise architecture plans, and more.

Given the abundance of questions, I thought we’d take some time out to break down what OneLake is and why we think it’s a game changer in the data/analytics space.

What is OneLake?

Microsoft OneLake is a new “data lake-as-a-service” offering from Microsoft. According to Microsoft, OneLake is “a single, unified, logical data lake for your whole organization”.

Logically, OneLake lies at the center of Fabric — Microsoft’s new all-in-one analytics solution (see Figure 1). Whereas before customers may have maintained multiple data lakes, the “one” in “OneLake” implies that OneLake is positioned as the only data lake within your organization.

Figure 1: OneLake as the Foundation of Microsoft Fabric

Microsoft’s Goals with OneLake

From a marketing perspective, Microsoft is fond of saying that OneLake is like the OneDrive for your data (estate). As such, Microsoft is positioning OneLake as the center of gravity across your entire multi-cloud organization.

In other words, OneLake is designed to be:

  • A single place to land all of your data — both structured and unstructured.
  • A unified and logical data lake that breaks down data silos and makes it easier to blend and analyze data together.
  • A centralized repository with comprehensive services to simplify data discovery, governance, and security — enabling users and applications of all types to access the data they need.

Bringing Order Out of Chaos

With all the new branding around Fabric, one of the most common questions we get from customers is how OneLake relates to existing services such as Azure Data Lake Storage and Azure Synapse Analytics. The answer here is a bit tricky and requires some context.

Before OneLake, companies were building data lakes by hand. For example, a customer might create an Azure Data Lake Storage Gen2 (ADLS) storage account and then build a data lake using common data lake patterns. As you might expect, the results of these efforts were mixed.

Without enterprise-level buy-in and common standards, many of these data lake projects devolved into siloed, department-specific lakes. Beyond the obvious issue of data duplication, companies were also feeling the pain of redundant administrative overhead, since each lake had to be secured, governed, and maintained separately.

Figure 2: Siloed Data Lakes with Lots of Duplication (Image from Microsoft Ignite Conference)

Thinking in cloud terms, Fabric/OneLake move you a level up in terms of abstraction. Instead of buying up a bunch of IaaS and/or PaaS-level services to build your data lake(s) by hand, OneLake offers an opinionated, SaaS-level service that comes pre-equipped with many powerful features to organize and effectively administer your organization’s data lake.

Under the hood, Fabric is built on top of tried-and-true Azure-based services such as ADLS and Synapse. This means that Fabric offers the same near-infinite horizontal scaling across Azure’s global footprint. The primary difference is that Fabric takes care of provisioning, scaling, and administering those underlying services for you so that you can focus less on data lake construction and more on driving value from your data estate.
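One practical consequence of the ADLS foundation is that data in OneLake can be addressed with ADLS-style URIs. Here's a minimal sketch, assuming the OneLake endpoint `onelake.dfs.fabric.microsoft.com` and the `<item>.<itemtype>` path convention from Microsoft's documentation. The helper name and the workspace and lakehouse names are made up for illustration, and the sketch does not handle names that need escaping.

```python
# Assumed OneLake endpoint; check it against the current Fabric documentation.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_abfss_uri(workspace: str, item: str, item_type: str, path: str = "") -> str:
    """Build an ADLS-style abfss URI for a path inside a Fabric item.

    This is a hypothetical helper, not part of any Microsoft SDK.
    """
    base = f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}"
    clean = path.strip("/")  # tolerate leading/trailing slashes in the path
    return f"{base}/{clean}" if clean else base


# Hypothetical workspace ("Sales") and lakehouse ("Orders").
print(onelake_abfss_uri("Sales", "Orders", "Lakehouse", "Files/raw/2024"))
```

Tools that already know how to talk to ADLS Gen2 can generally be pointed at a URI of this shape, which is part of what makes the move to OneLake less disruptive than the new branding might suggest.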

For example, Figure 3 below shows how OneLake uses concepts like Fabric domains, workspaces, and shortcuts to logically organize your data lake and eliminate the need for redundant data copies.

Figure 3: Domains & Workspaces Concept within OneLake
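To make the shortcut idea a bit more concrete, here's a toy in-memory model of the hierarchy. It is not the Fabric API; every class and name below is hypothetical. The only point is that a shortcut refers to data owned by another workspace instead of copying it.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    rows: list = field(default_factory=list)


@dataclass
class Workspace:
    name: str
    tables: dict = field(default_factory=dict)     # data this workspace owns
    shortcuts: dict = field(default_factory=dict)  # references to data owned elsewhere

    def add_table(self, table: Table) -> None:
        self.tables[table.name] = table

    def add_shortcut(self, name: str, target: Table) -> None:
        # Store a reference to the target table, not a copy of its rows.
        self.shortcuts[name] = target

    def read(self, name: str) -> list:
        source = self.tables.get(name) or self.shortcuts[name]
        return source.rows


@dataclass
class Domain:
    name: str
    workspaces: dict = field(default_factory=dict)


# Sales owns the orders data; Finance only points at it.
sales = Workspace("Sales")
finance = Workspace("Finance")
sales.add_table(Table("orders", [{"id": 1}, {"id": 2}]))
finance.add_shortcut("orders", sales.tables["orders"])

# A new row landed by Sales is immediately visible to Finance, with no second copy.
sales.tables["orders"].rows.append({"id": 3})
print(len(finance.read("orders")))
```

In this model there is exactly one list of order rows, no matter how many workspaces read it. That is the property that lets OneLake avoid the duplication shown in Figure 2.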

It’s worth noting that Fabric’s thoughtfully opinionated design tackles more than just the technical stuff; it smartly sorts out practical business concerns too. By keeping compute and storage resources separate, Fabric provides flexible options that allow companies to manage their budgets more effectively. For example, IT has options to set quotas on resource usage by department, making it easier to keep tabs on costs and resource use. These types of features make it much easier for IT to push back on those department-level requests to create separate data lakes, etc.

Closing Thoughts

If your company doesn’t have a data lake or your data lake(s) have become data swamps, then OneLake offers a comprehensive and scalable solution that can provide a one-stop shop for data across the enterprise.

In the coming weeks, we’ll be expanding on these concepts further by looking at ways that Fabric streamlines access to OneLake for a variety of consumer types.
