What is Microsoft Fabric Data Warehouse?

What is Microsoft Fabric Data Warehouse?

Microsoft Fabric, an end-to-end Saas cloud-based analytics platform, has many workloads and objects. One of the Objects is called Data Warehouse, or in short, Warehouse. In this article, I’ll explain what the Data Warehouse in Microsoft Fabric is and how it works, and I will show you an example of it in action.

Video

Microsoft Fabric

Microsoft Fabric is an end-to-end Saas cloud-based analytics platform from Microsoft. This platform uses multiple workloads to deliver a complete analytics solution. Workloads in Fabric include data ingestion, data warehousing, data engineering, etc. To learn more about Fabric, I suggest reading the below articles;

Data Warehouse

Microsoft Fabric Data Warehouse is a database system that stores data in OneLake and provides a medium to interact with the database using SQL commands. The Fabric Data Warehouse, which is also called Data Warehouse, or in short, Warehouse, also provides a powerful computing engine behind the scene to account for large volumes of data and support a fast-performing database system. The term Data Warehouse comes from the fact that this is not usually a place to store transactional data for an operational system (for that, you can use Azure SQL Database). A Data Warehouse, in generic Business Intelligence terminology, is a place where you would store the data that needs to be analyzed.

In short, Microsoft Fabric Data Warehouse is a fast-performing, fully SQL-enabled data analytics engine that stores data in OneLake.

This definition of Data Warehouse would perhaps make you wonder, what is the difference between the Data Warehouse and Lakehouse then? There are a few differences, and the main important ones are; Lakehouse supports both structured and unstructured data, whereas the Data Warehouse only supports structured data. The Lakehouse supports SQL commands only to read data, whereas the Data Warehouse supports it for read and write operations. This would require a full article to explain, but I hope this brief explanation helps you differentiate between these two objects of Microsoft Fabric.

Getting Started with Warehouse

To start creating a Warehouse, first, you need to enable Fabric, which I explained here. Then you can go to the Fabric portal and select Data Warehousing workload.

Navigate to the Data Warehousing workload of Microsoft Fabric.

Once in the Data Warehouse workload, click on Creating a new Warehouse.

Creating a new Data Warehouse in Microsoft Fabric

You notice that the screenshot has Synapse Data Warehouse in the header. That is because Azure Synapse is the component behind the scene responsible for the Data Warehousing workload in Fabric.

After setting a name for the Warehouse, you will be navigated to the Warehouse Explorer. Here you can choose a method to load data into the Warehouse.

Warehouse Explorer

You can load the data into the Warehouse using Data Pipelines or writing SQL code. There is also an option to use a sample database, which I am using here. The sample data will have some tables preloaded and ready for work.

Sample data loaded into the Warehouse.

You can also use other methods to load the data. The screenshot below is an example of the copy data activity running from a Data Factory Pipeline.

Loading data into the Warehouse using Data Factory Pipeline and the Copy data activity

In the Warehouse Explorer, you can explore the data and create database objects such as Stored Procedures, functions, and views. You can also specify security specifications.

The Warehouse explorer

However, unlike Lakehouse, you cannot explore files. Because in Warehouse, you can only store structured data (Structured data means data tables).

How to load data into Warehouse

There are multiple methods to load data into Warehouse; the most common methods are;

The SQL commands you can use for the Warehouse can be used for read and write operations. Here is another difference between the SQL Endpoint of Lakehouse and the Warehouse; The SQL Endpoint in the Lakehouse only allows read operations, whereas the Warehouse allows read and write operations.

This means you can write SQL code that inserts, updates or deletes data in the Warehouse.

Lakehouse SQL Endpoint only supports read-only operations, whereas Warehouse supports read-and-write operations.
Source of the image: https://learn.microsoft.com/en-us/fabric/data-warehouse/data-warehousing

This means you can have SQL codes like the below to load data into tables;

INSERT INTO [dbo].[table2]
SELECT * FROM [dbo].[table1] 
WHERE [updated] > '2023-05-31';

Reading Data from Warehouse

There are also multiple methods to read data from the Warehouse. Here are some of those;

  • SQL query (run either in the Warehouse Explorer or in other tools such as SQL Server Management Studio)
  • Visual query (Power Query Editor online)
  • Power BI Get Data (Or Dataflow Get Data etc.)

You can write SQL queries simply inside the Warehouse Explorer like below;

Querying data using SQL Query from the Warehouse

You can also use Visual Query, which will use the Power Query Editor online to build queries.

Creating Visual Query for the Microsoft Fabric Warehouse

It is important to note that you won’t have Power Query’s full functionality in this editor. This would support mainly actions that can be folded back to the source. You can even View the SQL code generated by the visual query.

You can also use the SQL connection string (which you can acquire from the settings of the Warehouse Explorer) to connect tools such as SSMS to the Warehouse.

Getting the SQL connection string for the Microsoft Fabric Warehouse

Lastly, you can use Power BI to get data from Warehouse. One thing to notice here is that, from the Power BI point of view, there will be different ways to get data from the Warehouse;

  • Power BI report connects live to the dataset associated with the Warehouse (this dataset is automatically generated but can be edited)
  • Power BI reports connecting DirectQuery or Import Data to the Warehouse itself.

The Warehouse comes with its own Power BI Dataset. This dataset is automatically generated with the Warehouse.

The auto-generated Power BI Dataset for the Warehouse

This Power BI Dataset can be modified using the Warehouse Explorer in the Model tab.

Changing the Power BI Dataset online

Power BI Desktop can then connect to this dataset using Get Data from Warehouse or Get Data from Dataset.

Get data from Warehouses in the Power BI Desktop.

If you get data from the Warehouse, you will create a live connection to the associated Power BI Dataset. This is similar to the way that Power BI Datamart and Lakehouse work.

However, if you connect to the SQL endpoint, you can choose to have a DirectQuery connection to the Warehouse or import that data into your new Power BI report and dataset.

Get data from the Warehouse options in the Power BI Desktop

From the Power BI report’s point of view, it won’t make any difference if you connect to Lakehouse or Warehouse; you would have the same outcome if you are connecting to the auto-generated dataset associated with the Lakehouse or Warehouse.

You can also build the report directly from the Warehouse Explorer online.

Building Power BI report online

Summary

Microsoft Fabric Data Warehouse, or Warehouse, is a high-performant structured database system that stores data in OneLake. Microsoft Fabric Warehouse will enable high performance with high computing power behind the scene. The difficulties of scaling up and down are hidden from your side. All you see is a good-performing model. The Warehouse can be loaded with data using Data Pipeline, Dataflows Gen2, or SQL commands, and the data can be read using SQL endpoint’s query commands and tools such as Power BI. A Power BI dataset will be associated with the Warehouse, which can be edited online.

At this stage, you might ask what the difference is between Warehouse, Lakehouse, and Power BI Datamarts. Of course, there are many similarities between these three, and there are points available in all of them, but there are differences too. I will explain that in another article and video, as this one is already long. Stay tuned for the comparison article.

To learn more about Fabric, I suggest reading the below articles;

Reza Rad on FacebookReza Rad on LinkedinReza Rad on TwitterReza Rad on Youtube
Reza Rad
Trainer, Consultant, Mentor
Reza Rad is a Microsoft Regional Director, an Author, Trainer, Speaker and Consultant. He has a BSc in Computer engineering; he has more than 20 years’ experience in data analysis, BI, databases, programming, and development mostly on Microsoft technologies. He is a Microsoft Data Platform MVP for 12 continuous years (from 2011 till now) for his dedication in Microsoft BI. Reza is an active blogger and co-founder of RADACAD. Reza is also co-founder and co-organizer of Difinity conference in New Zealand, Power BI Summit, and Data Insight Summit.
Reza is author of more than 14 books on Microsoft Business Intelligence, most of these books are published under Power BI category. Among these are books such as Power BI DAX Simplified, Pro Power BI Architecture, Power BI from Rookie to Rock Star, Power Query books series, Row-Level Security in Power BI and etc.
He is an International Speaker in Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday and SQL user groups. And He is a Microsoft Certified Trainer.
Reza’s passion is to help you find the best data solution, he is Data enthusiast.
His articles on different aspects of technologies, especially on MS BI, can be found on his blog: https://radacad.com/blog.

Leave a Reply