Where to Start with Microsoft Fabric as a Power BI Developer

One of the questions I often get these days is about the benefits of Microsoft Fabric for Power BI users and developers. Questions like: “If I am not a data engineer or database developer, what are the use cases of Fabric for me?” or “What is in Fabric that I can leverage in my Power BI solution right now, something that I cannot do with Power BI alone?”

Answering these questions brings us to the direct impact of Microsoft Fabric on Power BI users. In this article, I will explain that impact and go through the components you can use from a Power BI point of view: features and components you can apply to your existing Power BI solution to take it to the next level.

What is Microsoft Fabric?

Microsoft Fabric is a full analytics suite delivered as a service (SaaS): one platform, one license, providing everything a data analytics project needs. Microsoft Fabric utilizes multiple workloads to achieve this: Data Integration, Data Engineering, Data Warehousing, Databases, Real-Time Analytics, Business Intelligence, Data Science, and more.

To learn more about Microsoft Fabric, read this article here.

Some of the challenges in Power BI implementations

Although the general feeling is that Power BI is good enough for all analytical needs, the reality is that Power BI alone is not enough to deliver a data analytics project end to end. There are limitations to what it can do and achieve (as with many other tools). Here are some of the shortfalls of Power BI when used by itself in an analytical project:

  • Implementing data warehousing concepts such as SCD Type 2
  • Refreshing the semantic model automatically after a dataflow refresh
  • Re-using a table's data without copying it
  • Import mode needs a refresh; DirectQuery is slow
  • It is not easy to extend for data engineers or data scientists
  • An alert system that can be customized based on the frequency of data changes, or that performs an action when something happens

Of course, there are workarounds for each of these issues, but the good news is that you don’t need workarounds if you use Fabric: one of its immediate advantages is that there is a built-in solution for everything mentioned above.

Resolution for those challenges in Fabric

Because Microsoft Fabric comes with a suite of services that all integrate in a meaningful way, you can resolve the issues mentioned previously using the following:

  • Warehouse or Lakehouse for SCD Type 2
  • Data Pipeline for orchestrating the refresh process
  • Shortcut instead of copying data
  • Direct Lake mode, which doesn’t need a refresh and keeps data nearly as fresh as DirectQuery
  • Lakehouse, Warehouse, and other components that data engineers and data scientists can easily use
  • A more powerful alert system using Data Activator

So now let’s find out what things you will benefit from straightaway in Fabric if you are a Power BI developer.

Wins for Power BI Developers in Fabric

In this section, I focus on things you can immediately benefit from using Fabric if you are a Power BI developer. I won’t talk about data science, for example; although it is compelling, there is a learning curve, and it will take a while to get your head around it and produce a meaningful outcome. However, I will talk about things like Data Pipeline, which you can benefit from within just a few minutes of setup, immediately improving your Power BI implementation.

Things that, in my opinion, immediately improve your Power BI implementation in Fabric are:

  • Data Factory (Dataflow and Data Pipeline)
  • Warehouse or Lakehouse
  • Shortcut
  • Direct Lake

Now, let’s see what each of these is and how it can be helpful.

Dataflow: ETL as a separate layer

Dataflow isn’t a new concept in Power BI. Some years have passed since my very first blog article about Dataflow in Power BI. Dataflows separate the ETL layer from Power BI, so you can leverage the transformation work done in a Dataflow across more than one Power BI solution.

Dataflows in Power BI (Gen1 dataflows) export the data into CSV files stored in ADLS Gen2, which is not necessarily the best way to keep structured data. Dataflows in Microsoft Fabric (Dataflow Gen2) can export data into a Lakehouse, Warehouse, Azure SQL database, or KQL Database. These destinations are far more structured and can be queried with T-SQL. If you want to learn how to upgrade your Gen1 dataflows to Gen2, read my article here.

More features will come to Dataflow Gen2 in the future, and it makes sense to move to this new generation. Dataflow uses the Power Query engine behind the scenes and is one of the components of Data Factory in Fabric.

Data Pipeline: Orchestration and Control Flow Execution

Suppose you have multiple dataflows, one for the dimension tables in the warehouse and one for the fact tables, and you want to run the latter only if the first one executes successfully. After they both complete correctly, you want to refresh the Power BI semantic model. Data Pipeline is a component of Data Factory that enables scenarios like this: in a Data Pipeline, you can configure an activity to run based on the outcome of a previous activity.

This is a simple-to-use tool, and you can achieve a lot just by utilizing this structure: based on the outcome of the execution of Dataflow1, I can choose to run the next activity, which can be another Dataflow, a semantic model refresh, or many other activities from the list of activities available there.
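The control-flow idea above can be sketched in plain Python. This is a conceptual illustration only; a real Data Pipeline is configured visually in Fabric, and the activity names here are hypothetical:

```python
# Conceptual sketch of Data Pipeline control flow: each activity runs only
# if the previous one succeeded, mirroring the "On success" connector
# between pipeline activities. The names are illustrative, not real APIs.

def run_activity(name, action):
    """Run an activity and report whether it succeeded."""
    try:
        action()
        print(f"{name}: Succeeded")
        return True
    except Exception as exc:
        print(f"{name}: Failed ({exc})")
        return False

def refresh_dimension_dataflow():
    pass  # stand-in for a "Dataflow - Dimensions" activity

def refresh_fact_dataflow():
    pass  # stand-in for a "Dataflow - Facts" activity

def refresh_semantic_model():
    pass  # stand-in for a "Semantic model refresh" activity

# Orchestration: facts run only after dimensions succeed, and the
# semantic model refreshes only after both dataflows succeed.
if run_activity("Dataflow - Dimensions", refresh_dimension_dataflow):
    if run_activity("Dataflow - Facts", refresh_fact_dataflow):
        run_activity("Semantic model refresh", refresh_semantic_model)
```

The same dependency chain is what you draw between activities on the pipeline canvas.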

Read more about Data Pipeline in my article below;

The combination of Data Pipeline and Dataflow gives you a full ETL implementation, which immediately enhances your Power BI solution.

Warehouse or Lakehouse: Persistent Structured Dimensional Model Storage

It goes without saying that a structured database system is helpful for a Power BI solution. Although Power BI stores data in the xVelocity engine (a compressed file format loaded into memory for analysis), implementing things such as Slowly Changing Dimensions, Late Arriving Fact tables, or some of the more complex data warehousing requirements needs structured data warehouse storage.

Lakehouse and Warehouse both offer database storage. The data will be stored in the format of tables, and the T-SQL querying language can be used to work with the data. Because the data is stored there, you can quickly implement any requirement for data warehousing, such as the SCD Type 2, Inferred dimension member, or anything else needed.
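To make the SCD Type 2 idea concrete, here is a minimal sketch of the logic in Python. In Fabric you would typically express this in T-SQL against a Warehouse or Lakehouse table; the column names (CustomerKey, City, IsCurrent, ValidFrom, ValidTo) are illustrative, not a prescribed schema:

```python
from datetime import date

# Minimal sketch of SCD Type 2: when a tracked attribute changes, the
# current row is expired and a new row is inserted, preserving history.

def apply_scd2(dimension, incoming, today):
    """Apply one batch of source rows to a Type 2 dimension (list of dicts)."""
    for row in incoming:
        current = [d for d in dimension
                   if d["CustomerKey"] == row["CustomerKey"] and d["IsCurrent"]]
        if not current:
            # Brand-new member: insert as the current row.
            dimension.append({**row, "ValidFrom": today,
                              "ValidTo": None, "IsCurrent": True})
        elif current[0]["City"] != row["City"]:
            # Tracked attribute changed: expire the old row, add a new one.
            current[0]["ValidTo"] = today
            current[0]["IsCurrent"] = False
            dimension.append({**row, "ValidFrom": today,
                              "ValidTo": None, "IsCurrent": True})
    return dimension

dim = []
apply_scd2(dim, [{"CustomerKey": 1, "City": "Auckland"}], date(2024, 1, 1))
apply_scd2(dim, [{"CustomerKey": 1, "City": "Wellington"}], date(2024, 6, 1))
# dim now holds two rows for customer 1: the expired Auckland row and
# the current Wellington row.
```

In a Warehouse, the same pattern becomes an UPDATE plus an INSERT (or a MERGE) keyed on the business key and the IsCurrent flag.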

Read this article to learn more about Lakehouse and where to start with it:

Read this article to learn more about Warehouse;

There are differences between Lakehouse and Warehouse that I explained in the article below;

Adding a Lakehouse or Warehouse to your Power BI implementation is one of your best wins. If there is only one recommendation I want you to use, it is Lakehouse or Warehouse for your Power BI solution.

Shortcut

Often, you want to reuse a table you have already created in another model. This happens a lot with common tables such as Date, Time, Currency, Account, Customer, and most other dimensions. Although you can get the data using a Dataflow, doing so still copies the data into Power BI and duplicates it instead of keeping it in one place as a single version of the truth.

Shortcut is a straightforward yet effective feature in OneLake, incorporated into Lakehouse for this purpose: instead of copying the data, a Shortcut lets you access a table in another location as if it were already in this Lakehouse. The Shortcut is just a link to that table, not a copy.
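The difference between a copy and a Shortcut can be illustrated with a toy Python model. Real OneLake shortcuts are metadata objects pointing at Delta tables; this sketch only contrasts "a link to the same data" with "a duplicate of the data":

```python
# Toy illustration: a copy diverges from the source, a shortcut does not.

source_lakehouse = {"DimDate": [{"DateKey": 20240101}]}

# Copy: a second, independent version of the data
# (conceptually what loading the table again would do).
copied_table = list(source_lakehouse["DimDate"])

# Shortcut: a reference to the one and only table.
shortcut_table = source_lakehouse["DimDate"]

# New data lands in the source...
source_lakehouse["DimDate"].append({"DateKey": 20240102})

print(len(shortcut_table))  # 2 -- the shortcut sees the new row
print(len(copied_table))    # 1 -- the copy is already stale
```

The single version of truth stays where it was created; every consumer of the shortcut reads the same, always-current table.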

This feature also shows how effective Lakehouse can be for the reusability of your Power BI implementation. Read more about shortcuts in my article below;

Direct Lake: Cutting-Edge Technology

If you import the data into Power BI, you gain performance, but you need to schedule refreshes of the semantic model to keep the data up to date. If you use DirectQuery, you won’t need to refresh the semantic model, but sending queries to the data source and waiting for the results kills the performance of the Power BI solution.

Direct Lake is a new storage mode added to Power BI. It applies when Power BI connects to data stored in Delta Lake format in a Lakehouse or Warehouse. Because the engine can read the underlying Parquet files directly, it doesn’t need to re-create or import the data; you get fresh data without a scheduled import, and it still performs at a pace close to Import mode. In short, Direct Lake is like DirectQuery for freshness with the performance of Import.
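The trade-off between the three storage modes can be summarized in a small, purely illustrative Python helper (the function and labels are mine, not part of any Power BI API):

```python
# Toy decision helper encoding the article's comparison of storage modes:
# Import is fast but needs scheduled refreshes; DirectQuery is fresh but
# slow; Direct Lake is both fresh and fast.

MODES = {
    "Import":      {"needs_scheduled_refresh": True,  "fast_queries": True},
    "DirectQuery": {"needs_scheduled_refresh": False, "fast_queries": False},
    "Direct Lake": {"needs_scheduled_refresh": False, "fast_queries": True},
}

def pick_mode(want_fresh_data, want_fast_queries):
    """Return the storage modes satisfying both requirements."""
    return [name for name, traits in MODES.items()
            if (not want_fresh_data or not traits["needs_scheduled_refresh"])
            and (not want_fast_queries or traits["fast_queries"])]

print(pick_mode(want_fresh_data=True, want_fast_queries=True))
# ['Direct Lake'] -- only Direct Lake gives both freshness and speed
```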

Direct Lake connections are the newest type of connection in Power BI. To learn more about it, read my article here:

Although Direct Lake semantic models don’t need refreshing, there is a reframing process that happens behind the scenes; to learn more about it, read my article here:

Summary

Each part of Microsoft Fabric is helpful in a Power BI implementation; Data Science, Real-Time Analytics, and Data Activator are all beneficial. However, what I mentioned in this article doesn’t require expanding the scope of your Power BI requirements; it simply takes your current Power BI implementation, makes it better organized, and makes it easier to maintain using Fabric features such as Data Pipeline, Warehouse or Lakehouse, Shortcut, and Direct Lake.

I often teach Microsoft Fabric to Power BI developers, and I provide consulting to help them with their day-to-day Microsoft Power BI and Fabric challenges. If you want to learn Fabric and Power BI in a structured way, check out our upcoming training sessions here.

Here are some helpful links related to this article:

Reza Rad
Trainer, Consultant, Mentor
Reza Rad is a Microsoft Regional Director, an author, trainer, speaker, and consultant. He has a BSc in Computer Engineering and more than 20 years of experience in data analysis, BI, databases, programming, and development, mostly on Microsoft technologies. He has been a Microsoft Data Platform MVP for 12 continuous years (from 2011 until now) for his dedication to Microsoft BI. Reza is an active blogger and co-founder of RADACAD. He is also co-founder and co-organizer of the Difinity conference in New Zealand, the Power BI Summit, and the Data Insight Summit.
Reza is the author of more than 14 books on Microsoft Business Intelligence, most of them published in the Power BI category, including Power BI DAX Simplified, Pro Power BI Architecture, Power BI from Rookie to Rock Star, the Power Query book series, and Row-Level Security in Power BI.
He is an international speaker at Microsoft Ignite, Microsoft Business Applications Summit, Data Insight Summit, PASS Summit, SQL Saturday, and SQL user groups, and he is a Microsoft Certified Trainer.
Reza’s passion is to help you find the best data solution; he is a data enthusiast.
His articles on different aspects of technologies, especially on MS BI, can be found on his blog: https://radacad.com/blog.
