In the previous article, you learned the basics of relationships, you learned why we need a relationship, and what is the filtering impact of it across multiple tables. In this article, you will learn about one of the most important properties of a relationship called Cardinality. Understanding what the meaning is of 1-1, 1-Many, Many-1 and Many-Many relationship is the purpose of this article. If you want to learn more about Power BI, read Power BI book from Rookie to Rock Star.
Download the Pubs.xlsx the dataset for examples of this article here.
Read the first part of the Power BI relationship series: Back to Basics: Power BI Relationship Demystified.
Relationships in Power BI
Power BI relationships give us the ability to have fields from multiple tables and filtering ability across multiple tables in the data model. Relationships are based on a field, which will connect two tables, and filter one based on the other (or vice versa depends on the direction). As an example, we can filter the Qty of the Sales table by the State in the Store table, as long as there is a relationship between Sales and Store table based on stor_id;
And the relationship between the tables is as below;
To learn more about the details of the relationships, and why we need that, read this article.
What is the Cardinality of the relationship?
When you create a relationship between two tables, you get two values, which can be 1 or * on the two ends of the relationship between two tables, called as Cardinality of the relationship.
The two values of 1 or * are saying that the field in that relationship has how many of that value per line in that table. Let’s check it with an example;
In the Stores table, we have one unique value per stor_id per row.
So if this field participates in one side of a relationship, then that side will take 1 as the Cardinality indicator, Which is called as ONE side of the relationship.
However, the stor_id in the Sales table is not unique per each data row in that table. We have multiple rows for each stor_id. Or let’s say; in each store, there are multiple sales transactions happening (which is normal of course);
So if the stor_id in the Sales table is part of a relationship, that side of the relationship will become the *, or what we call the “MANY” side of the relationship.
So based on what we know so far, If we create a relationship based on stor_id between the two tables of Sales and Stores here is the output:
This relationship can be read in two ways;
- One-to-Many (1-*) relationship from the Stores table to the Sales table
- Many-to-One (*-1) relationship from Sales table to Stores table
They are both the same of course, and they will look exactly like each other in the diagram view. Now that you know what the Cardinality is let’s check all different types of Cardinality.
Types of Cardinality
There are four types of cardinality, as below:
- 1-1: one-to-one
- 1-*: one-to-many
- *-1: many-to-one
- *-*: many-to-many
Let’s check each of these types one by one.
One-to-Many, or Many-to-One
This is the most common type of cardinality used in data models. This type of cardinality means one of the tables has unique values per each row for the relationship field, and the other one has multiple values. The example, that you have seen previously between the Stores and Sales table based on the stor_id, is a many-to-one or one-to-many relationship;
There are two ways of calling this relationship; One-to-Many or Many-to-One. Depends on what is the source and destination table.
For example, the configuration below means from the Sales table to the Stores table relationship is Many-to-One.
And below shows the relationship as One-to-Many from Stores table to the Sales table;
These two are both ending with creating the same relationship as below:
It means there is no difference in one-to-many or many-to-one, except the angle that you are reading that from. If you look at this from Stores table, you have a “one-to-many” relationship. If you look at this from the angle of Sales table, you have a “many-to-one” relationship. And they both are the same with no difference at all. so from now on in this article, whenever you read many-to-one, or one-to-many, you know that you can read it the other way around too.
For the rest of the article, I will be using terms of FACT and DIMENSION tables, which I will explain them separately in another article in details. In the meantime, I recommend reading this data preparation article. a short explanation of terms are as follows;
- Fact table: the table that has the numeric values which we want either in aggregated level or detailed output. fields from this table usually are used as the VALUE section of visuals in Power BI.
- Dimension table: the table that has descriptive information, used for slicing and dicing the data of the fact table. fields from this table often used as Slicers or Filters, or Axis of visuals in Power BI.
Many-to-One Relationship between Fact and Dimension tables
Many-to-one is a relationship commonly used between the fact table and dimension tables around it. The example above is between Sales (which is the fact table), and Stores (which is a dimension table). If we bring another table into the model: Titles (based on title_id in both tables: Sales and Titles). You can see the same many-to-one relationship pattern exists;
Many-to-One relationship between Dimension and Dimension tables
This type of relationship although is often used in many models, can be always the subject of investigating for better modeling. In the ideal data model, you should NOT have a relationship between two dimension tables directly. Let’s check that as an example;
Let’s say there is a different model from what you have seen so far in this example; Sales table, and a Product table, and two tables for category and subcategory information of the product as below;
As you can see in the above relationship diagram, all relationships are many-to-one. Which is fine. However, if you want to slice and dice the data of the fact table (SalesAmount for example), by a field from DimProductCategory table (ProductCategory name for example), it requires three relationships to be processed;
This will add consume some of the processing power, and it also brings a lot of other issues, such as the filtering from the “many” side of the relationship to the “one” side of it. This type of relationship is not recommended. I have dedicated a whole article to this discussion of why you need to combine dimension tables to avoid such a scenario. I strongly recommend to read it here.
A better model would be combining category and subcategory tables with the product and having one single many-to-one relationship from the fact table to the DimProduct table. More details in the link above.
A one-to-one relationship happens only in one scenario when you have unique values in both tables per column. An example of such scenario is when you have a Titles and a Titles Details table! They both have one row per title. So If I create a relationship it would look like this:
When you have a one-to-one relationship between two tables, they are a good candidate to be combined with each other using Merge in Power Query. Because both tables have the same number of rows most of the time, or even if one of them has fewer rows still considering the compression method of Power BI xVelocity engine, the memory consumption would be the same if you bring it in one table. So If you have a one-to-one relationship, then think about combining those tables seriously.
It would be better if we combine both tables above in one table which has everything about the Title in it.
Many-to-Many Relationship: Weak Relationship
You have multiple records for each value in the joining field between tables, for both tables. If you think about it for a second, you can see that this scenario happens when you have tables that are related to each other with no shared dimension in between! Let’s check one example; Let’s say I have a Fact Inventory table and a Fact Sales table. They both have multiple records per product. and I want to connect them together using the Product ID. this has to be a many-to-many relationship because there is no product ID field which has unique values in it.
what if you have more than one table with that scenario?
You will end up with creating a many-to-many relationship between every two tables, and then because it causes circular filtering in the relationship, you end up with an inactive relationship! There are tons of other issues with the many-to-many relationships, and most of the time, they are all result of not having a shared dimension in between. I dedicated a whole article about what is a shared dimension, and how to create it to avoid a modeling mistake like above, you can read it here.
The many-to-many relationship causes tons of issues, and that is why it is called a weak relationship too. Most of the time, it can be resolved with creating a shared dimension and creating one-to-many relationships from the shared dimension to the fact tables. AVOID this type of relationship in your model.
A better model for the above sample would be using shared dimensions as the diagram below;
The cardinality of the relationship means having unique or multiple instances per value for the joining field between two tables. The most common type of cardinality is one-to-many or many-to-one which happens between fact and dimension tables. However, you can find one-to-one relationships too. One-to-one relationships are a good candidate to be combined into one table. Sometimes, for some types of one-to-many relationships, it is better to combine tables as well to create a flattened dimension. The type of relationship that you should be avoiding is the many-to-many relationship which can be resolved with creating a shared dimension.
The relationship discussion is one of the most basic, however, fundamental concepts of modeling in Power BI. I strongly recommend you to read more about it in the articles below:
- Back to Basics: Power BI Relationships Demystified
- The cardinality of the Relationship (1-1, 1-Many, Many-1, Many-Many) – this article
- The direction of the Relationship
- Active or Inactive Relationship
- Relationship Based on Multiple Fields
Do you have a scenario in your relationships which was not covered in here and you don’t know how to solve it? let me know in the comments below.