Differences between joins and data blending
Data blending simulates a traditional left join. The main difference between the two is when the join is performed with respect to aggregation.
Left join
When you use a left join to combine data, a query is sent to the database where the join is performed. Using a left join returns all rows from the left table and any rows from the right table that has a corresponding row match in the left table. The results of the join are then sent back to and aggregated by Tableau.
For example, suppose you have the following tables. If the common columns are User ID and Patron ID, a left join takes all the data from the left table, as well as all the data from the right table because each row has a corresponding row match in the left table.
Data blending
When you use data blending to combine data, a query is sent to the database for each data source that is used on the sheet. The results of the queries, including the aggregated data, are sent back to and combined by Tableau. The view uses all rows from the primary data source, the left table, and the aggregated rows from the secondary data source, the right table, based on the dimension of the linking fields. Dimension values are aggregated using the ATTR aggregate function, which means the aggregation returns a single value for all rows in the secondary data source. If there are multiple values for the rows, an asterisk (*) is shown. Measure values are aggregated based on how the field is aggregated in the view.
You can change the linking field or add more linking fields to include different or additional rows of data from the secondary data source in the blend, changing the aggregated values.
For example, suppose you have the following tables. If the linking fields are User ID and Patron ID, blending your data takes all of the data from the left table, and supplements the left table with the data from the right table. In this case, not all values can be a part of the resulting table because of the following:
- A row in the left table does not have a corresponding row match in the right table, as indicated by the null value.
- There are multiple corresponding values in the rows in the right table, as indicated by the asterisk (*).
Suppose you have the same tables as above, but the secondary data source contains a new field called Fines. Again, if the linking fields are User ID and Patron ID, blending your data takes all of the data from the left table, and supplements it with data from the right table. In this case, you see the same null value and asterisks in the previous example in addition to the following:
- Because the Fines field is a measure, you see the row values for the Fines field aggregated before the data in the right table is combined with the data in the left table.
- As with the previous example, a row in the left table does not have corresponding row for the Fines field, as indicated by the second null value.
Comments
Post a Comment