Skip to main content

Data Blending in Tableau

All about Data Blending in Tableau

Prerequisites for data blending
Differences between joins and data blending
Blend your data
Data blending limitations
Data blending is a method for combining data that supplements a table of data from one data source with columns of data from another data source.
Usually you use joins to perform this kind of data combining, but there are times, depending on factors like the type of data and its granularity, when it's better to use data blending.
For example, suppose you have transactional data stored in Salesforce and quota data stored in an Excel workbook. The data you want to combine is stored in different databases, and the granularity of the data captured in each table is different in the two data sources, so data blending is the best way to combine this data.
Data blending is useful under the following conditions:
  • You want to combine data from different databases that are not supported by cross-database joins.
    Cross-database joins do not support connections to cubes (for example, Oracle Essbase) or to some extract-only connections (for example, Salesforce). In this case, set up individual data sources for the data you want to analyze, and then use data blending to combine the data sources on a single sheet.
  • Data is at different levels of detail.
    Sometimes one data set captures data using greater or lesser granularity than the other data set.
    For example, suppose you are analyzing transactional data and quota data. Transactional data might capture all transactions. However, quota data might aggregate transactions at the quarter level. Because the transactional values are captured at different levels of detail in each data set, you should use data blending to combine the data.
Use data blending instead of joins under the following conditions:
  • Data needs cleaning.
    If your tables do not match up with each other correctly after a join, set up data sources for each table, make any necessary customizations (that is, rename columns, change column data types, create groups, use calculations, etc.), and then use data blending to combine the data.
  • Joins cause duplicate data.
    Duplicate data after a join is a symptom of data at different levels of detail. If you notice duplicate data, instead of creating a join, use data blending to blend on a common dimension instead.
  • You have lots of data.
    Typically joins are recommended for combining data from the same database. Joins are handled by the database, which allows joins to leverage some of the database’s native capabilities. However, if you're working with large sets of data, joins can put a strain on the database and significantly affect performance. In this case, data blending might help. Because Tableau handles combining the data after the data is aggregated, there is less data to combine. When there is less data to combine, generally, performance improves.
    Note: When you blend on a field with a high level of granularity, for example, date instead of year, queries can be slow.

Prerequisites for data blending

Your data must meet the following requirements in order for you to use data blending.

Primary and secondary data sources

Data blending requires a primary data source and at least one secondary data source. When you designate a primary data source, it functions as the main table or main data source. Any subsequent data sources that you use on the sheet are treated as a secondary data source. Only columns from the secondary data source that have corresponding matches in the primary data source appear in the view.
Using the same example from above, you designate the transactional data as the primary data source and the quota data as the secondary data source.
Note: Cube (multidimensional) data sources must be used as the primary data source. Cube data sources cannot be used as a secondary data source.

Defined relationship between the primary and secondary data sources

After designating primary and secondary data sources, you must define the common dimension or dimensions between the two data sources. This common dimension is called the linking field.
Continuing the example from above, when you blend transactional and quota data, the date field might be the linking field between the primary and secondary data sources.
  • If the date field in the primary and secondary data sources have the same name, Tableau creates the relationship between the two fields and shows a link icon () next to the date field in the secondary data source when the field is in the view.
  • If the two dimensions don’t have the same name, you can define a relationship that creates the correct mapping between the date fields in the primary and secondary data sources.

Differences between joins and data blending

Data blending simulates a traditional left join. The main difference between the two is when the join is performed with respect to aggregation.

Left join

When you use a left join to combine data, a query is sent to the database where the join is performed. Using a left join returns all rows from the left table and any rows from the right table that has a corresponding row match in the left table. The results of the join are then sent back to and aggregated by Tableau.
For example, suppose you have the following tables. If the common columns are User ID and Patron ID, a left join takes all the data from the left table, as well as all the data from the right table because each row has a corresponding row match in the left table.


Data blending

When you use data blending to combine data, a query is sent to the database for each data source that is used on the sheet. The results of the queries, including the aggregated data, are sent back to and combined by Tableau. The view uses all rows from the primary data source, the left table, and the aggregated rows from the secondary data source, the right table, based on the dimension of the linking fields. Dimension values are aggregated using the ATTR aggregate function, which means the aggregation returns a single value for all rows in the secondary data source. If there are multiple values for the rows, an asterisk (*) is shown. Measure values are aggregated based on how the field is aggregated in the view.
You can change the linking field or add more linking fields to include different or additional rows of data from the secondary data source in the blend, changing the aggregated values.
For example, suppose you have the following tables. If the linking fields are User ID and Patron ID, blending your data takes all of the data from the left table, and supplements the left table with the data from the right table. In this case, not all values can be a part of the resulting table because of the following:
  • A row in the left table does not have a corresponding row match in the right table, as indicated by the null value.
  • There are multiple corresponding values in the rows in the right table, as indicated by the asterisk (*).



Suppose you have the same tables as above, but the secondary data source contains a new field called Fines. Again, if the linking fields are User ID and Patron ID, blending your data takes all of the data from the left table, and supplements it with data from the right table. In this case, you see the same null value and asterisks in the previous example in addition to the following:
  • Because the Fines field is a measure, you see the row values for the Fines field aggregated before the data in the right table is combined with the data in the left table.
  • As with the previous example, a row in the left table does not have corresponding row for the Fines field, as indicated by the second null value.


Blend your data

You can use data blending when you have data in separate data sources that you want to analyze together on a single sheet. The following example demonstrates how to blend data from two data sources: an Excel data source and an SQL Server data source.

To blend your data

Data blending limitations

There are some data blending limitations around non-additive aggregates, such as COUNTD, MEDIAN, and RAWSQLAGG. For more information, see Troubleshoot Data Blending.

Comments

Popular Posts

Add Space between bars in Tableau chart

Add Space between bars in Tableau chart Scenario: Tableau defaults to no spacing between the panes in a view. How do I get some spacing between groups of bars in my charts? It can be achieved through the steps below: Add subtotals (Analysis->Totals->Add All Subtotals) Right click on the measure pill on the Rows shelf and change the default SUM() aggregations to MIN() To edit the color legend – double click on the Total color, this takes you to a color dialog, select the WHITE color, click OK Right click on the word Total in the X axis and select Format In the Format window, click the Header tab and blank out the Total label field The added space looks like below:

Adding “Apply” Button to A Filter Menu In Tableau

Adding An “Apply” Changes Button To A Filter Menu In Tableau Sometimes when a user is changing the options they want on a filter in Tableau, the chart updates as they change each option. This might not be the best user experience if they are changing many options and don’t want the dashboard to redraw the view until they have completed their selection. It is very simple to fix this – Tableau includes an “apply” changes option, which when enabled, means the dashboard won’t redraw to reflect the new filter choices until the user presses the “apply” button. Simply click the drop down arrow on the filter menu > go to “customize”, then click “show apply button”.  

Error "Cannot mix aggregate and non-aggregate arguments with this function"

Error "Cannot mix aggregate and non-aggregate arguments with this function" When Creating a Calculated Field When creating a calculation, the following error might occur:  "Cannot mix aggregate and non-aggregate arguments with this function." The issue can be resolved through one of the below options: Modify the calculation so that all fields are either aggregate or non-aggregate. Each option can result in different values (please reference the additional information section for specific examples). Option 1 (Aggregate Then Divide) Wrap all fields in an aggregation. Sample: [Profit] / SUM ([Sales]) -> SUM ([Profit]) / SUM ([Sales]) Option 2 ( Divide Then Aggregate) Remove aggregations from all of the fields. Sample: [Profit] / SUM ([Sales]) ->[Profit] / [Sales]   Option 3 ( Condition Then Aggregate) Move the aggregation so all fields are aggregated. For example, the calculation: IF [Row ID] = 1 THEN SUM( [Sales] ) END  could...

Removing "Abc" Placeholder Text in Tableau Measures column

Removing "Abc" Placeholder Text in Tableau Measures column Quick and easy fix for this issue could be: Use Polygon mark type On the Marks card in the dropdown menu, select Polygon Resize the last column to make it smaller Navigate to Format > Borders In the left-hand Format Border Pane, for Column Divider, For Pane, select None from the dropdown menu Navigate to Format > Shading In the left-hand Format Border Shading, for Row Banding, move the slider to the desired level of row banding

Tableau Bar Chart Rounded corners

Tableau Bar Chart Rounded corners Follow the steps in the video below to create rounded corners for the bar chart:

Using Special Characters in Tableau URL Parameters

Using Special Characters in Tableau URL Parameters When we try to use special characters in URL parameters, the URL parameter might not do anything, or an error might occur. The issue can be resolve by:  Use one of the following workarounds: Replace the special character with the URL encoding sequence for backslash (\) (%5c) followed by the URL encoding sequence for the special character. The backslash is needed to escape the special character. For example, the URL encoding sequence for backslash and comma (\,) is %5c%2c.  In the data source, separate comma-delimited field values into separate columns that can be filtered independently. In Tableau Desktop, use a calculated field to replace the special characters, such as commas or spaces, with hyphens (-). Cause The browser cannot parse the special characters used in the URL. Additional Information The error varies depending on the browser and special character being used. Networ...

Cannot mix aggregate and non-aggregate arguments with this function

Cannot mix aggregate and non-aggregate arguments with this function Issue When creating a calculation, one of the following errors might occur:    "Cannot mix aggregate and non-aggregate arguments with this function." (Option 1,2,3 or 4 can be used). " All fields must be aggregate or constant when using table calculation functions or fields from multiple data sources." (Option 1 or 3 can be used). "Argument to sum (an aggregate function) is already an aggregation, and cannot be further aggregated ."  (Option 2, 3 or 4 can be used). Environment Tableau Desktop Resolution Each option can result in different values (please reference the attached workbook in the right-hand pane and additional information section for specific examples). Option 1 (Aggregate All Fields) Wrap all fields in an aggregation. Sample: [Profit] / SUM ([Sales]) -> SUM ([Profit]) / SUM ([Sales])   Option 2 ( De-aggregate All Fields) Remove aggregations from all of ...

Set a Date Filter Default to Max Date in Tableau

Set a Date Filter Default to Max Date in Tableau Some of us might have not noticed the setting a   filter to default to the Max date. There is a quick and simple option to achieve this in Tableau filters. Follow the steps below: 1. Create a date filter with the date dimension. 3. Drag the date dimension to filters pane. 4. Edit the date filter to Select from the list and select the max date available. 2.    Now, y ou have the option  'Filter to latest date value when workbook is opened', check this box and click on OK.

Tableau Extract Update Error "Timeout Error: IPC_NamedPipe::Select(WaitForMultipleObjects)

Tableau Extract Update Error "Timeout Error: IPC_NamedPipe::Select(WaitForMultipleObjects) When trying to update the Data Source or w hen attempting to connect to a data source, or create and/or refresh extracts with Tableau Desktop, the following error message occurs:  IPC_NamedPipe::Select(WaitForMultipleObjects): Timeout. Cause Anti-virus software is blocking Tableau processes from running, or the Logs folder contained corrupted information.  Resolution Option 1: If it is easy w ork with IT helpdesk/support to add below Tableau folders and processes to exclusions : Tableau.exe hyperd.exe hyperdstarter.exe *.hyper extension (if AntiVirus has an option for extension exclusions) C:\Users\<username>\Documents\My Tableau Repository C:\Users\<username>\AppData\Local\Temp\TableauTemp C:\Users\<username>\AppData\Local\Tableau C:\Program Files\Tableau Make sure that child folders are included in the exclusions. AntiVirus prog...