
Long waiting times are a major source of frustration when working with Tableau. In this case study, we look at how to get the performance of your Tableau data source back under control, even if you're using a live connection or a complex data model.

A few benchmark figures for comparison purposes

If you're looking for suggestions on how to improve the performance of Tableau data sources, you'll quickly find plenty of them on the internet. Unfortunately, the solutions offered tend to be vague and say little about the underlying use case. For comparison purposes, we'd therefore like to show a few benchmark figures from a textbook reference project:



The Tableau data source shown above was first created using Tableau Desktop 2019.1.2 and then shared with a wider user base via Tableau Server. Many operations, such as setting filters or dragging attributes into the report, frequently took over 60 seconds and, in extreme cases, as long as 10 minutes. The server's limited storage capacity ruled out creating an extract.

How to fix performance bottlenecks

To get an idea of exactly where performance bottlenecks are occurring, it's useful to have the Tableau Performance Recorder enabled as you interact with your workbook. This is only available in Tableau Desktop, however.

In our example, the following steps led to performance improvements:

Switching to the "logical data source" available in Tableau from version 2020.2

The developer of a data source has only limited influence over the SQL queries generated by Tableau. These depend mainly on the operations in the workbook and are generated dynamically by the Tableau Engine. Prior to version 2020.2, the Tableau Engine always queried all the tables connected in the data source, even if only a small percentage of them were actually needed in the workbook. Thanks to the logical layer introduced in Tableau 2020.2, only the objects that are actually needed are queried. This makes the generated SQL queries much leaner.
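To make the difference concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The tables, columns, and both query shapes are assumptions made for illustration; they are not the SQL that Tableau actually generates. The sketch only shows that a view needing `region` and `amount` gets the same answer whether or not the unused `products` table is joined.

```python
import sqlite3

# Hypothetical three-table data source (names are illustrative only).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE orders    (order_id INTEGER PRIMARY KEY,
                            customer_id INTEGER, product_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'North'), (2, 'South');
    INSERT INTO products  VALUES (10, 'Books'), (20, 'Games');
    INSERT INTO orders    VALUES (100, 1, 10, 25.0), (101, 1, 20, 40.0),
                                 (102, 2, 10, 15.0);
""")

# Every table in the data source is joined, although the view
# never uses a column from "products".
all_tables = """
    SELECT c.region, SUM(o.amount)
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    JOIN products  p ON p.product_id  = o.product_id
    GROUP BY c.region ORDER BY c.region
"""

# Only the tables the view actually needs are queried.
needed_only = """
    SELECT c.region, SUM(o.amount)
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    GROUP BY c.region ORDER BY c.region
"""

rows_all = con.execute(all_tables).fetchall()
rows_needed = con.execute(needed_only).fetchall()
print(rows_all == rows_needed)
```

In a data source with dozens of tables, every join that can be left out this way is work the database no longer has to do.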

Set performance options

Performance Options are another feature introduced in Tableau 2020.2. They are set on the relationships between tables and allow the developer to optimize the generated SQL queries.



 

The general rule here is: the more precise, the better. "Many-to-one" is preferable to "many-to-many". "All records match" is better than "some records match". As long as they describe the data correctly, these settings stop the Tableau Engine from carrying out unnecessary grouping and aggregation, which improves query performance.
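The effect of these settings can be sketched with two hand-written queries in Python's `sqlite3`. Both query shapes are assumptions for illustration, not Tableau's real output: the "conservative" one pre-aggregates and deduplicates each side and uses an outer join, roughly what an engine has to do when it may assume nothing, while the "precise" one relies on "many-to-one" and "all records match" and joins directly.

```python
import sqlite3

# Hypothetical tables: many orders per customer, every order has a customer.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders    (order_id INTEGER PRIMARY KEY,
                            customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'North'), (2, 'South');
    INSERT INTO orders    VALUES (100, 1, 25.0), (101, 1, 40.0), (102, 2, 15.0);
""")

# Nothing is known about the relationship: aggregate and deduplicate
# each side first, then use an outer join to keep unmatched rows.
conservative = """
    SELECT c.region, SUM(o.amount)
    FROM (SELECT customer_id, SUM(amount) AS amount
          FROM orders GROUP BY customer_id) o
    LEFT JOIN (SELECT customer_id, region
               FROM customers GROUP BY customer_id, region) c
           ON c.customer_id = o.customer_id
    GROUP BY c.region ORDER BY c.region
"""

# "Many-to-one" and "all records match" are declared: a plain inner
# join without the extra grouping steps is sufficient.
precise = """
    SELECT c.region, SUM(o.amount)
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    GROUP BY c.region ORDER BY c.region
"""

rows_conservative = con.execute(conservative).fetchall()
rows_precise = con.execute(precise).fetchall()
print(rows_conservative == rows_precise)
```

The two queries agree here only because the sample data really is many-to-one with every record matching. If the data violated those assumptions, the simpler query could return different totals.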

A little tip to close

If you use the same query several times in a data source (to query all history objects, for example), it is worth writing it as a custom SQL query and appending "order by false" to it. In many databases, such as Exasol, this forces the result to be materialized, which has a positive effect on the performance of repeated queries.
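As a small sketch of what such a custom SQL statement looks like, the Python snippet below appends the clause to a query string. The helper name, the table `history_objects`, and its columns are hypothetical; whether the clause triggers materialization depends on the database, so it is worth checking the query plan in your own environment.

```python
def with_order_by_false(sql: str) -> str:
    """Append ORDER BY FALSE to a statement (hypothetical helper)."""
    # Drop surrounding whitespace and a trailing semicolon before appending.
    return sql.strip().rstrip(";").rstrip() + "\nORDER BY FALSE"


# Hypothetical query that is reused several times in the data source.
history_sql = with_order_by_false("""
SELECT object_id, valid_from, valid_to
FROM history_objects;
""")
print(history_sql)
```

The resulting text is what you would paste into the custom SQL dialog of the data source.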

Results

Using a reference workbook, we were able to identify performance improvements in numerous areas. For example, calculating the total for a key figure previously took an average of 40 seconds but now takes only about 3. Even with the new data source, however, we still saw performance decrease as the number of active filters in the workbook increased. This may seem counterintuitive at first glance but can be explained by the way filters work in Tableau.

We solved the problem with context filters:





When you set a context filter, the other filters are no longer computed independently against the full data source but are applied only to the data that remains after the context filter has been applied.
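The following Python sketch is a deliberately simplified model of this behavior, not Tableau's implementation. It counts how many rows each approach has to examine; the sample rows, the function names, and the use of row counts as a stand-in for query cost are all assumptions for illustration.

```python
# Hypothetical sample data.
rows = [
    {"region": "North", "category": "Books", "year": 2019},
    {"region": "North", "category": "Books", "year": 2020},
    {"region": "North", "category": "Games", "year": 2020},
    {"region": "South", "category": "Books", "year": 2020},
    {"region": "South", "category": "Games", "year": 2019},
    {"region": "South", "category": "Books", "year": 2019},
]


def apply_independent(data, filters):
    """Every filter examines all rows; the results are intersected."""
    examined = 0
    keep = set(range(len(data)))
    for predicate in filters:
        examined += len(data)
        keep &= {i for i, row in enumerate(data) if predicate(row)}
    return [data[i] for i in sorted(keep)], examined


def apply_with_context(data, context, filters):
    """The context filter runs first; the rest only see what it lets through."""
    subset = [row for row in data if context(row)]
    result, examined = apply_independent(subset, filters)
    return result, examined + len(data)


def is_north(row):
    return row["region"] == "North"


def is_books(row):
    return row["category"] == "Books"


def is_2020(row):
    return row["year"] == 2020


plain_result, plain_examined = apply_independent(rows, [is_north, is_books, is_2020])
ctx_result, ctx_examined = apply_with_context(rows, is_north, [is_books, is_2020])
print(plain_examined, ctx_examined)
```

Both approaches return the same rows, but the version with a context filter examines fewer of them. The saving grows with the number of dependent filters and with how strongly the context filter reduces the data.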
