Tableau Performance Tuning

Hello Folks,

Below are the steps that helped me achieve better performance. Please go through each point without skipping any. Keep this blog interactive and share your inputs.




1. Cache:
This setting is extremely useful with a live connection. If the data is available in the cache, subsequent visits pull the data from the cache instead of the database. This option can be configured at the server/site level. More on this is available at:
https://help.tableau.com/current/server/en-us/config_cache.htm
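To make the idea concrete, here is a minimal toy sketch of what "pulling from the cache" means. It is not how Tableau Server implements caching; the class `ToyViewCache` and the function `fake_database` are hypothetical names used only for illustration.

```python
class ToyViewCache:
    """Toy cache: remembers query results so repeat visits skip the database."""

    def __init__(self, run_query):
        self._run_query = run_query   # function that really hits the database
        self._store = {}              # query text -> cached result
        self.database_hits = 0        # how many times we went to the database

    def get(self, query):
        if query not in self._store:
            # Cache miss: go to the database and remember the answer.
            self.database_hits += 1
            self._store[query] = self._run_query(query)
        return self._store[query]


def fake_database(query):
    # Hypothetical stand-in for a slow live connection.
    return f"rows for: {query}"


cache = ToyViewCache(fake_database)
query = "SELECT region, SUM(sales) FROM orders GROUP BY region"
first_visit = cache.get(query)    # goes to the database
second_visit = cache.get(query)   # served from the cache
print(cache.database_hits)        # 1
```

The second visit returns the same result without touching the database, which is the saving this setting aims for.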

2. Filter on Extract:
I have noticed people confusing the data source filter with the extract filter. For an extract filter, you need to click Edit and tell Tableau's Hyper engine how much data to extract. The extract refresh will then pull only the data specified there.


3. Hide all Unused Fields while creating extract:
If this option is not selected, Tableau will query the database for all the columns. Selecting it reduces the size of the extract and thus improves performance significantly.


4. Live vs Extract:
Needless to say, use an extract when the data set is large and the initial load of the view matters. However, don’t blindly extract every data source and refresh it every 15 minutes for real-time statistics. Create an extract when you see a performance issue. A live dashboard uses resources only when it is opened, whereas an extract uses resources according to its schedule (every 15 minutes in this case).
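A quick back-of-the-envelope calculation shows why the schedule matters. This is only a sketch; `dashboard_opens_per_day` is a made-up figure for illustration, not a measurement.

```python
def refreshes_per_day(interval_minutes):
    """Number of scheduled extract refreshes in 24 hours."""
    return (24 * 60) // interval_minutes


scheduled_runs = refreshes_per_day(15)
dashboard_opens_per_day = 10  # hypothetical usage figure

print(scheduled_runs)                            # 96
print(scheduled_runs > dashboard_opens_per_day)  # True
```

With a 15-minute schedule the extract is rebuilt 96 times a day whether or not anyone looks at it, while a live dashboard opened 10 times a day queries the database only on those 10 visits.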

5. Be Careful with Long-Running Extracts:
By default, schedules are executed in parallel. A parallel schedule can use all the available backgrounders, because an extract refresh schedule is divided into multiple tasks (one task per backgrounder). This becomes a bottleneck with long-running extracts, which can prevent other schedules from running. Such extracts can be scheduled serially to keep the other backgrounders free, or scheduled for a time when the server is not busy.
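The effect can be sketched with a toy queue model. This is a simplification and not Tableau's actual scheduler: it assumes jobs are handed, in order, to whichever backgrounder frees up first, and the durations are invented.

```python
import heapq


def start_times(durations, backgrounders):
    """Return the start time (in minutes) of each job, assigned in order
    to whichever backgrounder becomes free first."""
    free_at = [0] * backgrounders
    heapq.heapify(free_at)
    starts = []
    for duration in durations:
        t = heapq.heappop(free_at)        # earliest available backgrounder
        starts.append(t)
        heapq.heappush(free_at, t + duration)
    return starts


# Parallel: two 60-minute extracts grab both backgrounders,
# so the 5-minute job has to wait for an hour.
parallel = start_times([60, 60, 5], backgrounders=2)
print(parallel)   # [0, 0, 60]

# Serial: the two long extracts run one after another on a single
# backgrounder (modelled as one 120-minute chain), so the short job starts at once.
serial = start_times([120, 5], backgrounders=2)
print(serial)     # [0, 0]
```

In the serial case the long extracts finish later overall, but the second backgrounder stays free for everything else.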

6. Choosing Extract Refresh time wisely:
A common mistake is to schedule most of the extract refreshes at the same time. As a result, the database gets flooded with requests. Even if they execute in parallel, some of the extracts end up waiting in the queue while others take longer to refresh. In some scenarios the requests will time out. So identify the long-running extracts and execute them in different schedules. You can identify them in the sample performance workbook (by comparing when a job is scheduled with when it actually runs).
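One simple way to plan this is to spread the start times over a window. The sketch below only computes a plan on paper; the extract names, the 02:00 start and the 20-minute gap are hypothetical, and the schedules themselves still have to be created in Tableau Server.

```python
from datetime import datetime, timedelta


def stagger(extract_names, first_start, gap_minutes):
    """Assign each extract its own start time, gap_minutes apart."""
    base = datetime.strptime(first_start, "%H:%M")
    return {
        name: (base + timedelta(minutes=gap_minutes * i)).strftime("%H:%M")
        for i, name in enumerate(extract_names)
    }


plan = stagger(["sales", "finance", "inventory"], "02:00", 20)
print(plan)  # {'sales': '02:00', 'finance': '02:20', 'inventory': '02:40'}
```

Pick the gap from the run times you observe, so that one long extract is mostly done before the next one starts.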

7. Incremental Refresh:
If your existing data is never updated (only new rows are added), use incremental refresh. If existing data is updated, go for a full refresh.
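The reason is easier to see in a small sketch. It assumes new rows are identified by a key column larger than anything already in the extract (here a hypothetical `id` column); the function names and data are illustrative only.

```python
def incremental_refresh(extract, source, key="id"):
    """Append only source rows whose key is above the extract's highest key."""
    high_water = max((row[key] for row in extract), default=None)
    new_rows = [
        dict(row) for row in source
        if high_water is None or row[key] > high_water
    ]
    return extract + new_rows


def full_refresh(source):
    """Rebuild the extract from scratch."""
    return [dict(row) for row in source]


extract = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
source = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 25},   # existing row was updated in the database
    {"id": 3, "amount": 30},   # brand-new row
]

incremental = incremental_refresh(extract, source)
full = full_refresh(source)

print([r["amount"] for r in incremental])  # [10, 20, 30]
print([r["amount"] for r in full])         # [10, 25, 30]
```

The incremental refresh picks up the new row but keeps the stale amount for id 2. Only the full refresh sees the update.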

8. Frequency of Refresh:
Do you really need to refresh the extract all day long, on every weekday, or during peak hours?

9. Close data source:
Many times we change or replace a data source and forget to delete/close the ones that are no longer needed. Even if a data source is not used in the report, it will still query the database. Make sure to close all unwanted data sources.

10. Network:
The rendering of the view depends greatly on the network. The same view can render faster on one network than on another. If a user complains about performance, check which network they are connecting from and involve your network team if required.

11. Rendering:
Servers have more computing power than clients. Complex views render faster on the server side, while less complex views render faster in the client browser. More on this is available at:
https://help.tableau.com/current/server/en-us/browser_rendering.htm 

12. Hide all Unused Sheets: Tableau loads all the visible sheets before displaying the view. All visible sheets contribute to the size of the workbook, and a large workbook takes longer to load. So even if you are publishing only the dashboards, make sure to hide all the unused sheets.

13. Initial Load matters: It can be very frustrating for users to keep waiting for the initial view of the workbook. All views/elements must load before Tableau displays the first view. The more views there are, the longer it takes to load.

14. Apply button: Always use the Apply button (for multiple-value selection) so that the view refreshes once instead of after every selection.

15. Design basics: Avoid counting or calculating anything in Tableau. Push all the calculations to the backend.
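As a small illustration of the difference, the sketch below uses an in-memory SQLite database as a stand-in for the backend. The `orders` table and its values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("East", 10.0), ("East", 15.0), ("West", 7.5)],
)

# Option 1: pull every row and add it up on the client side.
rows = conn.execute("SELECT region, amount FROM orders").fetchall()
client_side = {}
for region, amount in rows:
    client_side[region] = client_side.get(region, 0.0) + amount

# Option 2: let the database aggregate and return only the summary rows.
summary = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region"
).fetchall()
backend = dict(summary)

print(len(rows), len(summary))  # 3 2
print(client_side == backend)   # True
```

Both give the same totals, but the second option moves far fewer rows once the table is large, and the heavy lifting stays in the database.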

16. Restrict Data: Do you really need data for this many years? Are you doing any analysis on the previous years? Can we show just 4 years of data in the dashboard? Ask the users these questions.
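For instance, a rolling cut-off of the last 4 calendar years can be sketched as below. The `order_date` field and the sample rows are hypothetical. In practice the same condition would go into the extract filter or the database query.

```python
from datetime import date


def restrict_to_recent_years(rows, years, today):
    """Keep rows from the current year and the (years - 1) years before it."""
    cutoff_year = today.year - years + 1
    return [row for row in rows if row["order_date"].year >= cutoff_year]


rows = [
    {"order_date": date(2018, 3, 1)},
    {"order_date": date(2020, 12, 31)},
    {"order_date": date(2021, 1, 1)},
    {"order_date": date(2024, 5, 20)},
]

recent = restrict_to_recent_years(rows, years=4, today=date(2024, 6, 1))
print([row["order_date"].year for row in recent])  # [2021, 2024]
```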

17. Backgrounder/VizQL and Data Engine Isolation:
All these processes are CPU intensive, so there is a chance of resource conflict. It is advisable to keep the backgrounder on a separate node. You might not have this kind of deployment, so follow this practice when your server is too slow, extracts take too long to finish, or the initial load time of the view is too long. Moreover, you should have sufficient processes for each of these services. More on this is available at:
https://help.tableau.com/current/server-linux/en-us/server_baseline_config.htm 

18. Server Status console:
There could also be a problem with the data source itself. The administrative console can point to a problematic data source (if there are frequent failures). Likewise, a lot of other useful information is available in the server status console.

19. Automatically Suspend Extract Refresh Tasks: Many workbooks stop being used or become less important over time. You may stop the extract refresh of such workbooks and resume it at any time. More on this is available at:

20. Node Roles: Did you know that the Backgrounder and File Store services on each node in your cluster can be configured to perform specific tasks? However, this feature requires a Data Management license and an Advanced Management license along with Tableau Prep Conductor. More on this is available at:
https://help.tableau.com/current/server/en-us/server_node_roles.htm
 
21. Workbook Performance after a Scheduled Refresh: This option will help you reduce the initial load time of the dashboard. More on this is available at:
https://help.tableau.com/current/server/en-us/perf_workbook_scheduled_refresh.htm
 
22. Tableau Server Notifications: Configure email so that you are notified of extract failures, server status changes, and disk space. More on this is available at:

23. Fine tune extract query workload: If you are analyzing a large volume of data, the dashboard load time fluctuates, there is resource contention, or you have a federated data source, you must fine-tune it using the recommendation below:

24. Treat Tableau as a visualization tool: Lastly, educate users to treat Tableau as a tool for visualization and self-service analytics, not as an enterprise reporting tool or a website. 😊

Bonus:

  1. Before you start making any changes, it is recommended to use the sample performance workbook. Below is the link:
    https://help.tableau.com/current/server/en-us/perf_analyze_sample_workbook.htm

  2. In addition, you can get information from the Tableau Server repository. Refer to the link below:
    https://help.tableau.com/current/server/en-us/perf_collect_server_repo.htm

  3. Use Performance Monitor for Windows installations, and the sysstat or vmstat tools for Linux installations, to determine the resource usage of Tableau.

  4. Always follow the hardware and software recommendations from Tableau. It is advisable to get in touch with Tableau before planning any upgrade.

  5. Upgrading to the newest version may boost performance without needing anything else.

Regards,
Piyush Narayan
