The master data source model or MDS for short is a methodology I developed to help save a lot of time when building dashboards in Tableau.
I believe the methodology is not limited entirely to Tableau and dashboard construction and can be a powerful way to work as an analyst.
In this post I'm going to introduce you to the model, walk you through an example and cover some of the pros and cons.
What is the MDS mode?
Imagine you’re working for a company like Mailchimp. In the world of Mailchimp you have several important entities. Users, subscribers, campaigns, lists and templates are just some of these entities.
As an analyst for Mailchimp much of your work will revolve around these main entities.
To make your life easy you should create robust data sources for each of the main entities in your data warehouse, or in the case of Tableau, as web-hosted data sources.
If you do a lot of work directly with your data warehouse, then you should create “master views” which can be used by all the SQL literate data consumers in the organization.
If you adopt the MDS model then you'll focus on a few robust data sources and views to support your needs, instead of creating new views and data sources each time you run an analysis or build a data visualization.
Example of an MDS: MDS – Users
Almost every client of mine that users Tableau now has an MDS – Users data source in their Tableau Online account.
This data source becomes the main data source for the user dashboard. Below is a diagram showing an example of a users MDS.
In the diagram above you see the user entity in the middle and the most important dimensions being connected. The obvious fields associated with an entity will be its' id, creation timestamp / date, foreign keys etc.
Once you've added these obvious fields you'll want to add additional, more complex fields to your data source.
When you're done building out your MDS you should end up with a robust table of data with each row representing the entity you're focusing on.
Benefits and disadvantages of the MDS Model
I've been using the MDS model for years and I've found it to work very well. The main benefit of this model is the time it saves you.
Once you've built a few MDS's most analytical work can be completed significantly faster since you can skip the data prep work.
In Tableau this model is a game changer since you can simply download the MDS to your local machine, make some adjustments and republish it. The value of the data source grows as you add to it and everyone can benefit.
The biggest disadvantage of this model is that it starts to fail once the queries used to build the MDS cant handle the volume and/or complexity of the data source. If your data sources grow horizontally (you're adding a lot of columns) aggressively you may struggle.
I've seen MDSs that are over a gig in size work but you'll need a lot of patience and start looking for alternative approaches once you've reached that point.
Summary
The MDS model can be a very useful approach to prepping your data for use by analysts and data consumers in your organization.
If you're using Tableau then I highly recommend the MDS model. I used the model to support a large organization with over 60 consumers.
The MDS model has some disadvantages. If you're working with very large data sets then I'd advise using a hybrid approach of smart ETLs, scripts and the MDS model.
How are you structuring the data sources in your company? Let me know in the comments section below.