Data Modeling in Qlikview
As you probably may have experienced, QlikView performs better when the data source is a perfect
star schema. However, many
times, a star schema is not feasible or preferred given the requirements of the BI
application, the amount of time budgeted for development, or the resources of the team tasked with building an app. There are often challenges
in dealing with dimensions that have multiple granularity levels and large
data volumes or multiple fact tables that have different dimensions and/or multiple
relationship levels.
Below are a few tips for getting the most out of your QlikView data model.
1. Having a star schema modeled in the source data
does not mean you will automatically get a star schema in QlikView.
When the source data are from tables that
have more than one common column, QlikView will end up adding synthetic keys
to tables and create a more complicated data model when it translates relational data into associative
data. Common columns here are meant to
be those columns having identical column names.
The scenario below looks at a
simplified case where 1 fact table and 4 dimension tables from the source are in
a relational database. The relationships are mapped out nicely in
diagram 1.
Diagram 1. Sales Database Diagram
However, importing the data as is
into Qlikview yields the data model shown in Diagram 2. As you can see, not only it is not a star
schema any more, but also there are multiple synthetic keys that are going to
negatively impact the QlikView application front end features and performance.
Diagram 2. Qlikview data model (after a star schema data
source is loaded as is)
The problems is that there are tables that use two
key columns to define their relationships.
For example, both SalesPersonID and TerritoryID columns are used to
define the relationship between the Sales fact table and the SalesPerson dimension
table. Similarly, both CustomerID and
TerritoryID are used to define the relationship between the Sales fact table and
the Customer dimension table. Qlikview created three synthetic keys to
resolve the same relationships.
The solution in this case can be concatenating the common
keys to produce an unique key:
1)
Concatenate SalesPersonID and TerritoryID to resolve the relationship between
Sales fact and SalesPerson dimension.
2)
Concatenate CustomerID and TerritoryID to resolve the relationship between
Sales fact and Customer dimension.
As a result, QlikView data model is now a simple star
schema as shown in diagram 3.
Diagram 3. QlikView
data model after keys are redefined
2. Having a star schema modeled for source data
does not mean that it is the best way to model in Qlikview
When source data from a relational
database is modeled as a start schema, it is not necessarily that
a star schema is the
right data model for QlikView applications.
The simple star schema in the
scenario above is rarely implemented in real QlikView applications because most of
the BI applications have to deal with more than one fact table in
the same application, horizontal hierarchy, or more than one level
of relationships for associated dimensions.
QlikView can work with data
sources from different databases. When
the source data are brought in from different databases each modeled with a
star schema, star schema is not necessarily the data model feasible or best suited
for the QlikView application. When there
are more than one fact table each with multiple dimensions, the best option is
to consolidate all facts into one table if possible, which often resulted in snow
flake instead of star schema. If it is not possible to consolidate all facts
needed into one fact table, a link table may have to be used, which will often
resulted in a modified star schema or snow flake data model.
3. Choosing the best data model for Qlikview applications
A simple star schema will usually
result in the best performance for Qlikview applications. However, when denormalizing dimensions in
order to achieve star schema resulted in millions of repeated data in the
dimension table(s), the impact of that on the performance may overweight the
benefits of a star schema. Then, it is
time to consider snow flake instead. If
multiple fact tables can be consolidated into one fact table, even if it resulted
in a snow flake, usually it is better than a link table that is a modified star
schema unless the link table is very simple, in which case, a link table is
probably not necessary to start with.
Compared to the traditional relational databases, Qlikview data
modeling can be implemented quickly and modified fairly easily. If you are interested in data modeling in
Qlikview or in designing a Qlikview application, Qlik Tech offers free personal
edition Qlikview that can be downloaded
free.
There are also resources available at Qlikview Community and free training
from Qlik Tech for beginners.
Labels: BI, Data Model, Qlikview