Best Practices for Data modelling in QlikView

In QlikView, the best practices for data modeling deals with maintaining a well-structured data model and suitable to enhance data processing and analysis. Data modeling is a process of organizing data from various data sources to a single design schema that helps to analyze the combined data. It is a collection of dimension and fact tables that are connected through common fields. We can slice the data quickly in many ways by having a proper data model.

In this blog, we will learn the following:

Importance of Data modeling

Best Practices for Data modeling

 1. Importance of Data Modeling

 QlikView performs a better way when the data model is well designed and structured. A good data model ensures the quick data process, evaluates expressions, and provides accurate results because the data is arranged properly so that system takes a logical way to load and evaluate the data values. 

 As we know, the data model consists of dimension and fact tables that are connected with the fields. There are often challenges in dealing with dimensions that contain multiple granularity levels and large data volumes or fact tables. The performance of the Data model depends on how data is loaded into the QlikView document and also depends on how data is structured to achieve the best way of the data model.

If you would like to become a Qlik Sense Certified professional, then visit Mindmajix - A Global online training platform: " Qlik Sense Online Course ". This course will help you to achieve excellence in this domain.

 2. Best Practices for Data modeling in QlikView

 In general, the data loaded into the QlikView document can be explained by the ETL process. But, it is essential to create a data model that enables efficient data handling methods in QlikView. The normalized star schema or Snowflake schema without any circular references where every entity is stored in separate tables.

 A central fact table contains keys to the dimensions and the numbers used to calculate measures such as the number of sales, units, etc. The surrounding connected tables contain the dimensions with all the attributes such as product, category, suppliers, etc. Each table in the data model corresponds to either Load or Select statement.

 The following are the Best practices for Data modeling in QlikView.                       

Combining Tables

 In QlikView, the Join and Keep prefixes are used to combine two tables that may or may not have common fields. The Join is a simple way to combine tables by using Join prefix in the script, that joins with the internal table or with another table or with an existing table. In QlikView, joins can be made in the script, producing logical tables. 

Example for Join: 

LOAD x, y, z from table1.csv;

Join LOAD x, v from table2.csv;

 The resultant internal table has the fields x, y, z, and v. The no. of records may differ depending on the field values from the two tables. 

 The Keep prefix is developed to reduce the no. of cases where the explicit joins need to be used. The Keep prefix between Select or Load statement has the effect of reducing tables to the intersection of table data before they are stored in QlikView. The Join and Keep prefixes must always be preceded by one of the keywords Inner, Left, or Right. 

 If the prefix Inner is used before Join, it specifies that the join between the two tables should be an inner join. The resultant table contains the combinations between two tables with complete data set from both sides. If the prefix Inner is used before the Keep, it defines the two tables should be deduced to their common convergence before being stored in QlikView. In the same way, the Left and Right prefixes are used to combine two tables as per their functions.

Related Article: Tutorial QlikView

Creating a Data Interval

 In a few cases, time intervals don’t store explicitly with a beginning and an end. But, they are implied by only one field- the change timestamp. For example, when you have currency rates of multiple currencies. Each currency rate changes with a new conversion rate. Also, the table contains rows with no dates to the initial conversion rate, before the first change was made. 

CurrencyChange rateRate
INR 7.59
INR26-01-20207.69
INR17-02-20207.45
USD 6.50
USD19-01-20206.56
USD12-02-20206.30

We can create a FromDate, and ToDate intervals using the script commands given below.

Let vBeginTime = Num('1/1/2020');

Let vEndTime = Num('1/3/2020');

Let vEpsilon = Pow(2,-27);

Tmp_Rates:

LOAD Currency, Rate,

 Date(If(IsNum([Change Date]), [Change Date], $(#vBeginTime))) as FromDate

From 'C:FilesRates.xlsx'

(ooxml, embedded labels, table is Sheet1);

Rates:

LOAD Currency, Rate, FromDate,

 Date(If( Currency=Peek(Currency),

  Peek(FromDate) - $(#vEpsilon),

  $(#vEndTime)

  )) as ToDate

 Resident Tmp_Rates

 Order By Currency, FromDate Desc;

 Drop-Table Tmp_Rates;

It updates the source table in the following manner.

CurrencyRateFromDateToDate
INR7.4517-02-2020vEndTime
INR7.6926-01-202016-02-2020 23:59:59
INR7.59vBeginTime26-01-2020 23:59:59
USD6.3012-02-2020vEndTime
USD6.5619-01-202011-02-2020 23:59:59
USD6.50vBeginTime18-01-2020 23:59:59

This table can be used in comparison with an existing date using Intervalmatch methods. 

MindMajix YouTube Channel

 

Related Article: Interview Questions for QlikView

Data Cleansing

When the data is loaded from different tables, the field values denoting the same thing are not always named consistently. It lacks consistency and also hinders associations, so the problems need to be solved. This can be done elegantly by creating a mapping table for the comparison of field values.

When several tables listing countries are loading, we may find that the same country has several names, for example, U.S.A is listed as US, U.S, USA, United States. To avoid the occurrence of different records denoting the same country name in the concatenates table, create a table similar to that and load it as a mapping table. The script should be as follows: 

CountryMap:

Mapping LOAD x,y from MappingTable.txt

(ansi, txt, delimiter is ',', embedded

labels);

Map Country using CountryMap;

LOAD Country, City from CountryA.txt

(ansi, txt, delimiter is ',', embedded labels);

LOAD Country, City from CountryB.txt


(ansi, txt, delimiter is ',', embedded labels);

The mapping statements load in the file MappingTable.txt as a mapping table with the label CountryMap. 

Generic Databases

The generic database is a table in which the field names are stored as field values in one column, as the field values are stored in a second. These databases are used for the attributes of different objects. For example, a generic database contains two objects, a box, and a ball. It has some attributes such as weight and color, which are common for both the objects, other attributes like height, length, width is not so. It doesn’t matter whether a Load or Select statement is used to load the generic database.

ObjectAttributeValue
boxcolorpink
boxheight18cm
boxweight500g
boxlength20cm
boxwidth10cm
ballcolorblue
balldiameter12cm
ballweight200g

Since many of the attributes are not relevant for a specific object so it is required to store the data giving each attribute a column of its own. On the other hand, it looks clumsy displaying it in a way that mixed lengths, weights, and colors. 

Syntax:

Generic SELECT* from GenericTable;

Hierarchies

Hierarchies are often used to represent things organization dimensions in data. These kinds of hierarchies are generally stored in the adjacent node table, in a table where each record correlates to a node and has a field that contains a reference to the parent node. 

IDParentNodeIDTitle
1-General manager
21Regional manager
32Branch manager
43Department manager
Hierarchy (NodeID, ParentNodeID, Title, 'Manager') LOAD

    NodeID,

      ParentNodeID,Title


FROM 'hierarchy.txt' (txt, codepage is 5212, embedded labels, delimiter is ',', msq);

In such a table, the node is stored on one record only but still has no.of children. The table may contain the additional fields defining attributes for nodes. The expanded node table is the common representation, where each level in the hierarchy is stored in a separate field. The hierarchy keyword can be used in the load script to transform an adjacent node table to expand. 

IDParentNodeIDTitleTitle1Title2Title3Title4
1-General managerGeneral manager---
21Regional managerGeneral managerRegional manager--
32Branch managerGeneral managerRegional managerBranch manager-
43Department managerGeneral managerRegional managerBranch managerDepartment manager

The problem with the expanded nodes tables is that it finds it difficult to use the fields for searches or selections, prior knowledge is required about the level to search or select. The tale is a different representation that solves this problem, and such representation is also known as a bridge table. It contains one file for every child-ancestor relation found in the data. The “hierarchybelongsto” keyword can be used in the load script to transform an adjacent node to an ancestor table.

Linking information to field values

In a QlikView document, the information can be in the form of images, text, or files that can be associated with data. To link information, tables have to be created that contain what information file is to be linked to a particular field value, and such tables are known as information tables. The information table includes two columns, one column is for field names containing a list of values, and the other is headed by an arbitrary name that contains information. The statement for loading this table as an information table is as follows:

Info LOAD Country, I from Flagsoecd.csv (ansi, txt, delimiter is ',', embedded labels);

When a list box or multi-box items are linked to the information is selected, an information icon occurs beside the name field to show the information is available. Using that icon, it enables to show the information or load an application file. It is possible to disable the icon on the Layout of the list box properties dialog.

Matching intervals to discrete data

The interval match prefix to a Select or Load statement is used to link the discrete numeric values to at least one numeric interval. It is done to maintain a good data model. The extended interval match is used for handling the problem of slowly changing the dimensions in the source data. The table to which the discrete data points will be matched should have at least two fields, i.e., start and end fields. The QlikView must read the discrete values before the function calls. If the intervals overlap in the table, they are linked with all corresponding or matching values.

Related Article: QlikView Intervalmatch

Semantic links are created by loading tables containing the relations between the objects. When clicking on a semantic link, a selection is made in other fields. The table must contain three or four columns. A semantic table must either contain the relation between field values of different fields or between field values of the same field. The Select or Load statement is used to load a semantic table must be preceded by the semantic qualifier that shows it is not a logical table.

Four columns are used in semantic links, the first column includes field value that has a relation with other fields, the second column consists of the name of the relation, the third column contains related field value. The fourth column contains names of the inverse relations. If three columns are used, the names of the inverse relations column are not required. 

Using mapping instead of joining

Join creates large size tables, and the processing of the large tables may slow down the tool. So instead of Join, the Mapping function works more efficiently. A mapping table consists of two columns, one is the comparison field, and the other is the mapping value field. For example, we have a table of orders and need to know the country of the customer that is stored in the customer table.

OrderIDOrderDateShipperIDFreightCustomerID
127112020-03-011253
127122020-03-011674
127132020-03-022302
127142020-03-031793

 

CustomerIDNameCountry...
1TechCoGermany...
2BusinessCorpItaly...
3DataSalesSpain...
4MobechoFrance...

To view the country of a customer, we need to map the table as shown in the below table. 

CustomerIDCountry
1Germany
2Italy
3Spain
4France

The mapping table that we name MapCustomerIDtoCountry, can be defined in the below script.

MapCustomerIDtoCountry:

     Mapping LOAD CustomerID, Country From Customers ;

The next is to apply the mapping function, using the ApplyMap function, 

Orders:

     S *,

   ApplyMap('MapCustomerIDtoCountry', CustomerID, null()) as Country

         From Orders ;

The applyMap function is used to define what to return when the value is not found in the mapping table.

The resultant table is given below:

OrderIDOrderDateShipperIDFreightCustomerIDCountry
127112020-03-011253Spain
127122020-03-011674Italy
127132020-03-022302Germany
127142020-03-031793France

 

Related Article: QlikView Metadata

Using QVD files for incremental load

Incremental load is defined as loading new or changed records from the database. The QVD files are recommended for better and fast loading of the incremental load. In most cases, it is possible to perform incremental load using QVD files. The basic process is given below:

Load new data from the database table (a slow process, but loading a limited no. of records.)

Load old data from the QVD file (It is a fast process, though it can load many records)

Create a new QVD file.

Repeat the process for every table loaded

This process is repeated whenever the incremental load is to load from the database. 

Explore QlikView Sample Resumes! Download & Edit, Get Noticed by Top Employers!

Conclusion

In QlikView, these are the best practices for Data modeling while developing the script. A good script results in a Good data model that provides better application performance. Try to apply these practices while developing the script. Hope you find relevant information.

Course Schedule
NameDates
QlikView TrainingNov 02 to Nov 17View Details
QlikView TrainingNov 05 to Nov 20View Details
QlikView TrainingNov 09 to Nov 24View Details
QlikView TrainingNov 12 to Nov 27View Details
Last updated: 03 Apr 2023
About Author

Keerthana Jonnalagadda working as a Content Writer at Mindmajix Technologies Inc. She writes on emerging IT technology-related topics and likes to share good quality content through her writings. You can reach her through LinkedIn.

read less