Friday, June 10, 2016

Dimensional Modeling


Information about business situations helps managers make decisions. Identifying and selecting a course of action to deal with a specific problem or taking advantage of an opportunity are decision-making tasks.
The faster information reaches managers, the faster a decision is made. This means profit, competitiveness and a series of advantages that can certainly help decide the future of a company.
It is through knowledge that managers make their decisions, and to make this as effective as possible, information systems were created to support them.
Transactional systems were unable to perform the task of analysis for decision-making by simply generating reports. In this context, the data warehouse emerged and became a reality in large corporations.
To build a DW (Data Warehouse), one technique used is dimensional modeling. In the same way that civil engineers draw the blueprint of a house before putting up its walls, it is necessary to design the data warehouse before building it. A dimensional model is basically formed by a central fact table and dimension tables directly linked to it.
Some important concepts:
  • Fact: is an event worthy of analysis and control in the organization;
  • Dimension: presents a perspective of the data. Ex.: per day, per salesperson, per customer, per product, per agency;
  • Metric: are the values ​​to be analyzed, measured. Ex.: total sales, percentage of absenteeism.
There are two types of modeling for a Data Warehouse: “Star Schema” and “Snow Flake”.
“Snow Flake” is composed of a fact table, dimensions and sub-dimensions. The sub-dimensions are normalizations of the dimension tables, which reduces the number of records and consequently requires less disk space. However, since it is normalized to display a query, several “joins” will be necessary, which results in slow query speed.
“Star Schema” is characterized by denormalized dimension tables. In other words, dimension tables in which data is repeated in several rows. This characteristic requires a lot of disk space. The star model, being denormalized, reduces the number of “joins” between tables, which results in faster queries. Since it has fewer tables compared to the “snow flake” model, it can be said that maintenance is simpler.

No comments:

Post a Comment