Dimensions and Facts
We are basically continuing with our data modelling concepts in this post (refer last post Dimensional Modelling in brief). Here we will focus on brief intro of Dimension and Facts which can explain goal of Dimensional Modelling.
As per basic definition of Dimension, it’s a perspective to look for a fact. Yeah that’s it, so in above examples perspectives are Geography, Clients, Products, Branches and Time and so are actually dimensions. We decide a dimensions on the basis of our point of view, it’s a job of data modeler to say what will be dimension in cube based on domain and reporting requirements.
Notes on Dimensions:
- Dimensions tables will be having columns providing same information.
- Time dimension will have column as Date, Month, Quarter, Year etc.
- Geography will have column as City, State, Country etc.
- Client will have Client Name, Client number, Address, Phone Number etc.
- Along with the columns providing information on a dimension, these table will have an ID column. Values in ID column should be unique for sure. For e.g. w.r.t. above example DateID, CityID, ClientID respectively are ID columns in mentioned dimension tables.
- Values of Dimension tables are obtained by all possible values for the columns in entire staging database. For e.g. All the tables will be joined for getting information of Date whether it could be from Due Date, Shipping Date, etc. But not a single date or year or month should be left covered in Dimension table of Time.
- Usually we name Dimension tables as DimTime, DimGeography, DimClient etc.
Facts or Measures:
Measures are real data (in numbers) which are measurable and calculate-able. Well it’s not difficult to understand that a measure is the value which are required in addition, aggregations, calculations, statistics, scores etc. So any data value required in end result or say cube is called measure, whereas group of correlated measures are called facts.
e.g. Inside Adventure Works data warehouse, we call Internet Sales and Reseller Sales as measures and both measures are in same measure group Sales which can be called as Sales Fact.
Notes on Facts:
- Fact tables will have two categories of columns
- Columns for measures like Sales Amount, Internet Sales Amount, Product Count etc.
- Columns for IDs of Dimension tables which will be used to relate to the Fact table. For e.g. DateID, CityID, ClientID
- Fact tables stores all the business logic of the project, and while creating columns for measures we put filters and conditions required.
- Fact table should have no extra ID in any column which is not present in respective Dimension table. E.g. FactSales should not have any SalesDateID which is not part of DimDate table
- Fact tables are usually named as FactResellerSales, FactInternetSales
Now it’s quite easy to understand that Dimension table have one column as ID and same ID is a column in Fact Table to relate these two tables. So here we make the foreign key relationship for same Dimension and Fact tables.
In the following diagram we can see a small Dimension and Fact model for the examples we were talking about.
Posted on October 6, 2012, in Begin BI and tagged Cube, Data warehouse, Dim tables, dimension, dimension modelling, dimension table, fact table, fact tables, Facts, Measures, what is dimension and facts. Bookmark the permalink. Leave a comment.