When the data warehouse receives notification that an existing row in a
dimension has in some way changed, there are three basic responses.
We call these three basic responses Type 1, Type 2, and Type 3 slowly changing
dimensions (SCDs).
Type 1 SCD
In Type 1 SCD, we simply overwrite the existing row. It means we don’t maintain history here.
e.g.
Primary Key Natural Key Prod Name Category Package Type
4321 AT04 Sprite Soft Drink Glass
4321 AT04 Sprite Soft Drink Plastic
Type 2 SCD
In Type 2 SCD, history is maintained accurately in the dimensions, and the changes are properly associated with the related facts.
Here when the database is notified the existing dimension row needs to be changed, instead of overwriting the changes, the
database inserts new row at the moment of change.
This new record is assigned a new surrogate primary key and this primary key is used in all the fact tables for which this dimension is
a foreign key.
Type 2 SCD perfectly partitions history because each detailed version of a dimension entity is correctly connected to the span of fact table records.
Kimball recommends dimension tables to provide optional useful information about Type 2 dimension changes.
He recommends adding the following five fields to dimension tables processed
with Type 2 logic:
1. Calendar Date foreign key (date of change)
2. Row Effective DateTime (exact date-time of change)
3. Row End DateTime (exact date-time of next change)
4. Reason for Change (text field)
5. Current Flag (current/expired)
Primary Key Natural Key(Emp ID) Designation Department Calendar Date Start Date End Date Reason For Change Current Flag
4321 E00234 Grade I Sales 31-Mar-2008 04-Apr-2007 31-Mar-2008 Joining 0
4322 E00234 Grade II Sales 31-Mar-2009 01-Apr-2008 31-Mar-2009 Promotion 0
4323 E00234 Grade II Marketing 01-Apr-2009 01-Apr-2009 31-Mar-2099 Dept Changed 1
Kimball also recommends not to set the Row End DateTime value to NULL as it may give erroneous results when used in a BETWEEN logic.
This value can be set to any arbitrary far future value.
Type 3
Type 3 SCD is used when a change happens to a dimension record but the old value remains valid as a second choice.
In Type 3 SCD, instead of issuing a new row when the change takes place a new column is added to the table and
the old value is placed in this column before it is overwritten.
e.g.
Primary Key Natural Key(Prod ID) Prod Name Size Category Colour
1127648 A 107B Denim Pants 30 men’s Wear Blue
Primary Key Natural Key(Prod ID) Prod Name Size Category Old Category Colour
1127648 A 107B Denim Pants 30 Leisure Wear Men’s Wear Blue
No comments:
Post a Comment