Slowly changing dimensions

Hello,

I suggest a new action that would handle SCD types dimensions in easymorph. I know it's possible to make it with a few actions but if a single one could handle every required transformation we could save a lot of time and make it much more easier.
A very nice feature would be to be able to select a field in the target and, based on one or more key fields, compare it's value to the source.

I would then let those options to the user:

keep the original value (SCD0),
update with the new value (SCD1),
create a new row along with setting start and end date on both existing and new record, and set a "IsCurrent" flag (SCD2),
keep history in an additional column (SCD3),
"move" the record to a derived table (SCD4),
combine both SCD2 and SCD3 concepts (SCD6).

But within all of those and based on my experience, I would mostly use SCD0, 1 and 2. While 1 and 2 are easy to implement with easymorph, the type 2 is in my opinion the most tricky.

Best regards,
Christophe

Check out the "Interval merge" action. It's intended exactly for merging SCDs.

As far as I understand SCDs, an "IsCurrent" flag is not necessary, because it's the date range that specifies whether a value in SCD is attributable to a given date/time or not.

All in all, building an SCD requires adding rows with the start dates of validity. E.g.:

Region Effective as of
North 2023-09-01
East 2024-02-12
North 2024-05-13

To obtain a proper table for use with "Interval merge", apply the "Shift columns" action to bring the next effective date in the same row as the current effective date, so that it becomes the end of the validity period for the value in that row. That will give you a table with date columns that specify the validity term of a value. Such a table can be used with "Interval merge".

All in all, it looks to me like creating and using SCDs could be done relatively easily and without using purpose-built actions. Am I missing something?

Hi Dmitry,

Thanks for your answer. Seems I should have post a "how to" question instead of a new feature request. I was aware about the interval merge, not much about the shift column action action in order to achieve this. Indeed, the "IsCurrent" is a nice to have in a DWH architecture, but dealing with "is null end date" or setting a "2099-12-31" as end date to the current record will help in situations where you want the current value to be carried over the time.

I will try this and I'll let you know if I have any problem or question.

Best regards,
Chris

1 Like