When do I update things via Slowly changing dimensions and when via Fact table?

I always come to a point where I do not know what to do if I model a DW.

Let’s take an example to make it easier:

We have a Star Schema with to fact table Orders. The Grain is on ordered Product per customer per order. The Fact Table has as a measurement called “Price”, which is the paid price for the product.

As dim tables we have
order
customer
status (ordered, shipped, delivered)
product

So one row in the fact table is the price (and for sure the foreign keys) of one product from one order from a customer.

So, how do we handle the status updates of an order? This will obviously change over the time from ordered to shipped and the to delivered. I have some solutions in mind, but always some points to disturb me. Does someone know what answer would be most common to use?

  1. Would we just add one row to the fact table with a timestamp? Then in the end we would have 3 rows (if we just consider the change of the status) for one ordered product? And would the timestamp than would be a measurement or also an additional dimension? Because fact tables just have these two things, as far as I know. Also, would the table then still be summarizable (Flow) by definition?

  2. Using slowly changing dimensions. But then we would need for each entry in the fact table an entry in the dimension table, and this seems not to be the sense of the dimension table as far as I know.

  3. Just changing the Foreign key for the status for the entry in the slowly changing dimension (overwrite the status id in the fact). But as far as I know, it is not allowed/common to change entries in the fact table?

  4. Create an accumulative fact table where the status is tracked with timestamps. But what do I do then with the paid price for the product? This is my main measurement, what I want to measure. Can I simply add it to the accumulative fact table?

You are thinking about this in the wrong way – a dimensional model is designed to meet your reporting requirements and not to mirror what may be happening in your source systems.

So asking what happens when the status changes in the source system is the wrong question. The right question is to ask how does a change of status affect my reporting requirements. Once you answer that, the design of your dimensional model becomes much easier.

For example, you might only want to know the status at the point a fact-generating event occurred; or you might want facts to show the current status regardless of when the fact was created; or you might want to track the order status changes and the dates each change happened.

All these scenarios require different dimensional modelling designs

Leave a Comment