Skip to main content

Use of the SCD methodology by the Oracle Data Integrator 12

Part 1: Adjusting the validity of the dataset

As described in the previous blog entry, the Oracle Data Integrator (ODI) offers an integrated solution for keeping a history of data with the SCD (slowly changing dimension) methodology. Upon closer consideration and when an integration quantity is loaded practically into a target table using the integration knowledge module (IKM) SCD, it is noticeable that the ODI uses certain default values for the end of the validity period of the dataset. This can be illustrated in the following scenario.

stage-dim-test

Data is written in the dimension table DIM_TEST_SCD from the staging table STAGE_TEST_SCD.

Initial situation

In order to comprehend the initial situation more easily, in the following test cases it is described how the functions can be presented.

Test case 1: Initial loading of 3 datasets on 05/05/2017


initiale-beladung-von-3-datensaetzen-2

 

Three datasets have arrived in the staging table that are to be written into the empty table DIM_TEST_SCD by means of IKM SCD.

 

staging-tabelle-1

 

initiale-beladung-von-3-datensätzen

 

As is seen in the result, the end of the validity period of the dataset is set as standard by the IKM at 01/01/2400 to depict unending validity.

Test case 2: Loading of 3 datasets on 06/05/2017

 

initiale-beladung-von-3-datensätzen

 

The staging table now provides changes. The dataset with NK = 1 remains unchanged (ATTR); for NK = 2, ATTR was changed to “C”; NK = 3 is no longer provided; and NK = 4 was newly added.

 

initiale-beladung-von-5-datensätzen

 

In the case of a change, the original set is terminated in its validity and a new dataset with unending validity is inserted. As can be seen in this example, in the event of termination the validity of the dataset is set at the same value as the start of the validity period upon insertion of a dataset.

Adjusting the validity of the dataset

As has become clear from the examples, the IKM SCD inserts 01/01/2400 as standard for the unending validity and the loading date for the termination of a dataset. However, in many situations it can be desirable to insert a specially defined value for the unending validity or to set the validity of the dataset to “loading date – 1” in the case of termination.

For the validity of the dataset to be set at the desired values, it is necessary to perform minor adjustments in the IKM SCD. For our example, we assume that the date for unending validity is set at 31/12/9999 and the date for a set to be terminated is set at “loading date – 1”. In the ODI, the IKMs can be opened and edited by double-clicking on them in the designer tab under the corresponding project folder. Under the “Tasks” tab, the following tasks can be identified, which need to be adjusted for the desired changes:


staging-tabelle

 

For the tasks in the green boxes, the validity of the dataset is adjusted. To this end, the dots beside the code snippet presented in the “Target Command” area are clicked on.

 

codeausschnitt-1

 

An adjustment was performed on the task “Flag rows for update” as an example above.

For the validity of the dataset to be set to the desired value in the case of a termination as well, the validity must be reduced by one day in the task “Historize old rows”:

 

codeausschnitt-2

 

Following the adjustments, the IKM SCD can be saved and the test cases repeated. After the execution of the ODI mapping in test case 2, the following combinations now result:

 

initiale-beladung-von-5-datensätzen

 

As desired, the validity of the dataset was set to 31/12/9999 and, in the case of a termination, at “loading date – 1”.

In the next blog entry on the topic of “Adjustments to the IKM Slowly Changing Dimensions”, it is highlighted how deletes can also be taken into account in the IKM SCD thanks to further adjustments.