Sunday, June 12, 2011

How do you perform incremental logic or Delta or CDC?


Incremental means suppose today  we processed 100 records ,for tomorrow  run u need to extract  whatever the records inserted newly and  updated after previous run  based on last updated timestamp (Yesterday run) this process called as incremental or delta.

Approach_1: Using set max var ()
1)First need to create mapping var ($$Pre_sess_max_upd)and assign initial value as old date (01/01/1940).
2)Then override source qualifier query to fetch only LAT_UPD_DATE >=$$Pre_sess_max_upd  (Mapping var)
3)In the expression assign max last_upd_date value to $$Pre_sess_max_upd(mapping var) using set max var
4)Because its var  so it stores the max  last upd_date value in the repository, in the next run  our source qualifier query will fetch only the records updated or inseted after previous run.

Approach_2: Using  parameter file
First need to create mapping  parameter ($$Pre_sess_start_tmst )and assign initial value as old date (01/01/1940) in the parameterfile.
2 Then override source qualifier query to fetch only LAT_UPD_DATE >=$$Pre_sess_start_tmst (Mapping var)
3 Update mapping parameter($$Pre_sess_start_tmst) values in the parameter file using shell script or another mapping after first session get completed successfully
4 Because its mapping parameter   so every time we need to update the  value in the  parameter file after comptetion of main session.

Approach_3: Using oracle Control tables
1     First we need to create  two control tables cont_tbl_1 and cont_tbl_1 with structure of session_st_time,wf_name
2   Then insert  one record  in each table with session_st_time=1/1/1940 and workflow_name
3       create two store procedures one for update cont_tbl_1 with session st_time, set  property of  store procedure type as Source_pre_load  .
4 In  2nd store procedure  set property of store procedure type as Target _Post_load.this proc will update the session _st_time  in Cont_tbl_2 from cnt_tbl_1.
5 Then override source qualifier query to fetch only LAT_UPD_DATE >=(Select  session_st_time from cont_tbl_2 where workflow name=’Actual work flow name’.
SCD Type-II Effective-Date Approach
·        We have one of the dimension in current project called resource dimension. Here we are maintaining the history to keep track of SCD changes.
·        To maintain the history in slowly changing dimension or resource dimension. We followed SCD Type-II Effective-Date approach.
·        My resource dimension structure would be eff-start-date, eff-end-date, s.k and source columns.
·        Whenever I do a insert into dimension I would populate eff-start-date with sysdate, eff-end-date with future date and s.k as a sequence number.
·        If the record already present in my dimension but there is change in the source data. In that case what I need to do is
Update the previous record eff-end-date with sysdate and insert as a new record with source data.

No comments:

Post a Comment