Split the non-key columns to separate tables with key column in both
Scenario 2:
Split the non-key columns to separate tables with key column in both / How to split the data of source table column-wise with respect to primary key. See the source and target tables below.
source table: ID is the key column, Name and Phone No are non-key columns
ID | Name | Phone No |
10 | AAA | 123 |
20 | BBB | 234 |
30 | CCC | 434 |
40 | DDD | 343 |
50 | EEE | 442 |
Target Table 1
ID | Name |
10 | AAA |
20 | BBB |
30 | CCC |
40 | DDD |
50 | EEE |
Target Table 2
ID | Phone No |
10 | 123 |
20 | 234 |
30 | 434 |
40 | 343 |
50 | 442 |
Solution:
Step 1: Source qualifier: get the source table to the mapping area. See image below.
For example the file content looks like as below:-
some Header here
col1 col2 col3 col4
data1 data2 data3 data4
data5 data6 data7 data8
data1 data2 data3 data4
data1 data2 data3 data4
footer
Step1: Drag the source to mapping area.
Step3: In filter write the condition like in the picture
Step 2: Drag all the port from (from the previous step) to the Aggregator transformation and group by the key column. Since we have to split the columns to two different tables with the key column in each, so we are going use two expression transformation, each will take the key column and one non-key column. Connect aggregator transformation with each of the expression transformation as follows.
Step 3: We need another set of aggregator to be associated with each of the expression tranformation from the previous step.
Step 4: In the final step connect the aggregators with the two target tables as follows.
Here is the iconic view of the entire mapping.
Separating duplicate and non-duplicate rows to separate tables
Scenario 3:
How to segregate the duplicate and distinct rows from source table to separate target tables?
source table:
COL1 | COL2 | COL3 |
a | b | c |
x | y | z |
a | b | c |
r | f | u |
a | b | c |
v | f | r |
v | f | r |
Target Table 1: Table containing all the unique rows
COL1 | COL2 | COL3 |
a | b | c |
x | y | z |
r | f | u |
v | f | r |
Target Table 2: Table containing all the duplicate rows
COL1 | COL2 | COL3 |
a | b | c |
a | b | c |
v | f | r |
Solution:
Step 1: Drag the source to mapping and connect it to an aggregator transformation.
Step 2: In aggregator transformation, group by the key column and add a new port call it count_rec to count the key column.
Step 3: connect a router to the aggregator from the previous step.In router make two groups one named "original" and another as "duplicate"
In original write count_rec=1 and in duplicate write count_rec>1.
In original write count_rec=1 and in duplicate write count_rec>1.
The picture below depicting group name and the filter conditions
Step 4: Connect two group to corresponding target table.
Remove footer from your file
Scenario 5: How to remove footer from your file ?
For example the file content looks like as below:-
some Header here
col1 col2 col3 col4
data1 data2 data3 data4
data5 data6 data7 data8
data1 data2 data3 data4
data1 data2 data3 data4
footer
Just we have to remove footer from the file.
Solution:
Step1: Drag the source to mapping area.
Step2: After that connect a filter or router transformation.
Step3: In filter write the condition like in the picture
Scenario6:
Get the first top five salaries with out using rank transformation
Step 4:Finally pass it over to target.
Remove header from your file
Scenario 6: How to remove header from a file ?
Solution
Step1: After mapping go to workflow and scheduled it.
Step2: Just double click on the session and go to mapping option.
Step2: Just double click on the session and go to mapping option.
Step3: Select the source and go to the set file properties.
flat file properties
Step4:Chose the advance option. Set number of initial rows skip: 1 ( it can be more as per requirement )
Aggregating rows on group count basis
Scenario 13: There are 4 departments in Emp table. The first one with 100,2nd with 5, 3rd with 30 and 4th dept has 12 employees. Extract those dept numbers which has more than 5 employees in it, to a target table.
Solution:
Put the source to mapping and connect the ports to aggregator transformationMake 4 output ports in aggregator as in the picture above : count_d10, count_d20, count_d30, count_d40.
For each port write expression like in the picture below.
Then send it to expression transformation
In expression make four output ports (dept10, dept20, dept30, dept40) to validate dept no
And provide the expression like in the picture below.
For each port write expression like in the picture below.
Then send it to expression transformation
In expression make four output ports (dept10, dept20, dept30, dept40) to validate dept no
And provide the expression like in the picture below.
5. Then connect to router transformation. And create a group and fill condition like below.
6. Finally connect to target table having one column that is dept no.
6. Finally connect to target table having one column that is dept no.
Excellent sir....
ReplyDelete