How to Load only Half of the records from Source table

Load only Half of the records from Source table

The requirement is to process only half of the records from the source table. To do this we need to calculate the total number of rows (TOTAL-COUNT) in the source table and then assign ROW-COUNT and exclude the rows if the ROW-COUNT <= (TOTAL-COUNT/2).

Let’s design the mapping for this. We need below transformations to execute this logic.

Expression  : To assign the row count
Aggregator  : To Calculate the total input count (maximum (Row Count))
Joiner           : To Join Aggregator and Expression
Filter             : To process only half of the records

Load only Half of the records from Source table
This is Iconic view of the final mapping, Here we have SQ, Expression, Aggregator, Joiner & Filter transformations between source & Target.
Load only Half of the records from Source table
Just have a close look, how we are connected the ports between transformations.
Load only Half of the records from Source table
EXPRESSION – Added two new ports (row_count) to assign ROW-COUNT , JOIN_KEY is just a dummy key which will be used later in Joiner transformation
Load only Half of the records from Source table
AGGREGATOR – Passed the ROW-COUNT from expression and calculated MAX, this will give you the total number of records in the source, You can also use the COUNT function instead of MAX. Here also we need a dummy JOIN_KEY which will be used in Joiner transformation.
Load only Half of the records from Source table
JOINER – We need Joiner transformation to combine the ports from Expression & Aggregator. Here we will use the assigned dummy Join_key to join both the branch from same source. We also need to enable ‘Sorted Input’ in Joiner Transformation.
Load only Half of the records from Source table
FILTER – At this point we have the ROW-COUNT and the TOTAL-COUNT, Finally using the above filter condition we will process only up to half of the records and the remaining records will be filtered out.
Load only Half of the records from Source table
See the session log, Out of 110 source records only 55 records got loaded into target. If you use Router you can also  load the renaming records to another target table/file (Just like splitting records and writing to multiple targets).

Leave a Reply

Your email address will not be published. Required fields are marked *