How to remove duplicate records using Sorter Transformation

How to remove duplicate records using Sorter Transformation

In the last two post we have seen how to remove duplicate records (Only one column) using Aggregator and Expression. We have identified the duplicate JOB_ID and skipped those rows from loading to target.

The same logic CANNOT be done using Sorter Transformation. In sorter we cannot check duplicate against a single column. If you enable ‘DISTINCT’ option it will automatically treat all the columns as the key fields and then check for the duplicates.

The below two records are duplicate in terms of first column (JOB_ID). Sorter transformation couldn’t check for duplicate against only one column, we may need to use either Aggregator or expression as we did in the previous post to remove the duplicates.

SA_REP Sales-Representative 6000 12008
SA_REP Sales-Rep                     4000 10000

The below two records are duplicate in terms of all the columns. Sorter transformation CAN be used here to remove these duplicate records by using ‘DISTINCT’ option.

SA_REP Sales-Representative 6000 12008
SA_REP Sales-Representative 6000 12008

Let’s see the mapping for this …

Source – Flat File
Target – Oracle Table
Transformations – Sorter  (With DISTINCT)

How to remove duplicate records using Aggregator
Source Flat File having duplicate records
How to remove duplicate records using Sorter Transformation
Complete Mapping with Sorter Transformation
How to remove duplicate records using Sorter Transformation
Have a close look on the transformations and understand how we connected the ports.
How to remove duplicate records using Sorter Transformation
Sorter – Enable ‘DISTINCT’ option which will automatically treat all columns as Key fields.
How to remove duplicate records using Sorter Transformation
Workflow Log – 4 duplicate records got removed from the source and everything else got loaded in to target.
How to remove duplicate records using Sorter Transformation
Target Table – See the results now, We have only unique records in the target table.

Leave a Reply

Your email address will not be published. Required fields are marked *