How to remove duplicate records Using Aggregator

How to remove duplicate records Using Aggregator

There are couple of options available in informatica to remove duplicate records from the source. 

For Relational Tables

1. Source Qualifier > ‘SELECT DISTINCT’ option
2. Source Qualifier > SQL override (Write your own Query)

For Flat files or other sources

  1. Sorter > Aggregator
  2. Sorter > Expression > Filter
  3. Sorter > ‘DISTINCT’ option (ONLY to remove duplicate across all ports)

In this session we will cover How to remove duplicate records using Aggregator.

How to remove duplicate records using Aggregator Transformation

Let’s create a mapping for this, To improve Aggregator performance we are going to use Sorter transformation to sort the input records and pass it to Aggregator. Here our source is a Flat file having Job information’s. The file contains duplicate JOB_ID which i need to remove before loading to target table.

How to remove duplicate records using Aggregator
Source Flat File having duplicate JOB_ID

Source – Flat File
Target – Oracle Table
Key port – JOB_ID
Transformations

Sorter   –  To sort the source records (To improve Aggregator Performance) 
Aggregator – To remove the duplicate records by using ‘Group By’ on key port

How to remove duplicate records using Aggregator
Complete Mapping with Sorter & Aggregator
How to remove duplicate records using Aggregator
Have a close look on the transformations, next will see each transformation properties.
How to remove duplicate records using Aggregator
Sorter Transformation – Here we are sorting the record using key JOB_ID
How to remove duplicate records using Aggregator
Aggregator Transformation – To remove the duplicate records just Group By the port JOB_ID which will group all unique records together and pass it to target. Enable Sorted Input to improve the performance of Aggregator Transformation.
How to remove duplicate records using Aggregator
Workflow Log – 4 duplicate records got removed from the source and everything else got loaded in to target.
How to remove duplicate records using Aggregator
Target Table – See the results now, We have only unique JOB_ID in the target table.

In the next post we will see how to remove duplicate records using Expression Transformation.

Leave a Reply

Your email address will not be published. Required fields are marked *