How to Remove Duplicate Records Using Aggregator
There are several options available in Informatica to remove duplicate records from the source.
For Relational Tables
1. Source Qualifier > ‘SELECT DISTINCT’ option
2. Source Qualifier > SQL override (write your own query)
For Flat files or other sources
- Sorter > Aggregator
- Sorter > Expression > Filter
- Sorter > ‘DISTINCT’ option (removes only rows that are duplicated across all ports)
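The difference between removing duplicates across all ports and removing duplicates on a key port can be illustrated outside Informatica. The following Python sketch is only an analogy, not Informatica code; the sample rows and the column layout (JOB_ID, title, salary) are made up for illustration.

```python
# Hypothetical sample rows: (JOB_ID, JOB_TITLE, MIN_SALARY).
rows = [
    ("IT_PROG", "Programmer", 4000),
    ("IT_PROG", "Programmer", 4000),   # exact duplicate of the row above
    ("IT_PROG", "Programmer", 4500),   # same JOB_ID, different salary
    ("AD_VP", "Vice President", 15000),
]

# Analogy for a distinct across all ports: only rows that match
# in every column collapse into one.
distinct_all_ports = sorted(set(rows))

# Analogy for grouping on the key port: one row survives per JOB_ID.
# Here a later row overwrites an earlier one with the same key.
by_key = {}
for row in rows:
    by_key[row[0]] = row
distinct_by_key = sorted(by_key.values())

print(len(distinct_all_ports))  # two IT_PROG rows remain, plus AD_VP
print(len(distinct_by_key))     # one row per JOB_ID
```

In this sketch the all-ports distinct still leaves two IT_PROG rows because their salaries differ, which is why a key-based approach is needed when only JOB_ID must be unique.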
In this post we will cover how to remove duplicate records using the Aggregator.
How to remove duplicate records using Aggregator Transformation
Let’s create a mapping for this. To improve Aggregator performance, we are going to use a Sorter transformation to sort the input records before passing them to the Aggregator. Our source is a flat file containing job information. The file contains duplicate JOB_ID values, which we need to remove before loading the target table.
Source – Flat File
Target – Oracle Table
Key port – JOB_ID
Transformations –
Sorter – To sort the source records on the key port (improves Aggregator performance when the Aggregator’s Sorted Input option is enabled)
Aggregator – To remove the duplicate records by using ‘Group By’ on the key port
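The logic of this Sorter > Aggregator pipeline can be sketched in Python. This is an analogy under stated assumptions, not Informatica code: the function names `sorter` and `aggregator_group_by` and the sample job rows are hypothetical, and the sketch assumes the Aggregator passes through the last row of each group when no aggregate function is applied to the other ports.

```python
from itertools import groupby
from operator import itemgetter


def sorter(rows, key):
    """Stand-in for the Sorter: order rows by the key port."""
    return sorted(rows, key=itemgetter(key))


def aggregator_group_by(sorted_rows, key):
    """Stand-in for the Aggregator with 'Group By' on the key port.

    Relies on sorted input, so each group is a run of adjacent rows.
    Emits one row per group (assumed here to be the last row).
    """
    for _, group in groupby(sorted_rows, key=itemgetter(key)):
        last = None
        for last in group:
            pass
        yield last


# Hypothetical flat-file records with a duplicate JOB_ID.
jobs = [
    {"JOB_ID": "IT_PROG", "JOB_TITLE": "Programmer"},
    {"JOB_ID": "AD_VP", "JOB_TITLE": "Vice President"},
    {"JOB_ID": "IT_PROG", "JOB_TITLE": "Programmer"},
]

deduped = list(aggregator_group_by(sorter(jobs, "JOB_ID"), "JOB_ID"))
for row in deduped:
    print(row["JOB_ID"], row["JOB_TITLE"])
```

Because the rows arrive sorted, each JOB_ID group can be finished as soon as the key changes, without holding all groups in memory at once. That is the intuition behind sorting before aggregating.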
In the next post we will see how to remove duplicate records using Expression Transformation.