Aggregator Transformation – My Notes
What is the use of Aggregator Transformation?
Ans:- Aggregator transformation can be used to aggregate calculations (Avg, Sum, Max ..) on groups. Since its changing the number of rows its an active transformation.
What are the Components of the Aggregator Transformation?
Ans:- Aggregate cache, Aggregate expression, Group by port & Sorted input.
Where the Integration Service creates the index and the data caches?
Ans:- It creates the index and the data caches in memory to process the transformation. If the Integration Service requires more space, it stores overflow values in cache files.
Can we use aggregate expressions in all transformations which support an Expression?
Ans:- No, The Designer allows aggregate expressions only in the Aggregator transformation.
What are the aggregate functions within an Aggregator transformation?
Ans:- AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE
Can we use Nested Aggregate Functions?
Ans:- Yes, we can include multiple single-level or multiple nested functions in different output ports in an Aggregator transformation.
Can we include both single-level and nested functions in an Aggregator transformation?
Ans:- No, if an Aggregator transformation contains a single-level function in any output port, you cannot use a nested function in any other port in that transformation. When you include single-level and nested functions in the same Aggregator transformation, the Designer marks the mapping or mapplet invalid.
How to handle Null Values in Aggregate Functions?
Ans:- When you configure the Integration Service, we can choose to treat null values in aggregate functions as NULL or zero. By default, the Integration Service treats null values as NULL in aggregate functions.
— Group By Ports —
What would be the result when you grouping data in Aggregator transformation?
Ans:- Aggregator transformation outputs the last row of each group When we grouping data (Enable Group By Port).
What would be the result If you do not group values (No Group By Ports)?
Ans:- The Integration Service returns one row for all input rows, typically returns the last row.
Can we specify a particular row to be returned instead of last row if using Group By Ports?
Ans:- Yes, by using the FIRST/LAST functions.
Can we use Non-Aggregate Expressions in Aggregator transformation?
Ans:- Yes we can (Eg: IIF( ITEM = ‘AAA battery’, battery, ITEM ))
What is an Incremental Aggregation?
Ans:- It’s a session option to make Integration Service to performs incremental aggregation, it passes source data through the mapping and uses historical cache data to perform aggregation calculations incrementally.
— Sorted Input —
How can we improve the performance of an Aggregator transformation?
Ans:- Use the sorted input option and pass the sorted records to Aggregator transformation.
What would be the result If you use sorted input but do not presort data correctly?
Ans:- We may receive unexpected results or the session fails.
What are the scenarios we cannot use sorted input?
Ans:- (1) If the aggregate expression uses nested aggregate functions, (2) If the session uses incremental aggregation, (3) If Source data is data driven.
If the input is not sorted how the Integration Service performs aggregate calculations?
Ans:- Since the data is not sorted, the Integration Service stores data for each group until it reads the entire source to ensure all aggregate calculations are accurate.
Do we need to configure cache memory for Aggregator transformations that use sorted ports?
Ans:- No, The Integration Service uses memory to process an Aggregator transformation with sorted ports. It does not use cache memory.
How to configure the port sort order in sorted input Aggregator transformation?
Ans:- Group by columns in the Aggregator transformation must be in the same order as they appear in the Sorter transformation.
What would be the result if the session sort order is not matching with the input sort order?
Ans:- The Integration Service fails the session, For example, if you configure a session to use a French sort order, data passing into the Aggregator transformation must be sorted using the French sort order.
Should the Aggregator transformation provide sorted output?
Ans:- No, To sort output from an Aggregator transformation, use a Sorter transformation.
— Performance Tips —
(1) Sorted input reduces the amount of data cached during the session and improves session performance. (2) Limit the number of connected input/output or output ports to reduce the amount of data the Aggregator transformation stores in the data cache. (3) Use a Filter transformation to filter out records before the Aggregator transformation to reduce unnecessary aggregation.