Lookup Transformation – My Notes
What is Lookup transformation?
Ans:- It can be used to look up data in a flat file, relational table, source Qualifier, view, or synonym.
What are the various tasks can be performed by Lookup Transformation?
Ans:- Get a related value, Get multiple values, Perform a calculation, Update slowly changing dimension tables.
What are the different types of Lookup?
Ans:- Relational or flat file lookup, Pipeline lookup, Connected or unconnected lookup, Cached or uncached lookup.
What are the Lookup Source Types?
Ans:- A relational table, flat file, or a source qualifier can be used as a lookup source.
Can we query multiple tables in a Relational Lookups?
Ans:- Yes. Override the default SQL statement to add a WHERE clause or to query multiple tables.
Can we use indirect files as lookup sources in a Flat File Lookups?
Ans:- Yes. Use indirect files as lookup sources by configuring a file list as the lookup file name.
How we can improve the performance of a Flat File Lookups?
Ans:- For optimal caching performance use sorted input for the lookup.
In a flat file lookup source, the keys are grouped, but not sorted, What would be the result?
Ans:- The Integration Service can cache the data, but performance may not be optimal.
In a flat file lookup source, if the keys are not grouped, What would be the result?
Ans:- The Lookup transformation returns incorrect results.
How to perform a lookup on an application source that is not a relational table or flat file?
Ans:- Using Pipeline Lookups.
What is the lookup source for a Pipeline Lookups?
Ans:- Source qualifier.
Can we perform pipeline lookups on an Application Multi-Group Source Qualifier transformations?
Ans:- No.
In a Pipeline Lookups, source and source qualifier can be connected to the same pipeline with the Lookup transformation?
Ans:- No. The lookup source and source qualifier are in a different pipeline from the Lookup transformation and it is called a partial pipeline with no target.
What is partial pipeline?
Ans:- It is a separate pipeline from the Lookup transformation which contains lookup source and source qualifier with no target. The partial pipeline is in a separate target load order group in session properties.
How the integration service process the pipeline lookup?
Ans:- The Integration Service reads the lookup source data in the partial pipeline and passes the data to the other pipeline having Lookup transformation to create the cache.
How to improve performance of a Pipeline Lookup transformation?
Ans:- Create multiple partitions in the partial pipeline to improve performance.
Can we configure the target load order for partial pipeline?
Ans:- No, We cannot configure the target load order with the partial pipeline.
What is a connected Lookup transformation?
Ans:- It is a transformation that has input and output ports that you connect to other transformations in a mapping.
What is a Unconnected Lookup transformation?
Ans:- An unconnected Lookup transformation receives input from the result of a :LKP expression in a transformation, It’s not connected to any other transformations.
How an Unconnected Lookup transformation returns the result of the query?
Ans:- Using return port.
What is the most common use for unconnected Lookup transformations?
Ans:- To update slowly changing dimension tables
What would happen If the unconnected Lookup query fails to return a value?
Ans:- The return port receives a null value.
For an unconnected lookup, can we perform the same lookup multiple times in a mapping?
Ans:- Yes, we can call the lookup multiple times with a :LKP expression in a transformation that allows expressions.
What are the common transformation that allows Lookup expression?
Ans:- Aggregator, Expression & Update Strategy transformation.
What are the Lookup Components needs to be configured for a Lookup transformation in a mapping?
Ans:- Lookup source, Ports, Properties, Condition.
How to improve lookup initialization time in a lookup transformation?
Ans:- Add an index to the lookup table and include every column in the lookup condition.
How to improve performance in a Cached lookup?
Ans:- Improve performance by indexing the columns in the lookup ORDER BY.
How to improve performance in a Uncached lookups ?
Ans:- Since the Integration Service issues a SELECT statement for each row passing into the Lookup transformation, you can improve performance by indexing the columns in the lookup condition.
What are the different port types in a Lookup transformation?
Ans:- Input port, Output port, Lookup port & Return port.
What is Lookup Port?
Ans:- It represent the columns of data to return from the lookup source.
What is Return port?
Ans:- Its only for unconnected Lookup transformations, Designates the column of data you want to return based on the lookup condition.
What is associated expression?
Ans:- Its only for dynamic cache lookups and contains the data to update the lookup cache. It can contain an expression or an input port name.
Can we delete lookup ports from a flat file lookup?
Ans:- No. It will cause the session to fail.
Can we delete lookup ports from a relational lookup?
Ans:- Yes, we can only if the mapping does not use the lookup port.
What are the Lookup Source File-type available to configure Flat File Lookups in a Session?
Ans:- Direct & Indirect File Type, Indicates whether the lookup source file contains the source data or a list of files with the same file properties. Choose Direct if the lookup source file contains the source data. Choose Indirect if the lookup source file contains a list of files.
How many cache will be used for Indirect Lookup Source Filetype?
Ans:- The Integration Service creates just ONE cache for all files.
If you use sorted input with indirect files, What would happen If the range of data overlaps?
Ans:- The Integration Service processes the lookup as an unsorted lookup source.
— Lookup Query —
What is Lookup Query?
Ans:- A default query that the integration Service runs for a relational lookup or a pipeline lookup against a relational table.
What are the elements of a Default Lookup Query?
Ans:- SELECT statement and an ORDER BY clause. The Integration Service generates the ORDER BY clause. We cannot view this when we generate the default SQL.
Can we override the ORDER BY Clause in the default Query?
Ans:- Yes, we can suppress the default ORDER BY clause and enter an override ORDER BY with fewer columns to increase performance. If we use pushdown optimization, we cannot override the ORDER BY clause.
How to suppress the default ORDER BY clause?
Ans:- Place two dashes ‘–’ as a comment notation after the ORDER BY clause to suppress the ORDER BY clause that the Integration Service generates.
What are the ports listed in the ORDER BY clause?
Ans:- The ORDER BY clause must contain the condition ports in the same order they appear in the Lookup condition which will help the Integration Service to build the data cache with the keys in sorted order.
What will happen if the lookup query resulted a non sorted records (Query with no ORDER BY)?
Ans:- If the data is not sorted on the keys, you might get unexpected results.
What would be the result If you override the lookup query with an ORDER BY clause without adding comment notation?
Ans:- The lookup fails.
What would be the result If any lookup name or column name contains a database reserved word, such as MONTH or YEAR?
Ans:- The session fails with database errors when the Integration Service executes SQL against the database.
How to resolve of a session fails because of Reserved word in the Lookup query?
Ans:- You can create and maintain a reserved words file, reswords.txt, in the Integration Service installation directory. When the Integration Service initializes a session, it searches the reswords.txt file and places quotes around reserved words, and then executes the SQL against source, target, and lookup databases.
What would be the results if you add or delete any ports from the SELECT statement while overriding the Lookup Query?
Ans:- The session fails. Generate the default query, and then configure the override. This ensures that all the lookup/output ports are included in the query.
Can we use subqueries in the SQL override for uncached lookups?
Ans:- No.
What are the elements When you generate the default query for Uncached Lookup?
Ans:- SELECT statement that includes the lookup and output ports and the WHERE clause based on the lookup condition.
What are the elements When you generate the default query for Uncached unconnected Lookup?
Ans:- SELECT statement includes the lookup ports and the return port. Integration Service does not generate the WHERE clause from the condition that you configure in the condition tab.
— Lookup Condition —
Can a Lookup transformation exist without Lookup Condition?
Ans:- No, You must enter a lookup condition in all Lookup transformations.
How the Integration Service evaluates multiple lookup conditions?
Ans:- The Integration Service evaluates each condition as an AND, not an OR.
How to improve performance in a multiple lookup condition lookup?
Ans:- Follow the condition order Equal to (=), Less than (<), greater than (>), less than or equal to (<=), greater than or equal to (>=), Not equal to (!=) to optimize lookup performance.
Can Integration service matches NULL values in the lookup conditions?
Ans:- Yes, if an input lookup condition column is NULL, the Integration Service evaluates the NULL equal to a NULL in the lookup.
Can we use all operators for dynamic cache enabled Lookup transformation?
Ans:- No, We can use only the equality operator (=) in the lookup condition If you configure a Lookup transformation to use a dynamic cache.
— Multiple Matches —
Should the Integration service support multiple matches in the lookup source?
Ans:- Yes, configure a Lookup transformation to handle multiple matches in the following ways. Use the FIRST or LAST matching value, Use any matching value, Use all values, Return an error.
Can we return multiple rows from an unconnected Lookup transformation?
Ans:- No. We cannot return multiple rows from an unconnected Lookup.
Which value will return if you use ‘Any matching value’?
Ans:- The first value that matches the lookup condition.
What will happen if you ‘Use all values’?
Ans:- The transformation becomes an active transformation. We must configure the transformation to return all matching rows when we create the transformation. We cannot change the property after you create the transformation.
What will happen if you use ‘Return an error’?
Ans:- The Integration Service marks the row as an error and writes the row to the session log by default.
Should the Integration service support multiple matches in Lookup transformation has a dynamic cache?
Ans:- No, The session fails while the Integration Service is caching the lookup table or looking up the duplicate key values.
— Lookup Cache —
Where the Integration Service builds the cache?
Ans:- Builds a cache in memory when it processes the first row of data in a cached Lookup transformation.
What will happen if the memory cache overflowed?
Ans:- The Integration Service also creates cache files by default in the $PMCacheDir and stores the overflow values in the cache files.
What will happen to the cache after the session completes its run?
Ans:- The Integration Service releases cache memory and deletes the cache files unless you configure the Lookup transformation to use a persistent cache.
What is Data & Index cache?
Ans:- The Integration Service stores condition values in the index cache and output values in the data cache.