Skip Ribbon Commands
Skip to main content
Navigate Up
Sign In

Quick Launch

Average Rating:

facebook Twitter
Email
Print Bookmark Alert me when this article is updated

Feedback

Using Transaction Control to Restart Failed PowerCenter Sessions
Content

When working with PowerCenter, sometimes a session may fail. Sessions can fail as the result of reader, writer, or transformation threshold errors, or as the result of a database or system failure. If the session issued at least one commit before it failed, you might want to recover the session. You can develop mapping and workflow logic that allows you to restart a failed session and continue committing data from the point the previous session failed. This allows you to achieve session results that are the same as if the session never failed.

You can implement session recovery using the transaction control feature. First, you must create a load control table on the target database that can store load status information between session runs. Then, you must add logic to your mapping that updates the information in the load control table as the Integration Service commits data to the target; if the session fails, this mapping logic also allows the Integration Service to determine the last row committed to the target and resume committing from that row on subsequent session runs.

Finally, you must create a second mapping and session that resets the load control table. Apply a condition to the workflow link connecting the two sessions to ensure the Integration Service resets the load control table only after it successfully commits all data to the target.

This article describes each mapping, session, and workflow you must design to implement recovery in detail. You can add the mapping and workflow logic in these examples to existing mappings and workflows.

You can also download a copy of the mappings, sessions, and the workflow in XML format. You can import these objects into your repository. The attachment below contains the exported objects.

Note:

You may want to review the session recovery documentation to determine the most effective way for you to implement recovery.

Designing the Load Control Mapping

The load control mapping uses a series of transformations and expressions to update the load control table each time the Integration Service commits a transaction to the target table. On subsequent session runs, the mapping logic also obtains target load information from the load control table and filters previously committed rows. Figure 1 shows an iconic view of the sample mapping, m_LoadOrders:

Figure 1. Iconic View of Sample Load Control Mapping

m_loadOrders_iconic.jpg

Table 1 describes the transformations in the mapping.

Table 1. m_LoadOrders Summary

Transformation Name

Transformation Type

Description

seq_GenRowID

Sequence Generator

Generates row ID number for each source row.

exp_GetLastRowLoaded

Expression

Determines the ID number of the last row committed to the target table.

lkp_LOAD_CTRL

Unconnected Lookup

Looks up table name in load control table and returns ID number of last row committed to target.

fil_AlreadyLoaded

Filter

Filters rows already committed to the target.

exp_CountRowsInTX

Expression

Defines transactions by counting rows. Determines when the Integration Service reaches the end of a transaction.

tc_CommitTransaction

Transaction Control

Commits rows in transaction when end of transaction is reached. Commits data to both target table and the load control table.

fil_AllButLastRowInTX

Filter

Filters all but the last row in the transaction.

exp_SetStatus

Expression

Adds the name of the target table and the status of the load to the row ID of the last row in the transaction.

upd_LOAD_CTRL

Update Strategy

Updates information in the load control table.

When the Integration Service runs a session that uses this mapping, it performs the following tasks:

  1. The Integration Service generates a row ID number for each row read from the source.
  2. If the session was restarted after a failed session, the Integration Service retrieves the ID number of the last row committed to the target from the information in the load control table.
  3. The Integration Service filters rows previously committed to the target. This ensures that the Integration Service does not write duplicate information to the target.
  4. The Integration Service establishes target load checkpoints by grouping a specified number of rows into transactions.
  5. When a transaction is complete, the Integration Service commits the transaction to both the target and the load control table. This ensures that the data in the load control table reflects the number of rows successfully committed to the target.
  6. The Integration Service updates the load control table with the ID number of the last row committed to the target and the status of the target load.

Configuring Source and Target

In the sample mapping, m_LoadOrders, the Integration Service reads from a relational source table named ORDERS. For the purposes of this example, the Integration Service does not transform the source data and writes it to another relational ORDERS table. The ORDERS source and target tables use the following format:

Table 2: Source and Target Table Format

Column Name

Datatype

ORDER_ID

NUMBER

DATE_ENTERED

DATE

DATE_PROMISED

DATE

DATE_SHIPPED

DATE

EMPLOYEE_ID

NUMBER

CUSTOMER_ID

NUMBER

SALES_TAX_RATE

NUMBER

STORE_ID

NUMBER

Sorting Source Data

To ensure consistent recovery of data, you must ensure that the Integration Service processes and commits source rows in the same order each time you run the session. To accomplish this task, configure the Sorted Ports option in the Source Qualifier transformation.

Creating the Load Control Table

In order to restart and recover data from a failed session, you need to accurately and reliably store information about the failed session. You specifically need to know how many rows the failed session committed to the target before it failed. Subsequent session runs can use this information to determine which row to continue the target load from.

You can use a load control table for this task. The load control table contains information about the number of rows committed during the session run and the status of the target load. Later, you will configure the mapping so that the Integration Service updates the information in the load control table each time it commits data to the target.

Use the Warehouse Designer to design a target, LOAD_CTRL, for storing target load status information. After you design the target, you can use the Designer to create the table on the target database. Table 3 describes the format of the target definition.

Table 3: Load Control Target Format

Column Name

Datatype

Description

TableName

VARCHAR

The name of the target table.

LastRow

NUMBER

The ID number of the last row committed to the target. The Integration Service updates this value each time it commits data to the target.

LoadStatus

VARCHAR

The status of the target load. ACTIVE status indicates the load is in progress. DONE status indicates the target load and session succeeded.

Configure the TableName column as the primary key.

Generating Row ID Numbers

Now that you have a table you can use to record the last row committed to the target table, you also need to identify the source rows. To accomplish this, use a Sequence Generator transformation to generate a unique sequence value for each row the Integration Service reads from the source. The Integration Service can use the sequence values as row ID numbers.

Create a Sequence Generator transformation, seq_GenRowID, with the following properties:

Table 4: Sequence Generator Properties

Property

Value

Start Value

0

Increment By

1

Current Value

1

Reset

Enabled

NEXTVAL serves as the row ID number.

Getting the Last Committed Row

The first time you run the session, the target is empty. If you restart a failed session, however, the Integration Service may have committed a number of rows to the target before the previous session failed. To continue committing data from the point of failure, you need to determine the last row the Integration Service committed to the target.

To achieve this task, use an expression containing a lookup function to retrieve information from the load control table. The lookup function sends the name of the target table to an unconnected Lookup transformation, which returns the ID number of the last row committed to the target. Figure 2 illustrates this section of the mapping.

Figure 2. Mapping Logic for Determining Last Committed Row

GetLastRowLoaded_logic.jpg

Expression Transformation

Create an Expression transformation, exp_GetLastRowLoaded. The transformation receives rows from the source. The transformation also contains three new ports: RowID, LastRow, and varLastRow. Figure 3 illustrates the port configuration of exp_GetLastRowLoaded.

Figure 3. exp_GetLastRowLoaded Port Configuration

expGetLastRowLoaded_ports.jpg

RowID receives the value for NEXTVAL from seq_GenRowID. This associates a row ID number with each source row.

LastRow outputs the value of varLastRow.

varLastRow uses the following expression to determine the value of LastRow based on the information in the load control table:

IIF(RowID = 1, IIF(isnull(:LKP.LKPLOAD_CTRL('ORDERS')), 0,:LKP.LKPLOAD_CTRL('ORDERS')),varLastRow)

For the first row the Integration Service reads from the target (when RowID=1,because the sequence generator resets for each run ), the expression uses a :LKP expression to pass the name of the target table, ORDERS, to the Lookup transformation lkp_LOAD_CTRL. If the result of the :LKP expression is null, as it would be for the first run of the session, the Integration Service returns a value of 0 for varLastRow. If session commit information for ORDERS exists in the load control table, the :LKP expression returns the ID of the last row committed to the target table.

Lookup Transformation

lkp_LOAD_CTRL performs a lookup on the load control table LOAD_CTRL. In this example, lkp_LOAD_CTRL performs a lookup on the TableName column in the control table and returns the value In the LastRow column. When you create the lkp_LOAD_CTRL, use LOAD_CTRL as the lookup table. Figure 4 illustrates the port configuration of lkp_LOAD_CTRL.

Figure 4. lkp_LOAD_CTRL Port Configuration

lkpLoadCtrl_ports.jpg

Add the input/output port inTableName to the transformation. Configure TableName, LastRow, and LoadStatus as lookup ports. Also, configure LastRow as the return port, since lkp_LOAD_CTRL is unconnected.

On the Condition tab of the transformation properties, create the following lookup condition:

TableName = inTableName

Filtering Previously Committed Rows

After configuring the mapping to assign each source row an ID number, and determining the last row committed to the target in a previous target, we must prevent any rows that are already committed to the target from being committed again. The first time you run the session, the Integration Service will not filter any rows. If you restarted the session after it failed, however, the Integration Service will filter out rows that are already committed to the target.

We can use a Filter transformation to perform this task. Create a Filter transformation fil_AlreadyLoaded. Connect all ports in exp_GetLastRowLoaded to fil_AlreadyLoaded. You want to filter out rows where the row ID number is less than the ID number of the last row committed to the target. Create the following filter condition:

IIF(RowID <= LastRow, FALSE, TRUE)

Defining Transactions

After filtering previously committed rows, your next task is to create a mechanism the Integration Service can use to consistently commit source rows to the target and target load status information to the load control table. You can accomplish this by defining transactions and then controlling when the transactions are committed to the target and load control tables.

Define your transactions according to a set number of source rows, in this case five. Each time the Integration Service processes five rows, it commits the rows to the target and updates the ID number of the last committed row in the load control table. Use an Expression transformation to count the number of rows in the current transaction.

Create the Expression transformation exp_CountRowsInTX. Figure 5 illustrates the port configuration of exp_CountRowsInTX.

Figure 5. exp_CountRowsInTX Port Configuration

expCountRowsInTX_ports.jpg

Connect the source data ports and the RowID port from fil_AlreadyLoaded to the transformation.

To count the number of rows in the transaction, create a variable port called varRowCount. It uses the following expression to count the number of rows in the current transaction:

IIF ( varRowCount> $$CommitRowCount, 1, varRowCount + 1)

$$CommitRowCount is a mapping parameter of type integer with an initial value of 5. For each row passing through the transformation, the local variable varRowCount increases by one until it exceeds the value specified for $$CommitRowCount (five in this example). If varRowCount exceeds $$CommitRowCount, the Integration Service assigns varRowCount a value of 1.

Create an output port, EndTrans, that signals the end of the transaction. When varRowCount is 1, the transaction is complete. Create the following expression:

IIF (varRowCount = 1, TRUE, FALSE)

After counting off the rows in the transaction, we can commit the transaction.

Tip: This example defines transactions as a group of five rows. You can increase the size of the transaction by increasing the initial value of the mapping parameter $$CommitRowCount.

Committing Transactions

After defining the transaction, use a Transaction Control transformation to commit the transactions to the target and the load control table. Create the Transaction Control transformation tc_CommitTrans to perform this task.

tc_CommitTrans receives input from all output ports in exp_CountRowsInTX. It uses the following transaction control condition:

IIF(EndTrans, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)

When EndTrans is true, the Integration Service commits the transaction, begins a new transaction, and writes the current row to the target; the current row is in the new transaction. If EndTrans is FALSE, the Integration Service does not perform any transaction change for the row.

The rows in tc_CommitTrans that contain source data connect directly to the ORDERS target. The output ports RowID and EndTrans connect to the Filter transformation fil_AllButLastRowInTX.

Updating the Load Control Table

We can now use a series of transformations to update the load control table with status information related to the target. The mapping uses a Filter transformation to filter all rows except the last row in the committed transaction. An Expression transformation then appends the name of the target table and the load status to the row ID number, and then an Update Strategy transformation updates the information in the load control table.

Filtering the Last Row in the Transaction

The Filter transformation fil_AllButLastRowInTX receives row ID (RowID port) and transaction information (EndTrans port), and filters all rows except those where EndTrans is TRUE. fil_AllButLastRowInTX uses the following filter condition:

IIF (EndTrans, TRUE, FALSE)

The RowID port in fil_AllButLastRowInTX connects to exp_SetStatus.

Appending Table Name and Load Status to the Transaction

In this example, the Expression transformation exp_SetStatus receives the row ID number of the last row in the transaction. exp_SetStatus also contains the output ports TableName of type String(20,0) and LoadStatus of type String(10,0). These output ports append the name of the target table and the load status to the row ID. In this example, exp_SetStatus appends table name ORDERS and load status ACTIVE to the row ID number for each row.

The load status 'ACTIVE' indicates that a target load is in progress.

All output ports in exp_SetStatus connect to the Update Strategy transformation upd_LOAD_CTRL.

Updating the Load Control Table

The Update Strategy transformation upd_LOAD_CTRL connects to the load control target and updates the information in the load control table. upd_LOAD_CTRL uses the following Update Strategy Expression:

DD_UPDATE

Also, when you configure the session, configure the update strategy for the LOAD_CTRL on the Targets tab of the session properties to Update Else Insert. If the row does not exist in the table, then the Integration Service will insert the row.

The Integration Service updates the load status information related to the target in the load control table.

If the load control session succeeds, you no longer need the target load information in the load control table for the target. You can design a second mapping that resets the information in the load control table. Later, you can configure the workflow to run the second session when all data is committed to the target.

Designing the Load Control Reset Mapping

If the load control session succeeds, you no longer need the target load information in the load control table for the target. You can design a second mapping that resets the information in the load control table. Later, you can configure the workflow to run the second session when all data is committed to the target. Figure 6 shows an iconic view of the mapping.

Figure 6. Iconic View of Load Control Reset Mapping

m_UpdateLoadControl_iconic.jpg

Table 5 describes the transformations in the load control reset mapping, m_ResetLoadControl.

Table 5. m_ResetLoadControl Summary

Transformation Name

Transformation Type

Description

exp_ResetStatus

Expression

Changes the ID of the last row committed to the target to zero and the load status to DONE.

upd_LOAD_CTRL

Update Strategy

Updates information in the load control table.

In this example, the mapping reads data from the TABLENAME, LASTROW, and LOADSTATUS columns in the load control table LOAD_CTRL. The Source Qualifier transformation SQ_LOAD_CTRL uses the following source filter to read only rows where ORDERS is the table name:

LOAD_CTRL.TABLENAME = 'ORDERS'

The TABLENAME port in SQ_LOAD_CTRL connects to exp_ResetStatus. exp_ResetStatus contains the output port LastRow of type Decimal(15,0). LastRow outputs the following expression:

0

exp_ResetStatus also contains the output port LoadStatus of type String. LoadStatus outputs the following expression:

'DONE'

All ports in exp_ResetStatus connect to the Update Strategy transformation upd_LOAD_CTRL.

upd_LOAD_CTRL connects to the load control target definition and updates the information in the load control table. upd_LOAD_CTRL uses the following Update Strategy Expression:

DD_UPDATE

Also, when you configure the session, configure the update strategy for the LOAD_CTRL on the Targets tab of the session properties to Update Else Insert. If the row does not exist in the table, then the Integration Service will insert the row.

The Integration Service updates the load status information related to the target in the load control table. In this example, for the table ORDERS the last row committed to the target is reset to 0 and the load status indicates DONE.

Configuring the Workflow

After designing both mappings, create a session that uses each mapping. When you link the sessions in the workflow, apply a condition to the link so that the Integration Service runs the load control reset session only if the first session succeeds.

In this example, the workflow w_RestartOrders contains the sessions s_m_LoadOrders and s_m_ResetLoadControl. Figure 6 shows an iconic view of the workflow:

Figure 6 w_RestartOrders Workflow Configuration

w_mRestartOrders.jpg

The link that connects s_m_LoadOrders and s_m_ResetLoadControl uses the following link condition:

$s_m_LoadOrders.Status = SUCCEEDED

The Integration Service executes the link and runs s_m_ResetLoadControl only if s_m_LoadOrders succeeds. If s_m_LoadOrders fails, the workflow completes but does not run s_m_ResetLoadControl. This allows you to restart s_m_LoadOrders and recover the target before resetting the information in the load control table.

Adapting Existing Mappings

This article described how to design a series of transformations that, when incorporated in a mapping, update a load control table with status information about the last row committed to the target table. If the session fails, the Integration Service uses this mapping logic to continue committing data to the target from the point where the previous session failed. This article also described how to design a second mapping and workflow link that resets the load control table after the Integration Service successfully commits all data to the target.

You can add the mapping and workflow logic described in the example mappings and workflow to existing sessions. When you add the load control mapping logic to an existing mapping, insert the load control transformations after the last existing transformation and before the target definition. Also, add the mapping parameter $$CommitRowCount to the mapping.

Since you will also need to edit the expressions described in this article to reflect the name of your existing target table, you can apply the load control mapping logic to one target only. If your mapping contains multiple targets, you must duplicate the transformations and expressions for each target in the mapping. Multiple target tables on the same target database can store load control status information in the same load control table, however.

You must also add a session that uses the load control reset mapping and the link condition described above to your existing workflow. If your mapping contains multiple targets and you developed load control transformations for each target, you must also edit the source filter in SQ_LOAD_CTRL to filter for each target table name. Further, if you are using multiple load control tables, you must also link a load control reset session for each load control table to the existing session.



More Information


Reference


Attachments


Applies To
Product: PowerCenter
Problem Type:
User Type: Administrator
Project Phase:
Product Version: PowerCenter
Database:
Operating System:
Other Software:

Last Modified Date: 9/29/2009 4:58 AM ID: 15149
People who viewed this also viewed

Feedback

Did this KB document help you?



What can we do to improve this information (2000 or fewer characters)