FAQ: What is failover, detailed steps, and guiding principle?
How does synchronization happen between the Integration Service and the Repository Service in a high availability configuration?
Answer
High availability is an option that eliminates a single point of failure in a domain and provides minimal service interruption in the event of failure. 

High availability consists of the following components:

  • Resilience
    The ability of application services to tolerate transient network failures until either the resilience timeout expires or the external system failure is fixed.
  • Failover
    The migration of an application service or task to another node when the node running the service process becomes unavailable.
  • Recovery
    The automatic completion of tasks after a service is interrupted. Automatic recovery is available for PowerCenter Integration Service and PowerCenter Repository Service tasks. You can also manually recover PowerCenter Integration Service workflows and sessions. Manual recovery is not part of high availability.

 

Resilience

The domain tolerates temporary connection failures between application clients, application services, and nodes. A temporary connection failure might occur because an application service process fails or because of a network failure. When a temporary connection failure occurs, the Service Manager tries to reestablish connections between the application clients, application services, and nodes.
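
The retry behavior described above can be pictured with a small sketch. This is only an illustration of "keep trying to reconnect until the resilience timeout expires", not the Service Manager's actual implementation. The function name connect_with_resilience, the try_connect callback, and the retry_interval parameter are all hypothetical names introduced for this example.

```python
import time
from typing import Callable


def connect_with_resilience(
    try_connect: Callable[[], bool],
    resilience_timeout: float,
    retry_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry try_connect until it succeeds or the resilience timeout expires.

    All names here are hypothetical. Returns True if the connection was
    reestablished within the timeout window, False otherwise.
    """
    deadline = clock() + resilience_timeout
    while True:
        if try_connect():
            return True
        if clock() >= deadline:
            # The timeout expired before the failure was fixed.
            return False
        # Wait before the next attempt, but never sleep past the deadline.
        sleep(max(0.0, min(retry_interval, deadline - clock())))
```

The clock and sleep parameters exist only so the sketch can be exercised without real waiting; in normal use the defaults apply.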

 

Restart and Failover

To maximize operation time in the event of a failure, the Informatica domain can restart processes or fail them over to another node.

The Service Manager on the master gateway node accepts application service requests and manages the domain. If a master gateway node is not available, the domain shuts down. To allow the domain to fail over to another node, configure multiple gateway nodes.

Based on your license, you can also configure backup nodes for application services. The Service Manager can restart or fail over the following application services if a failure occurs:

  • Data Integration Service
  • Model Repository Service
  • PowerCenter Integration Service
  • PowerCenter Repository Service
  • PowerExchange Listener Service
  • PowerExchange Logger Service
  • Resource Manager Service

(a) Domain Failover

The Service Manager on the master gateway node accepts service requests and manages the domain and services in the domain. The domain can fail over to another node when the domain has multiple gateway nodes. Configure multiple gateway nodes to prevent domain shutdown when the master gateway node is unavailable.

The master gateway node maintains a connection to the domain configuration repository. If the master gateway node cannot connect to the domain configuration repository, the master gateway node may shut down.

If the domain has multiple gateway nodes and the master gateway node becomes unavailable, the Service Managers on the other gateway nodes elect another master gateway node. The domain tries to connect to the domain configuration repository with each gateway node. If none of the gateway nodes can connect, the domain shuts down and all domain operations fail. When a master gateway fails over, the client tools retrieve information about the alternate domain gateways from the domains.infa file.
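
As a rough mental model of the paragraph above, the following sketch picks a new master from a list of gateway nodes and reports a domain shutdown when no gateway can reach the domain configuration repository. The article does not describe how the election is actually decided, so the "first usable gateway in list order" rule is purely an assumption, and GatewayNode and elect_master are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GatewayNode:
    """Hypothetical model of a gateway node (not an Informatica API)."""
    name: str
    available: bool = True               # node is up and reachable
    can_reach_config_repo: bool = True   # can connect to the domain configuration repository


def elect_master(gateways: List[GatewayNode]) -> Optional[GatewayNode]:
    """Return a usable gateway node, or None if the domain must shut down.

    Assumption: the first usable node in list order wins. The real
    election rule is not specified in this article.
    """
    for node in gateways:
        if node.available and node.can_reach_config_repo:
            return node
    # No gateway can connect to the domain configuration repository.
    return None
```

With a single gateway node in the list, any failure of that node yields None, which corresponds to the domain shutdown that multiple gateway nodes are meant to prevent.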

 

(b) Application Service Restart and Failover

If an application service process becomes unavailable, the Service Manager can restart the application service or fail it over to a back-up node. When the Service Manager fails over an application service, it starts the service on another node that the service is configured to run on.

The following situations describe how the Service Manager restarts or fails over an application service:

    • If the primary node running the service process becomes unavailable, the service fails over to a back-up node. The primary node might be unavailable if it shuts down or if the connection to the node becomes unavailable.
    • If the primary node running the service process is available, the domain tries to restart the process based on the restart options configured in the domain properties. If the process does not restart, the Service Manager may mark the process as failed. The service then fails over to a back-up node and starts another process. If the Service Manager marks the process as failed, the administrator must enable the process after addressing any configuration problem.

If a service process fails over to a back-up node, it does not fail back to the primary node when the node becomes available. You can disable the service process on the back-up node to cause it to fail back to the primary node.
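
The two situations above can be summarized as a small decision function. This is a simplified sketch, not the Service Manager's logic: max_restart_attempts is a hypothetical stand-in for the restart options configured in the domain properties, try_restart is a hypothetical callback, and choosing the first back-up node in the list is an assumption.

```python
from typing import Callable, List, Optional, Tuple


def handle_process_failure(
    primary: str,
    backups: List[str],
    primary_available: bool,
    try_restart: Callable[[], bool],
    max_restart_attempts: int,
) -> Tuple[str, Optional[str]]:
    """Decide what happens when a service process becomes unavailable.

    Returns (action, node), where action is "restarted", "failed_over",
    or "unavailable". All names are hypothetical.
    """
    if primary_available:
        # Primary node is up: try to restart the process in place first.
        for _ in range(max_restart_attempts):
            if try_restart():
                return ("restarted", primary)
        # Restart did not succeed. In the real domain the process may be
        # marked as failed and must be re-enabled by an administrator.
    if backups:
        # Assumption: the first configured back-up node takes over.
        return ("failed_over", backups[0])
    return ("unavailable", None)
```

Note that the function never moves a service back to the primary node on its own, which mirrors the statement that a service does not fail back automatically.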

The following image shows how you can configure primary and back-up nodes for an application service:


[Image: Failover.png]


Recovery


Recovery is the completion of operations after an interrupted service is restored. The state of operation for a service contains information about the service process. Based on your license, the following components can recover after an interrupted service is restored:

  • Service Manager
    The Service Manager for each node in the domain maintains the state of service processes running on that node. 
  • PowerCenter Repository Service
    The PowerCenter Repository Service maintains the state of operation in the PowerCenter repository. The state of operation includes information about repository locks, requests in progress, and connected clients.
  • PowerCenter Integration Service
    The PowerCenter Integration Service maintains the state of operation in the shared storage configured for the service. The state of operation includes information about scheduled, running, and completed tasks for the service. 
  • Data Integration Service
    The Data Integration Service maintains the state of operation in the Model repository. The state of operation includes the state of the workflow and workflow tasks and the values of the workflow variables and parameters during the interrupted workflow instance. 

 

(a) Configuration for a Highly Available Domain


To minimize system downtime, configure Informatica domain components to be highly available. You can configure the following Informatica domain components to be highly available:

  • Domain
    One node in the domain acts as a gateway to receive service requests from clients and routes them to the appropriate service and node. To prevent domain shutdown when the master gateway node is unavailable, configure more than one gateway node.
  • Nodes
    Informatica services are processes that run on each node. You can configure Informatica services to restart automatically if they terminate unexpectedly.
  • Application Services
    The application services run on nodes in the Informatica domain. To minimize the application service downtime, configure backup nodes for application services.

(b) Application Service Failover Configuration


Based on your license, you can configure backup nodes so that application services can fail over to another node when the primary node fails. Configure backup nodes when you create or update an application service.

When you configure a backup node, verify that the node has access to run-time files that each application service requires to process data integration tasks such as workflows and mappings. For example, a workflow might require parameter files, input files, or output files.
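
One way to perform this verification is to run a short check on the backup node itself. The sketch below is a generic file-access check and does not use any Informatica API. The function name missing_runtime_files is hypothetical, and the list of paths to check is something you would supply from your own workflows.

```python
import os
from typing import Iterable, List


def missing_runtime_files(paths: Iterable[str]) -> List[str]:
    """Return the paths that are missing or not readable from this node.

    Run this on the backup node with the parameter files and input files
    that the service needs. An empty result means every listed file is
    present and readable by the current user.
    """
    return [
        p for p in paths
        if not (os.path.isfile(p) and os.access(p, os.R_OK))
    ]
```

The check covers only files that must already exist, such as parameter files and input files. For output files, checking that the target directory exists and is writable would be a separate test.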

(c) PowerCenter Integration Service Failover and Recovery Configuration


During failover and recovery, the PowerCenter Integration Service needs to access the state of operation files and process state information. The state of operation files store the state of each workflow and session operation in the $PMStorageDir directory.

Process state information includes information about which node was running the master PowerCenter Integration Service process and which node was running each session. You can configure the PowerCenter Integration Service to store process state information on a cluster file system or in the PowerCenter repository database.


More Information

Applies To
Product: PowerCenter; Data Quality
Problem Type: Stability
User Type: Administrator; Architect; Developer
Project Phase: Configure
Last Modified Date: 7/30/2020 11:32 PM
ID: 623065