ERROR: "com.informatica.dtm.transport.DTFUncheckedException: [DtmDTF_0001] Data Transport Error" during pushdown mapping execution in Blaze Engine using Informatica DEI
Problem Description


When Hadoop pushdown mappings run on the Blaze Engine in Informatica 'Data Engineering Integration' (DEI), formerly known as 'Big Data Management' (BDM), mapping execution fails with the following error messages in the logs:

 

Mapping Run log

 

2018-03-12 09:30:37 <TASK_139731211790080-WRITER_1_*_1> INFO: [WRT_8003] Writer initialization complete.
2018-03-12 09:38:55 <TASK_139731264239360-READER_1_1_1> SEVERE: [_0] com.informatica.dtm.transport.DTFUncheckedException: [DtmDTF_0001] Data Transport Error,

Origin :[Pipe :: receiveMessage].

 

Blaze Grid Manager Log

 

2018-03-12 09:36:27.352 <AMRM Callback Handler Thread> INFO: Container completion status: id [container_e40_1520440847746_0012_01_000008];

state [COMPLETE]; diagnostics [Container [pid=10802,containerID=container_e40_1520440847746_0012_01_000008] is running beyond physical memory limits. 

Current usage: 2.0 GB of 2 GB physical memory used; 3.7 GB of 4.2 GB virtual memory used. Killing container.

 

Dump of the process-tree for container_e40_1520440847746_0012_01_000008 :

|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE

|- 10802 10799 10802 10802 (bash) 0 1 116170752 429 /bin/bash -c /data08/yarn/nm/usercache/informatica/appcache/application_1520440847746_0012/container_e40_1520440847746_0012_01_000008/infa_rpm.tar/

services/shared/hadoop/cloudera_cdh5u10/scripts/infagjlauncher.sh -N OOP_Container_Manager_Service_workernode03.informatica.com -E

Container killed on request. Exit code is 143

Container exited with a non-zero exit code 143

]; exit status [-104]..

2018-03-08 11:34:27.355 <AMRM Callback Handler Thread> SEVERE: Service [OOP_Container_Manager_Service_3] has stopped running..

2018-03-08 11:34:29.856 <AMRM Callback Handler Thread> INFO: Service [DTMProcess_195] has stopped running..

2018-03-08 11:34:29.856 <AMRM Callback Handler Thread> INFO: Service [DTMProcess_194] has stopped running..
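
When several Blaze Grid Manager logs have to be triaged, the memory figures in the container diagnostics can be extracted with a small script. The following is a minimal Python sketch and not part of the Informatica product. The function name 'parse_container_usage' and the regular expression are assumptions, based only on the message format shown in the log above. Whitespace is matched loosely so that wrapped log lines still parse.

```python
import re

# One "<number> <unit>" pair as printed in the YARN diagnostics, e.g. "2.0 GB".
_NUM = r"([\d.]+)\s+([KMGT]B)"

# Assumed message shape, taken from the Grid Manager log excerpt in this article.
USAGE_RE = re.compile(
    r"Current\s+usage:\s+" + _NUM + r"\s+of\s+" + _NUM
    + r"\s+physical\s+memory\s+used;\s+" + _NUM + r"\s+of\s+" + _NUM
    + r"\s+virtual\s+memory\s+used"
)

UNIT_TO_MB = {"KB": 1 / 1024, "MB": 1, "GB": 1024, "TB": 1024 * 1024}


def parse_container_usage(diagnostics):
    """Return used/limit figures in MB, or None if the message is absent."""
    match = USAGE_RE.search(diagnostics)
    if match is None:
        return None
    groups = match.groups()
    names = ("pmem_used_mb", "pmem_limit_mb", "vmem_used_mb", "vmem_limit_mb")
    # Groups come in (value, unit) pairs, in the order listed in `names`.
    return {
        name: float(groups[2 * i]) * UNIT_TO_MB[groups[2 * i + 1]]
        for i, name in enumerate(names)
    }


if __name__ == "__main__":
    sample = (
        "is running beyond physical memory limits. Current usage: 2.0 GB of "
        "2 GB physical memory used; 3.7 GB of 4.2 GB virtual memory used. "
        "Killing container."
    )
    print(parse_container_usage(sample))
```

The figures in the log are rounded for display, so a reported usage equal to the limit (2.0 GB of 2 GB) should be read as "at or just over the limit" rather than as an exact value.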

Cause

 

The 'OOP Container Manager' (also known as 'DTM Process Manager') service component of the Blaze Engine ran out of memory, and YARN killed its container for exceeding the physical memory limit. Because the 'OOP Container Manager' service component shut down abruptly, the Blaze Engine shut down and the running jobs failed. For more information on the Blaze Engine architecture, refer to the following document:


https://docs.informatica.com/big-data-management/data-engineering-integration/10-4-0/administrator-guide/introduction-to-data-engineering-administration/hadoop-integration/run-time-process-on-the-blaze-engine.html

 

By default, the 'OOP Container Manager' component runs in a 1 GB YARN container. The 'Out of Memory' issue occurs when many concurrent DTM processes are created during mapping execution.


Solution


Perform the following steps to re-configure the memory for 'OOP Container Manager' service component of Blaze Engine, depending on the Informatica DEI version:

 

Informatica 10.2.1 and later versions:

 

  1. Log in to the Informatica Administrator console or launch the Informatica Developer client.
  2. In the Administrator console, navigate to the 'Connections' tab. In the Developer client, navigate to 'Windows > Preferences > Connections > [Domain] > Cluster'.
  3. Select the 'Hadoop Pushdown' connection used for running the jobs.
  4. Navigate to the 'Blaze Engine' section.
  5. Edit the 'Advanced Properties' attribute in the section.
  6. Using the 'New' button, add the following property under 'Advanced Properties' to increase the memory for the 'OOP Container Manager' service component:

  

Name: infagrid.container.mgr.memory
Value: <new_memory_value_in_MB>

For instance,

Name: infagrid.container.mgr.memory
Value: 3072


[Screenshot: infa_1021_hadoop_connection_blaze_properties.jpg]

  7. Save the changes made to the Hadoop connection.

 

Informatica versions earlier than 10.2.1:

 

  1. Log in to the Informatica server machine.
  2. Navigate to the '$INFA_HOME/services/shared/hadoop/[distribution]/infaConf' location.
  3. Update the 'hadoopEnv.properties' file by adding the following property to increase the memory for the 'OOP Container Manager' service component:

  

infagrid.container.mgr.memory=<new_memory_value_in_MB>

 

For instance,

 

infagrid.container.mgr.memory=3072

 

  4. Save the changes made to the file.

 

Note:

    • In a multi-node setup, ensure that the property is added to the 'hadoopEnv.properties' file on all nodes, in particular on the node where the DIS used for mapping execution is running.
    • The property is picked up automatically during mapping execution. Recycling the DIS is not required for the change to take effect.
    • The 'infagrid.container.mgr.memory' property can be added after the existing 'infagrid.def.max.memory' property in the 'hadoopEnv.properties' file.
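
On a multi-node setup, the same edit to 'hadoopEnv.properties' has to be repeated on every node, so it can help to script it. The following is a minimal Python sketch, not an Informatica utility. The helper name 'set_property' is hypothetical, and the sketch assumes simple 'key=value' lines without line continuations. It updates the property if it already exists, otherwise inserts it after 'infagrid.def.max.memory', and appends it at the end if that anchor property is not found.

```python
def set_property(lines, key, value, after_key=None):
    """Return a new list of property lines with `key` set to `value`.

    `lines` are the lines of a properties file without trailing newlines.
    Commented-out entries (for example "#key=...") are left untouched
    because their name does not match `key` exactly.
    """
    new_line = f"{key}={value}"

    def name_of(line):
        # Text before the first "=" is treated as the property name.
        return line.split("=", 1)[0].strip()

    # Case 1: the property already exists, so replace its value in place.
    if any(name_of(line) == key for line in lines):
        return [new_line if name_of(line) == key else line for line in lines]

    # Case 2: insert right after the anchor property, if it is present.
    if after_key is not None:
        for index, line in enumerate(lines):
            if name_of(line) == after_key:
                return lines[: index + 1] + [new_line] + lines[index + 1 :]

    # Case 3: no anchor found, so append at the end of the file.
    return list(lines) + [new_line]


if __name__ == "__main__":
    # Example content only; the 4096 value and the second key are made up.
    existing = ["infagrid.def.max.memory=4096", "some.other.property=1"]
    updated = set_property(
        existing,
        "infagrid.container.mgr.memory",
        3072,
        after_key="infagrid.def.max.memory",
    )
    print("\n".join(updated))
```

To apply it to a real file, read the file with 'splitlines()', pass the result to 'set_property', and write the returned lines back joined with newlines. Taking a backup copy of 'hadoopEnv.properties' first is advisable.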

 


After making the configuration changes described above, perform the following steps:

  • Stop the Blaze Engine if it is running. For information on stopping the Blaze Engine, refer to the following KB article:

https://kb.informatica.com/howto/6/Pages/20/521162.aspx

  • Re-run the mapping in 'Blaze' mode. It should complete successfully.


More Information

The 'YARN' service kills the container of any application running in the Hadoop cluster (including the Blaze Grid Manager, which runs as a YARN application) when the container exceeds its memory limit. The following 'YARN' properties control this behavior:
 

yarn.nodemanager.pmem-check-enabled

yarn.nodemanager.vmem-check-enabled

yarn.nodemanager.vmem-pmem-ratio
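
How these three properties interact can be illustrated with the numbers from the logs in this article. The following Python sketch is a simplified model, not the actual NodeManager implementation. The function names are hypothetical, and the default ratio of 2.1 is taken from the Apache Hadoop 'yarn-default.xml' defaults, which a cluster may override. With a 2 GB physical limit, a ratio of 2.1 gives the 4.2 GB virtual limit that appears in the container diagnostics.

```python
def virtual_limit_mb(physical_limit_mb, vmem_pmem_ratio=2.1):
    """Virtual memory limit derived from the container's physical limit."""
    return physical_limit_mb * vmem_pmem_ratio


def exceeded_limits(
    pmem_used_mb,
    vmem_used_mb,
    physical_limit_mb,
    vmem_pmem_ratio=2.1,
    pmem_check_enabled=True,
    vmem_check_enabled=True,
):
    """Return which limits a container has exceeded (simplified model).

    pmem_check_enabled / vmem_check_enabled mirror
    yarn.nodemanager.pmem-check-enabled and
    yarn.nodemanager.vmem-check-enabled. A disabled check never reports.
    """
    reasons = []
    if pmem_check_enabled and pmem_used_mb > physical_limit_mb:
        reasons.append("physical")
    limit = virtual_limit_mb(physical_limit_mb, vmem_pmem_ratio)
    if vmem_check_enabled and vmem_used_mb > limit:
        reasons.append("virtual")
    return reasons


if __name__ == "__main__":
    # 2 GB container: virtual limit is 2048 MB * 2.1, about 4.2 GB.
    print(virtual_limit_mb(2048) / 1024)
    # Usage slightly over 2 GB physical and about 3.7 GB virtual (made-up
    # exact values; the log only shows rounded figures).
    print(exceeded_limits(2050, 3789, 2048))
```

In this model, the container from the log is reported for the physical limit only, which matches the "running beyond physical memory limits" diagnostic.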

 

For more information on these properties, refer to the Apache Hadoop YARN documentation.

application_1520440847746_0012__Blaze_Grid_Manager_Log (Earlier Application)

 

2018-03-08 11:34:27.352 <AMRM Callback Handler Thread> INFO: Container completion status: id [container_e40_1520440847746_0012_01_000008]; state [COMPLETE]; 

diagnostics [Container [pid=10802,containerID=container_e40_1520440847746_0012_01_000008] is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory 

used; 3.7 GB of 4.2 GB virtual memory used. Killing container.

 

Dump of the process-tree for container_e40_1520440847746_0012_01_000008 :

|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE

|- 10802 10799 10802 10802 (bash) 0 1 116170752 429 /bin/bash -c /data08/yarn/nm/usercache/informatica/appcache/application_1520440847746_0012/

container_e40_1520440847746_0012_01_000008/infa_rpm.tar/services/shared/hadoop/cloudera_cdh5u10/scripts/infagjlauncher.sh 

-N OOP_Container_Manager_Service_workernode03.informatica.com -E

Container killed on request. Exit code is 143

Container exited with a non-zero exit code 143

]; exit status [-104]..

2018-03-08 11:34:27.355 <AMRM Callback Handler Thread> SEVERE: Service [OOP_Container_Manager_Service_3] has stopped running..

2018-03-08 11:34:29.856 <AMRM Callback Handler Thread> INFO: Service [DTMProcess_195] has stopped running..

2018-03-08 11:34:29.856 <AMRM Callback Handler Thread> INFO: Service [DTMProcess_194] has stopped running..

2018-03-08 11:34:29.856 <AMRM Callback Handler Thread> INFO: Service [DTMProcess_196] has stopped running..

2018-03-08 11:34:29.856 <AMRM Callback Handler Thread> INFO: Service [DTMProcess_192] has stopped running..

2018-03-08 11:34:29.856 <AMRM Callback Handler Thread> INFO: Service [DTMProcess_193] has stopped running..

2018-03-08 11:34:42.374 <AMRM Callback Handler Thread> INFO: Container completion status: id [container_e40_1520440847746_0012_01_000012]; 

state [COMPLETE]; diagnostics []; exit status [0]..

2018-03-08 11:34:42.374 <AMRM Callback Handler Thread> SEVERE: Service [Data_Exchange_Framework_Service_3] has stopped running..

2018-03-08 11:34:42.425 <Thread-454> INFO: Connected to service [Orchestrator_Service_1], host [workernode04.informatica.com], port [12309]..

2018-03-08 11:34:43.310 <DTFPool-2-thread-6> INFO: Received notifyEvent for service [Orchestrator_Service_1]

2018-03-08 11:34:43.310 <DTFPool-2-thread-6> INFO: Received Event

2018-03-08 11:34:43.310 <DTFPool-2-thread-6> INFO: Received Event for service [Orchestrator_Service_1]

2018-03-08 11:34:43.311 <DTFPool-2-thread-6> INFO: Reducing pending service count to [0]

2018-03-08 11:34:43.311 <DTFPool-2-thread-6> INFO: [JSF_0010] The data channel used by this callback handler has been closed.  

No new events will be received on this handler.

 

application_1520440847746_0115__Blaze_Grid_Manager_Log (Latest Application)

 

2018-03-12 09:37:26.467 <DTFPool-1-thread-10> INFO: Service [DTMProcess_58] started successfully on host [workernode05.informatica.com], service port [48432]..

2018-03-12 09:37:50.417 <AMRM Callback Handler Thread> INFO: Container completion status: id [container_e40_1520440847746_0115_01_000006]; state [COMPLETE]; 

diagnostics [Container [pid=13651,containerID=container_e40_1520440847746_0115_01_000006] 

is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 3.7 GB of 4.2 GB virtual memory used. Killing container.

Dump of the process-tree for container_e40_1520440847746_0115_01_000006 :

|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE

|- 13653 13651 13651 13651 (java) 37506 5740 3909824512 524067 java -Xmx1728m -XX:MaxMetaspaceSize=100m -XX:+UseConcMarkSweepGC 

-XX:+UseParNewGC -XX:ParallelGCThreads=4 -XX:CICompilerCount=4 -server -XX:ErrorFile=/tmp/infa_blaze_jvm_err_%p.log

-Djava.library.path=/data07/yarn/nm/usercache/informatica/appcache/application_1520440847746_0115/

container_e40_1520440847746_0115_01_000006/infa_rpm.tar/services/shared/bin:/u01/app/oracle/product/12.1.0/client_1/lib:

/data07/yarn/nm/usercache/informatica/appcache/application_1520440847746_0115/container_e40_1520440847746_0115_01_000006/infa_rpm.tar/services/shared/bin:

/data07/yarn/nm/usercache/informatica/appcache/application_1520440847746_0115/container_e40_1520440847746_0115_01_000006/infa_rpm.tar/DataTransformation/bin:

/data07/yarn/nm/usercache/informatica/appcache/application_1520440847746_0115/container_e40_1520440847746_0115_01_000006/

infa_rpm.tar/services/shared/hadoop/cloudera_cdh5u10/lib/native:

/data07/yarn/nm/usercache/informatica/appcache/application_1520440847746_0115/container_e40_1520440847746_0115_01_000006/infa_rpm.tar/ODBC7.1/lib:

/data07/yarn/nm/usercache/informatica/appcache/application_1520440847746_0115/container_e40_1520440847746_0115_01_000006/infa_rpm.tar/jre/lib/amd64:

/data07/yarn/nm/usercache/informatica/appcache/application_1520440847746_0115/container_e40_1520440847746_0115_01_000006/infa_rpm.tar/jre/lib/amd64/server:

/data07/yarn/nm/usercache/informatica/appcache/application_1520440847746_0115/container_e40_1520440847746_0115_01_000006/infa_rpm.tar/java/jre/lib/amd64:

/data07/yarn/nm/usercache/informatica/appcache/application_1520440847746_0115/container_e40_1520440847746_0115_01_000006/infa_rpm.tar/java/jre/lib

/amd64/server:/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop/lib/native:

-Djava.util.logging.config.file=/data07/yarn/nm/usercache/informatica/appcache/application_1520440847746_0115/container_e40_1520440847746_0115_01_000006

/infa_rpm.tar/services/shared/bin/../hadoop/logging.properties 

-Djava.security.egd=file:/dev/./urandom com.informatica.dtm.executor.grid.svcfw.launcher.GridProcessLauncher -N OOP_Container_Manager_Service_workernode02.informatica.com
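
The process-tree dump reports resident memory in pages (RSSMEM_USAGE(PAGES)), which makes it hard to compare with the container limit at a glance. The following Python sketch converts the page counts from the dumps above into bytes. The 4096-byte page size is an assumption (it is the usual value on Linux x86-64 and can be confirmed with 'getconf PAGESIZE' on the worker node), and the function names are hypothetical.

```python
PAGE_SIZE_BYTES = 4096  # assumption: typical Linux x86-64 page size


def rss_bytes(rss_pages, page_size=PAGE_SIZE_BYTES):
    """Convert an RSSMEM_USAGE(PAGES) value into bytes."""
    return rss_pages * page_size


def to_gib(num_bytes):
    """Convert bytes into GiB (1 GiB = 2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    # Java process of the OOP Container Manager in the latest application.
    java_rss = rss_bytes(524067)
    print(java_rss, round(to_gib(java_rss), 2))

    # Heap ceiling from -Xmx1728m compared with the 2 GB (2048 MB) container.
    container_mb = 2048
    heap_mb = 1728
    print(container_mb - heap_mb)
```

Under the page-size assumption, the Java process alone holds 2146578432 bytes, which is about 2.0 GiB and therefore at the 2 GB container limit. The '-Xmx1728m' heap setting leaves only 320 MB of the container for everything outside the Java heap, which is consistent with the container being killed once the component is under load.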

Applies To
Product: Data Engineering Integration (Big Data Management); Data Engineering Quality (Big Data Quality); Enterprise Data Preparation; Enterprise Data Catalog
Problem Type: Configuration; Stability; Sizing; Crash/Hang; Performance
User Type: Administrator; Developer; Data Analyst
Project Phase: Configure; Implement; Optimize
Product Version: Informatica 10.1; Informatica 10.1.1; HotFix; Informatica 10.2; Informatica 10.2.1; Informatica 10.2.1 Service Pack 1; Informatica 10.2.2; Informatica 10.4
Database:
Operating System:
Other Software:

Reference
Attachments
Last Modified Date: 4/16/2020 12:36 AM
ID: 528127