
ERROR: "java.lang.RuntimeException: Cannot find the distribution file: CDH_5.13/infa_resources.xml" while running Hadoop pushdown mappings after upgrading to Informatica 10.4.0 version
Problem Description
In Data Engineering Integration (DEI), a Hadoop pushdown mapping run in Informatica 10.4.0 after an upgrade from version 10.2.1 or 10.2.2.x fails with an error message similar to the following:

 

java.lang.RuntimeException: Cannot find the distribution file: CDH_5.13/infa_resources.xml

 

Mapping Log Trace

 

2019-12-12 00:04:52.469 <MappingCompiler-pool-4-thread-2> INFO: [LDTM_0026] LDTM: Starting mapping compilation.

2019-12-12 00:04:52.503 <MappingCompiler-pool-4-thread-2> INFO: [LDTMCMN_0037] The Hadoop distribution directory is defined in Data Integration Service properties at the path [/data/informatica/1040/services/shared/hadoop/CDH_5.13].

2019-12-12 00:04:52.503 <MappingCompiler-pool-4-thread-2> INFO: [CLUSTERCONF_10024] The cluster configuration [GCS_BDM_CDH_Multi_Node_Cluster_ths] is unchanged from the last export. Using the existing export file [/data/informatica/1040/tomcat/bin/disTemp/D_Astras/DIS_Astras_CDH/n_v07_Brahmaastra/gcs_bdm_cdh_multi_node_cluster_ths/NATIVE/ea6554bd-7007-4351-b3d9-451f4ca5c44d/infacco-site.xml].

2019-12-12 00:04:52.503 <MappingCompiler-pool-4-thread-2> INFO: [CLUSTERCONF_10028] Based on the distribution [CLOUDERA] and the run-time engine [NATIVE], the Data Integration Service will override the following cluster configuration properties at run time: null

2019-12-12 00:04:52.504 <MappingCompiler-pool-4-thread-2> INFO:  Started creating static ports for the generated ports in the transformation instance [Expression].

2019-12-12 00:04:52.505 <MappingCompiler-pool-4-thread-2> INFO:  Completed creating static ports for the generated ports in the transformation instance [Expression].

2019-12-12 00:04:52.505 <MappingCompiler-pool-4-thread-2> INFO:  Started creating links for the static ports that got created from the generated ports in the transformation instance [Expression].

….

….

2019-12-12 00:04:52.566 <MappingCompiler-pool-4-thread-2> INFO: Early uncorrelated subquery optimization is finished.

2019-12-12 00:04:52.566 <MappingCompiler-pool-4-thread-2> INFO: [OPT_1700] The Integration Service is starting the Global Predicate Optimization method.

2019-12-12 00:04:52.583 <MappingCompiler-pool-4-thread-2> WARNING: No metadata for: [Read_disk_space_usage_summary] because it threw: Cannot find the distribution file: CDH_5.13/infa_resources.xml

java.lang.RuntimeException: Cannot find the distribution file: CDH_5.13/infa_resources.xml
    at com.informatica.cluster.conf.dist.util.InfaConfLoader.<init>(InfaConfLoader.java:130)
    at com.informatica.cluster.conf.dist.util.InfaConfLoader.get(InfaConfLoader.java:82)
    at com.informatica.cluster.conf.dist.util.DistributionResUtil.getHadoopClassPath(DistributionResUtil.java:91)
    at com.informatica.platform.dtm.executor.hadoop.runtime.utils.impl.HadoopRuntimePropsUtilImpl.<init>(HadoopRuntimePropsUtilImpl.java:59)
    at com.informatica.platform.dtm.executor.hadoop.runtime.utils.HadoopRuntimePropsUtil.getHadoopRuntimeProps(HadoopRuntimePropsUtil.java:90)
    at com.informatica.platform.dtm.executor.hadoop.runtime.utils.HadoopRuntimePropsUtil.getHadoopRuntimeProps(HadoopRuntimePropsUtil.java:117)
    at com.informatica.platform.dtm.executor.hadoop.impl.IHadoopFactoryImpl.initWithExecClassLoader(IHadoopFactoryImpl.java:557)
    at com.informatica.platform.dtm.executor.hadoop.impl.IHadoopFactoryImpl.<init>(IHadoopFactoryImpl.java:134)
    at com.informatica.platform.dtm.executor.hadoop.IHadoopFactory.newHadoopFactory(IHadoopFactory.java:43)
    at com.informatica.products.datasourcepartitionhelper.impl.HDFSPartitioner.<init>(HDFSPartitioner.java:188)
    at com.informatica.products.statistics.impl.statshelper.FlatByteParseStatisticsHelper.initRuntimeConfig(FlatByteParseStatisticsHelper.java:178)
    at com.informatica.products.statistics.impl.statshelper.FlatByteParseStatisticsHelper.init(FlatByteParseStatisticsHelper.java:69)
    at com.informatica.products.statistics.impl.utils.StatisticsHelperFactory.getDirectStatisticsHelper(StatisticsHelperFactory.java:118)
    at com.informatica.products.statistics.impl.statshelper.ExternalStatisticsHelper.init(ExternalStatisticsHelper.java:53)
    at com.informatica.products.statistics.impl.utils.StatisticsHelperFactory.getExternalStatisticsHelper(StatisticsHelperFactory.java:163)
    at com.informatica.products.statistics.impl.statshelper.PersistedStatisticsHelper.init(PersistedStatisticsHelper.java:67)
    at com.informatica.products.statistics.impl.utils.StatisticsHelperFactory.getPersistedStatisticsHelper(StatisticsHelperFactory.java:149)
    at com.informatica.products.statistics.impl.statshelper.CachedStatisticsHelper.init(CachedStatisticsHelper.java:52)
    at com.informatica.products.statistics.impl.utils.StatisticsHelperFactory.getCachedStatisticsHelper(StatisticsHelperFactory.java:131)
    at com.informatica.products.statistics.impl.utils.StatisticsHelperFactory.getStatisticsHelper(StatisticsHelperFactory.java:180)
    at com.informatica.platform.ldtm.optimizer.utils.datasettracking.OptHints.absorbStatistics(OptHints.java:416)
    at com.informatica.platform.ldtm.optimizer.utils.datasettracking.DatasetTrackingUtil.absorbStatistics(DatasetTrackingUtil.java:200)
    at com.informatica.platform.ldtm.optimizer.utils.datasettracking.DatasetTrackingUtil.absorbHints(DatasetTrackingUtil.java:113)
    at com.informatica.platform.ldtm.optimizer.utils.datasettracking.DatasetTrackingUtil.<init>(DatasetTrackingUtil.java:75)
    at com.informatica.platform.ldtm.util.metadata.FallBackMetadata.construct(FallBackMetadata.java:193)
    at com.informatica.platform.ldtm.util.metadata.FallBackMetadata.make(FallBackMetadata.java:159)
    at com.informatica.platform.ldtm.util.metadata.MetadataCalculator$1.actualWork(MetadataCalculator.java:54)
    at com.informatica.platform.ldtm.util.metadata.MetadataCalculator$1.actualWork(MetadataCalculator.java:1)
    at com.informatica.platform.ldtm.lime.component.LimeAnnotation$WeakCache.get(LimeAnnotation.java:30)
    at com.informatica.platform.ldtm.util.metadata.MetadataCalculator.metadata(MetadataCalculator.java:62)
    at com.informatica.platform.ldtm.util.metadata.MetadataPropagator$1.txOp(MetadataPropagator.java:48)
    at com.informatica.platform.ldtm.lime.traversals.TraversalOp$Base.run(TraversalOp.java:82)
    at com.informatica.platform.ldtm.util.metadata.MetadataPropagator.<init>(MetadataPropagator.java:55)
    at com.informatica.platform.ldtm.util.requirements.RequirementPropagator.<init>(RequirementPropagator.java:32)
    at com.informatica.platform.ldtm.optimizer.simplifier.Simplifier$Traversal.<init>(Simplifier.java:44)
    at com.informatica.platform.ldtm.optimizer.simplifier.Simplifier.run(Simplifier.java:29)
    at com.informatica.platform.ldtm.optimizer.Optimizer.optimize(Optimizer.java:339)
    at com.informatica.platform.ldtm.impl.TransformationMachineImpl.engineSpecificInit(TransformationMachineImpl.java:1554)
    at com.informatica.platform.ldtm.impl.TransformationMachineImpl.<init>(TransformationMachineImpl.java:863)
    at com.informatica.platform.ldtm.TransformationMachineFactoryImpl.createInstance(TransformationMachineFactoryImpl.java:506)
    at com.informatica.ds.server.impl.TransformationMachineDISImpl.internalSubmitOperation(TransformationMachineDISImpl.java:776)
    at com.informatica.ds.server.impl.TransformationMachineDISImpl.internalSubmitOperation(TransformationMachineDISImpl.java:576)
    at com.informatica.ds.server.impl.TransformationMachineDISImpl.submitOperation(TransformationMachineDISImpl.java:427)
    at com.informatica.ds.ms.service.pipeline.MappingService4Impl.startJobWithLdtm(MappingService4Impl.java:2345)
    at com.informatica.ds.ms.service.pipeline.MappingService4Impl.startDequeuedJob(MappingService4Impl.java:1731)
    at com.informatica.ds.ms.service.pipeline.MappingService4Impl.lambda$11(MappingService4Impl.java:1579)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

2019-12-12 00:04:52.591 <MappingCompiler-pool-4-thread-2> INFO: [OPT_1701] The Integration Service has finished applying the Global Predicate Optimization method.

2019-12-12 00:04:52.591 <MappingCompiler-pool-4-thread-2> INFO: [OPT_0500] Predicate optimization is starting.

2019-12-12 00:04:52.591 <MappingCompiler-pool-4-thread-2> INFO: [OPT_0501] Predicate optimization is finished.

Cause
Starting with Informatica 10.2.1, the Data Integration Service Hadoop distribution directory is determined by the 'Distribution Type' and 'Distribution Version' defined in the cluster configuration object (CCO). For more information, refer to KB 533148.

 

This issue occurs when the 'Distribution Type' and 'Distribution Version' in the CCO do not resolve to one of the existing Hadoop distribution folders on the Informatica server machine.
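To confirm this is the cause, you can compare the folder named in the error against the distribution folders actually present on the server. A minimal diagnostic sketch, assuming the default 10.4.0 layout shown in the mapping log above; adjust INFA_HOME to your installation root:

```shell
# Base directory that holds the Hadoop distribution folders
# (path taken from the mapping log; override INFA_HOME as needed).
INFA_HOME=${INFA_HOME:-/data/informatica/1040}
HADOOP_DIST_BASE="$INFA_HOME/services/shared/hadoop"

# List the distribution folders the Data Integration Service can resolve to.
ls -1 "$HADOOP_DIST_BASE" 2>/dev/null

# Check whether the folder named in the error exists.
CCO_DIST="CDH_5.13"   # folder the CCO resolves to, taken from the error message
if [ -d "$HADOOP_DIST_BASE/$CCO_DIST" ]; then
    echo "found: $HADOOP_DIST_BASE/$CCO_DIST"
else
    echo "missing: $HADOOP_DIST_BASE/$CCO_DIST - update the CCO Distribution Version"
fi
```

If the folder is reported missing while other versioned folders (for example, CDH_5.15) are listed, the CCO values need to be updated as described in the Solution below.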

Solution
To resolve the issue, perform the following steps:

1. Find the Cluster Configuration Object (CCO) associated with the Hadoop connection used by the failing mapping.

 

[Screenshot: Hadoop connection showing the associated CCO name (infa_bdm_hadoop_connection_cco_name.png)]

2. Update the 'Distribution Version' attribute in the CCO to the actual Hadoop distribution version, or to the closest available version among the distribution folders present on the Informatica server machine.


For instance, if the Cloudera 'Distribution Version' in the CCO was '5.13' before the upgrade to 10.4.0, update the value to '5.15' so that the Hadoop distribution folder named 'CDH_5.15' is chosen during mapping execution.


[Screenshot: CCO 'Distribution Version' attribute (infa_bdm_cco_distribution_version.png)]

[Screenshot: Hadoop distribution folders in the 10.4.0 installation (infa_bdm_1040_hadoop_distribution_folders.png)]

[Screenshot: updating the CCO 'Distribution Version' (infa_bdm_update_cco_distribution_version1.png)]




3. Once the 'Distribution Version' in the CCO has been updated, save the changes.
4. Recycle the 'Data Integration Service' (DIS) used for the mapping execution.
5. On the Analyst Service, go to the Processes tab and edit the JVM command-line option that still points to the old distribution folder, for example:
 -DINFA_HADOOP_DIST_DIR=../../services/shared/hadoop/CDH_5.13
Ensure that INFA_HADOOP_DIST_DIR points to a valid path that corresponds to your distribution.
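For example, if the CCO was updated to Cloudera 5.15 as in step 2, the corrected option would read as follows (illustrative value; the relative path follows the default layout shown in the original option):

```
-DINFA_HADOOP_DIST_DIR=../../services/shared/hadoop/CDH_5.15
```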

Once the DIS is recycled, the mapping should run without the error.
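Before rerunning the mapping, a quick sanity check can confirm that the folder the CCO now resolves to contains the file named in the original error. A sketch, assuming the default 10.4.0 layout; CDH_5.15 and the INFA_HOME path are example values:

```shell
# Post-fix sanity check: the distribution folder the CCO now resolves to
# should contain infa_resources.xml (the file named in the original error).
INFA_HOME=${INFA_HOME:-/data/informatica/1040}
NEW_DIST="CDH_5.15"   # example: the version the CCO was updated to
RES_FILE="$INFA_HOME/services/shared/hadoop/$NEW_DIST/infa_resources.xml"

if [ -f "$RES_FILE" ]; then
    echo "OK: $RES_FILE is present"
else
    echo "NOT FOUND: $RES_FILE - re-check the CCO Distribution Version"
fi
```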

More Information
Applies To
Product: Data Engineering Integration (Big Data Management); Data Engineering Quality (Big Data Quality); Enterprise Data Preparation; Enterprise Data Catalog
Problem Type: Configuration; Connectivity
User Type: Administrator; Developer
Project Phase: Implement; Configure
Product Version: Informatica 10.2.1 Service Pack 1; Informatica 10.2.2; HotFix; Informatica 10.4
Last Modified Date: 3/30/2020 10:13 PM
ID: 612217