FAQ: Why does the 'disTemp' location occupy so much space in the Informatica BDM Domain installation?
Answer

Starting with Informatica BDM 10.2.0, the '1-click auto install' feature is available. It automatically transfers and sets up the Informatica BDM packages on the Hadoop cluster data node(s) as part of the first Hadoop pushdown job run from Informatica BDM.

 

To implement the '1-click auto install' feature, an Informatica BDM package is created as a 'tar' file on the Informatica server machine from the binary files in the Informatica Domain installation. The 'tar' file is then transferred to the HDFS of the Hadoop cluster during pushdown mapping execution.

 

The 'Informatica BDM tar package' created on the Informatica server machine is stored in the 'disTemp' location. Each package is between 2.4 GB and 3 GB in size. The number of 'tar' package files created depends on the following factors:

 

  • File system changes in the Informatica Domain installation, for example: EBF/Service Pack installation in the Domain, copying a new jar file into the '$INFA_HOME/externaljdbcjars' location, importing a new SSL certificate into the Informatica truststore ('infa_truststore.jks'), changes to the Kerberos configuration file ('krb5.conf'), adding a new ODBC DSN to the 'odbc.ini' file, and so on.
  • Number of 'Data Integration Service' (DIS) instances used for running the pushdown jobs.
  • Number of Hadoop clusters connected from the Informatica Domain.
  • Number of 'Hadoop Staging User' accounts used in the DIS.

 

To find the directories occupying the most disk space, run the following command from the 'disTemp' location (default location: '$INFA_HOME/tomcat/bin/disTemp'):

 

find . -maxdepth 1 -type d -exec du -sm {} \; | awk '$1 > 444'

Note: The above command reports sizes in MB and lists every folder occupying more than 444 MB.


01_KB_599015_disTemp_Location.png
 

Once the directories consuming the most space are identified, they can be removed after disabling the running DIS.
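
Before removing anything, it can help to review each large folder together with its last modification date. The following is a minimal sketch, not an Informatica utility: `list_large_folders` is a hypothetical helper name, the 444 MB default simply mirrors the threshold mentioned above, and the assumption that the modification date is a useful signal for deciding what to remove is ours. It relies on GNU `du -sm` and GNU `date -r`, prints only, and deletes nothing.

```shell
# Hypothetical helper (not part of Informatica): list the top-level folders
# under <dir> that are at least <min_mb> MB, largest first.
# Output columns (tab-separated): size in MB, last-modified date, path.
# Hidden folders and symbolic links are skipped. Nothing is deleted.
list_large_folders() (
  dir=$1
  min_mb=${2:-444}
  for path in "$dir"/*/; do
    [ -d "$path" ] || continue          # skip the unexpanded glob when empty
    path=${path%/}
    [ -L "$path" ] && continue          # ignore symbolic links
    size_mb=$(du -sm "$path" 2>/dev/null | awk '{print $1}')
    [ "${size_mb:-0}" -ge "$min_mb" ] || continue
    mtime=$(date -r "$path" '+%Y-%m-%d' 2>/dev/null)   # GNU date: file mtime
    printf '%s\t%s\t%s\n' "$size_mb" "$mtime" "$path"
  done | sort -rn
)
```

For example, `list_large_folders "$INFA_HOME/tomcat/bin/disTemp"` lists the folders at or above the default threshold; pass a second argument to use a different size in MB.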


More Information
   

For more detail on the disk usage of the different folder types in the 'disTemp' location, run the commands below from the 'disTemp' location in a 'bash' shell, as required. Each entry lists the disk usage report type, the folder type, a description of the command, and the command to be run.


Disk Usage Report Type: Summary
Folder Type: Spark Session Configuration & Compiler folders, INFA_RPM folders, Non-INFA_BDM folders
Command Description: Total folder count for Spark Session Config & Compiler folders, INFA_RPM folders, and Non-INFA_BDM folders

##Summary -- Total Spark Session Configuration Folders

find $PWD -maxdepth 1 -type d | egrep '(hadoop)[0-9]+' | wc -l

##Summary -- Total Spark Session Compiler Folders

find $PWD -maxdepth 1 -type d | egrep '(sess[0-9]+)|(scalacache)' | wc -l

##Summary -- Total INFA_RPM Folders

find $PWD -maxdepth 1 -type d | egrep '(infa_)+' | wc -l

##Summary -- Total Non-BDM Session Folders

find $PWD -maxdepth 1 -type d | grep $PWD/ | egrep -v '(infa_)+' | egrep -v '(sess[0-9]+)|(scalacache)' | egrep -v '(hadoop[0-9]+)' | wc -l
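
The four counts above can also be gathered in a single pass. The sketch below is an illustration only: `count_folder_types` is a hypothetical helper name, and it matches the same patterns against the folder name rather than the full path. Unlike the separate commands, it assigns each folder to exactly one type (first match wins, in the order shown), and it skips hidden folders and symbolic links.

```shell
# Hypothetical helper (not part of Informatica): count the top-level folders
# under a directory by type, in one pass. Defaults to the current directory.
count_folder_types() (
  dir=${1:-.}
  for path in "$dir"/*/; do
    [ -d "$path" ] || continue          # skip the unexpanded glob when empty
    [ -L "${path%/}" ] && continue      # ignore symbolic links
    basename "$path"
  done | awk '
    /hadoop[0-9]+/          { config++;   next }
    /sess[0-9]+|scalacache/ { compiler++; next }
    /infa_/                 { rpm++;      next }
                            { other++ }
    END {
      printf "Spark session configuration folders: %d\n", config
      printf "Spark session compiler folders: %d\n", compiler
      printf "INFA_RPM folders: %d\n", rpm
      printf "Non-BDM folders: %d\n", other
    }'
)
```

Run it as `count_folder_types` from the 'disTemp' location, or pass the directory as an argument.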

 

 

Disk Usage Report Type: Summary
Folder Type: Spark Session Configuration Folders
Command Description: Spark Session Configuration Folders Disk Usage - Summary Information Report

 

 

##Summary -- Spark Session Configuration Folders Disk Usage

export set tmp_session_file="spark_session_config_folders_list.txt";  find $PWD -maxdepth 1 -type d | grep $PWD/ |  egrep '(hadoop[0-9]+)' > ${tmp_session_file};  export set total_folder_size=0; echo " "$'\n';while read current_dir; do current_folder_size=`du -s  $current_dir | awk '{print $1}'` ; if [ "x$current_folder_size" == "x" ] ; then current_folder_size=0; else total_folder_size=`expr $total_folder_size + $current_folder_size`; fi; done<${tmp_session_file};  rm -f ${tmp_session_file}; echo $'\n'"Total Spark Session Config Folders Size: $total_folder_size KB (`echo "scale=2;$total_folder_size/1024" | bc -q -l` MB) (`echo "scale=2;$total_folder_size/1024/1024" | bc -q -l` GB)"$'\n'

 

 

Disk Usage Report Type: Summary
Folder Type: Spark Session Compiler Folders
Command Description: Spark Session Compiler Folders Disk Usage - Summary Information Report

 

 

 ##Summary -- Spark Session Compiler Folders Disk Usage

 export set tmp_session_file="spark_session_compiler_folders_list.txt";  find $PWD -maxdepth 1 -type d | grep $PWD/  |  egrep '(sess[0-9]+)|(scalacache)' > ${tmp_session_file};  export set total_folder_size=0; echo " "$'\n';while read current_dir; do current_folder_size=`du -s  $current_dir | awk '{print $1}'` ; if [ "x$current_folder_size" == "x" ] ; then current_folder_size=0; else total_folder_size=`expr $total_folder_size + $current_folder_size`; fi; done<${tmp_session_file};  rm -f ${tmp_session_file}; echo $'\n'"Total Spark Session Compiler Folders Size: $total_folder_size KB (`echo "scale=2;$total_folder_size/1024" | bc -q -l` MB) (`echo "scale=2;$total_folder_size/1024/1024" | bc -q -l` GB)"$'\n'

 

 

Disk Usage Report Type: Summary
Folder Type: INFA_RPM Folders
Command Description: All INFA_RPM Folders Disk Usage - Summary Information Report

 

 

 ##Summary -- All INFA_RPM Folders Disk Usage

 export set tmp_session_file="all_non_session_folders_list.txt";  find $PWD -maxdepth 1 -type d | grep $PWD/ |  egrep '(infa_)+' > ${tmp_session_file};  export set total_folder_size=0; echo " "$'\n';while read current_dir; do current_folder_size=`du -s  $current_dir | awk '{print $1}'` ; if [ "x$current_folder_size" == "x" ] ; then current_folder_size=0; else total_folder_size=`expr $total_folder_size + $current_folder_size`; fi; done<${tmp_session_file};  rm -f ${tmp_session_file}; echo $'\n'"Total All INFA RPM Package Folders Size: $total_folder_size KB (`echo "scale=2;$total_folder_size/1024" | bc -q -l` MB) (`echo "scale=2;$total_folder_size/1024/1024" | bc -q -l` GB)"$'\n'

 

 

Disk Usage Report Type: Summary
Folder Type: Non-BDM Folders
Command Description: All Non-BDM Folders Disk Usage - Summary Information Report

 

   ##Summary -- All Non-BDM Session Folders Disk Usage

 export set tmp_session_file="all_non_bdm_session_folders_list.txt";  find $PWD -maxdepth 1 -type d | grep $PWD/ |  egrep -v '(infa_)+' |  egrep -v '(sess[0-9]+)|(scalacache)'| egrep -v '(hadoop[0-9]+)' > ${tmp_session_file};  export set total_folder_size=0; echo " "$'\n';while read current_dir; do current_folder_size=`du -s  $current_dir | awk '{print $1}'` ; if [ "x$current_folder_size" == "x" ] ; then current_folder_size=0; else total_folder_size=`expr $total_folder_size + $current_folder_size`; fi; done<${tmp_session_file};  rm -f ${tmp_session_file}; echo $'\n'"Total All Non-BDM Session Folders Size: $total_folder_size KB (`echo "scale=2;$total_folder_size/1024" | bc -q -l` MB) (`echo "scale=2;$total_folder_size/1024/1024" | bc -q -l` GB)"$'\n'

 

 

Disk Usage Report Type: Detailed
Folder Type: Spark Session Configuration Folders
Command Description: Spark Session Configuration Folders Disk Usage - Detailed Information Report

 

##Verbose -- Spark Session Configuration Folders Disk Usage

export set tmp_session_file="spark_session_config_folders_list.txt"; find $PWD -maxdepth 1 -type d | grep $PWD/ |  egrep '(hadoop[0-9]+)' > ${tmp_session_file} ;  export set total_folder_size=0; echo " "$'\n';while read current_dir; do current_folder_size=`du -s  $current_dir | awk '{print $1}'` ; if [ "x$current_folder_size" == "x" ] ; then current_folder_size=0; else total_folder_size=`expr $total_folder_size + $current_folder_size`; echo "       Folder: $current_dir, Size: $current_folder_size KB (`echo "scale=2;$current_folder_size/1024" | bc -q -l` MB)"$'\n'; fi; done<${tmp_session_file};  rm -f ${tmp_session_file}; echo $'\n'"Total Spark Session Config Folders Size: $total_folder_size KB (`echo "scale=2;$total_folder_size/1024" | bc -q -l` MB) (`echo "scale=2;$total_folder_size/1024/1024" | bc -q -l` GB)"$'\n'

 

 

Disk Usage Report Type: Detailed
Folder Type: Spark Session Compiler Folders
Command Description: Spark Session Compiler Folders Disk Usage - Detailed Information Report

 

##Verbose -- Spark Session Compiler Folders Disk Usage

  export set tmp_session_file="spark_session_compiler_folders_list.txt"; find $PWD -maxdepth 1 -type d | grep $PWD/ |  egrep '(sess[0-9]+)|(scalacache)' > ${tmp_session_file} ;  export set total_folder_size=0; echo " "$'\n';while read current_dir; do current_folder_size=`du -s  $current_dir | awk '{print $1}'` ; if [ "x$current_folder_size" == "x" ] ; then current_folder_size=0; else total_folder_size=`expr $total_folder_size + $current_folder_size`; echo "       Folder: $current_dir, Size: $current_folder_size KB (`echo "scale=2;$current_folder_size/1024" | bc -q -l` MB)"$'\n'; fi; done<${tmp_session_file};  rm -f ${tmp_session_file}; echo $'\n'"Total Spark Session Compiler Folders Size: $total_folder_size KB (`echo "scale=2;$total_folder_size/1024" | bc -q -l` MB) (`echo "scale=2;$total_folder_size/1024/1024" | bc -q -l` GB)"$'\n'

 

 

Disk Usage Report Type: Detailed
Folder Type: INFA_RPM Folders
Command Description: All INFA_RPM Folders Disk Usage - Detailed Information Report

 

##Verbose -- All INFA_RPM Folders Disk Usage

export set tmp_session_file="all_non_session_folders_list.txt"; find $PWD -maxdepth 1 -type d | grep $PWD/  |  egrep '(infa_)+' > ${tmp_session_file} ;  export set total_folder_size=0; echo " "$'\n'; while read current_dir; do current_folder_size=`du -s  $current_dir | awk '{print $1}'` ; if [ "x$current_folder_size" == "x" ] ; then current_folder_size=0; else total_folder_size=`expr $total_folder_size + $current_folder_size`; echo "       Folder: $current_dir, Size: $current_folder_size KB (`echo "scale=2;$current_folder_size/1024" | bc -q -l` MB) (`echo "scale=2;$current_folder_size/1024/1024" | bc -q -l` GB)"$'\n'; fi; done<${tmp_session_file};  rm -f ${tmp_session_file}; echo $'\n'"Total All INFA RPM Package Folders Size: $total_folder_size KB (`echo "scale=2;$total_folder_size/1024" | bc -q -l` MB) (`echo "scale=2;$total_folder_size/1024/1024" | bc -q -l` GB)"$'\n'

 

 

Disk Usage Report Type: Detailed
Folder Type: Non-BDM Folders
Command Description: All Non-BDM Folders Disk Usage - Detailed Information Report

 

##Verbose -- All Non-BDM Session Folders Disk Usage

export set tmp_session_file="all_non_session_folders_list.txt"; find $PWD -maxdepth 1 -type d | grep $PWD/ |  egrep -v '(infa_)+' |   egrep -v '(sess[0-9]+)|(scalacache)' | egrep -v '(hadoop[0-9]+)' > ${tmp_session_file} ;  export set total_folder_size=0; echo " "$'\n'; while read current_dir; do current_folder_size=`du -s  $current_dir | awk '{print $1}'` ; if [ "x$current_folder_size" == "x" ] ; then current_folder_size=0; else total_folder_size=`expr $total_folder_size + $current_folder_size`; echo "       Folder: $current_dir, Size: $current_folder_size KB (`echo "scale=2;$current_folder_size/1024" | bc -q -l` MB) (`echo "scale=2;$current_folder_size/1024/1024" | bc -q -l` GB)"$'\n'; fi; done<${tmp_session_file};  rm -f  ${tmp_session_file}; echo $'\n'"Total All Non-BDM Session Folders Size: $total_folder_size KB (`echo "scale=2;$total_folder_size/1024" | bc -q -l` MB) (`echo "scale=2;$total_folder_size/1024/1024" | bc -q -l` GB)"$'\n'
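
The summary and detailed commands above all follow one shape: select the top-level folders matching a pattern, measure each with 'du', and add up the sizes. As a rough sketch of that shape, the function below does the same without writing a temporary file into 'disTemp'. `dir_usage_by_pattern` is a hypothetical helper name, not an Informatica command, and it matches the pattern against the folder name rather than the full path, so it may select a slightly different set of folders than the commands above when the path to 'disTemp' itself contains a matching string.

```shell
# Hypothetical helper (not part of Informatica): total the size of the
# top-level folders under <dir> whose names match an extended regex.
# Usage: dir_usage_by_pattern <dir> <regex> [label]
# Hidden folders and symbolic links are skipped.
dir_usage_by_pattern() (
  dir=$1
  pattern=$2
  label=${3:-Matching folders}
  total_kb=0
  count=0
  for path in "$dir"/*/; do
    [ -d "$path" ] || continue          # skip the unexpanded glob when empty
    path=${path%/}
    [ -L "$path" ] && continue          # ignore symbolic links
    # Match on the folder name only, not on the full path.
    if basename "$path" | grep -Eq "$pattern"; then
      size_kb=$(du -sk "$path" 2>/dev/null | awk '{print $1}')
      total_kb=$(( total_kb + ${size_kb:-0} ))
      count=$(( count + 1 ))
    fi
  done
  awk -v l="$label" -v c="$count" -v t="$total_kb" 'BEGIN {
    printf "%s: %d folder(s), %d KB (%.2f MB) (%.2f GB)\n", l, c, t, t/1024, t/1024/1024
  }'
)
```

For example, `dir_usage_by_pattern "$PWD" 'hadoop[0-9]+' 'Spark Session Config Folders'` prints one summary line for the Spark session configuration folders; the other folder types use the patterns `'sess[0-9]+|scalacache'` and `'infa_'`.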

 




Sample Execution Screenshots

02_KB_599015_disTemp_Location.png
04_KB_599015_disTemp_Location.png
06_KB_599015_disTemp_Location.png
08_KB_599015_disTemp_Location.png
03_KB_599015_disTemp_Location.png
05_KB_599015_disTemp_Location.png
07_KB_599015_disTemp_Location.png
09_KB_599015_disTemp_Location.png


Applies To
Product: Data Engineering Quality (Big Data Quality); Data Engineering Integration (Big Data Management); Enterprise Data Preparation; Enterprise Data Catalog; Data Engineering Streaming (Big Data Streaming)
Problem Type: Sizing; Configuration; Product Feature
User Type: Administrator
Project Phase: Configure; Implement; Optimize
Product Version: Informatica 10.2; Informatica 10.2.1; Informatica 10.2.1 Service Pack 1; Informatica 10.2.2; HotFix
Database:
Operating System: Linux
Other Software:

Last Modified Date: 11/4/2019 10:56 PM
ID: 599015