Set up the hdp.version parameter to resolve the Hortonworks version issue (deprecated)
Hortonworks relies on the hdp.version variable in its configuration files to support its rolling upgrades. This variable can lead to a known issue when running Spark or MapReduce Jobs in Talend Studio.
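For illustration, here is how such a variable typically appears in a cluster configuration file. This excerpt is a minimal sketch built around the mapreduce.application.framework.path property quoted in the error messages below; the exact files and properties depend on your cluster:

    <property>
      <name>mapreduce.application.framework.path</name>
      <value>/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework</value>
    </property>

Until hdp.version is resolved to a concrete value, any path built from such a property is unusable outside the cluster's own environment.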
Talend Studio retrieves these Hortonworks configuration files, along with the hdp.version variable they contain, from a Hortonworks cluster. When you define the Hadoop connection in Talend Studio, Talend Studio generates a hadoop-conf-xxx.jar from these configuration files and adds this JAR file, and thus the unresolved variable, to the classpath of your Job. This may lead to the following known errors:
[ERROR]: org.apache.spark.SparkContext - Error initializing SparkContext.
org.apache.spark.SparkException: hdp.version is not found, Please set HDP_VERSION=xxx in spark-env.sh, or set -Dhdp.version=xxx in spark.{driver|yarn.am}.extraJavaOptions or set SPARK_JAVA_OPTS="-Dhdp.verion=xxx" in spark-env.sh If you're running Spark under HDP.
    at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:999)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:171)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at dev_v6_001.test_hdp_1057_0_1.test_hdp_1057.runJobInTOS(test_hdp_1057.java:1454)
    at dev_v6_001.test_hdp_1057_0_1.test_hdp_1057.main(test_hdp_1057.java:1341)
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
Diagnostics: Exception from container-launch.
Container id: container_1496650120478_0011_02_000001
Exit code: 1
Exception message: /hadoop/yarn/local/usercache/abbass/appcache/application_1496650120478_0011/container_1496650120478_0011_02_000001/launch_container.sh: line 21: $PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-mapreduce-client/*:/usr/hdp/current/hadoop-mapreduce-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
java.lang.IllegalArgumentException: Unable to parse '/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework' as a URI, check the setting for mapreduce.application.framework.path
    at org.apache.hadoop.mapreduce.JobSubmitter.addMRFrameworkToDistributedCache(JobSubmitter.java:443)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:142)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
    at org.talend.hadoop.mapred.lib.MRJobClient.runJob(MRJobClient.java:46)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.runMRJob(test_mr_hdp26.java:1556)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.access$2(test_mr_hdp26.java:1546)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26$1.run(test_mr_hdp26.java:1194)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26$1.run(test_mr_hdp26.java:1)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.tRowGenerator_1Process(test_mr_hdp26.java:1044)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.run(test_mr_hdp26.java:1524)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.runJobInTOS(test_mr_hdp26.java:1483)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.main(test_mr_hdp26.java:1431)
Caused by: java.net.URISyntaxException: Illegal character in path at index 11: /hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework
    at java.net.URI$Parser.fail(URI.java:2848)
    at java.net.URI$Parser.checkChars(URI.java:3021)
    at java.net.URI$Parser.parseHierarchical(URI.java:3105)
    at java.net.URI$Parser.parse(URI.java:3063)
    at java.net.URI.<init>(URI.java:588)
    at org.apache.hadoop.mapreduce.JobSubmitter.addMRFrameworkToDistributedCache(JobSubmitter.java:441)
    ... 27 more
These error messages can appear together or separately.
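As the first error message itself suggests, the workaround is to give hdp.version a concrete value when the Job is submitted. The following is a minimal sketch assuming a hypothetical HDP version 2.6.0.3-8; replace it with the version actually installed on your cluster (visible, for example, under /usr/hdp on a cluster node):

    # Spark properties, as suggested by the first error message
    # (for example, among the advanced Spark properties of your Job):
    spark.driver.extraJavaOptions    -Dhdp.version=2.6.0.3-8
    spark.yarn.am.extraJavaOptions   -Dhdp.version=2.6.0.3-8

    # Alternatively, on the cluster side, in spark-env.sh:
    export HDP_VERSION=2.6.0.3-8

    # For MapReduce Jobs, the same idea applies to the Hadoop property
    # quoted in the third error message, for example:
    mapreduce.application.framework.path=/hdp/apps/2.6.0.3-8/mapreduce/mapreduce.tar.gz#mr-framework

Whichever form you use, the goal is the same: no submitted configuration may still contain the literal, unresolved ${hdp.version} placeholder.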
Environment:
- Subscription-based Talend Studio solution with Big Data
- Spark or MapReduce Jobs