Set up the hdp.version parameter to resolve the Hortonworks version issue
Hortonworks relies on the hdp.version
environment variable in its configuration files to support its rolling upgrades. This
can lead to a known issue when running Spark jobs in Talend Studio.
The Studio retrieves these Hortonworks configuration files along with this hdp.version variable from a Hortonworks cluster. When you define the Hadoop connection in the Studio, the Studio generates a hadoop-conf-xxx.jar using these configuration files and adds this JAR file, thus along with this variable, to the classpath of your Job. This may lead to the following known issue:
[ERROR]: org.apache.spark.SparkContext - Error initializing SparkContext. org.apache.spark.SparkException: hdp.version is not found, Please set HDP_VERSION=xxx in spark-env.sh, or set -Dhdp.version=xxx in spark.{driver|yarn.am}.extraJavaOptions or set SPARK_JAVA_OPTS="-Dhdp.verion=xxx" in spark-env.sh If you're running Spark under HDP. at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:999) at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:171) at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56) at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156) at org.apache.spark.SparkContext.<init>(SparkContext.scala:509) at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58) at dev_v6_001.test_hdp_1057_0_1.test_hdp_1057.runJobInTOS(test_hdp_1057.java:1454) at dev_v6_001.test_hdp_1057_0_1.test_hdp_1057.main(test_hdp_1057.java:1341)
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master. Diagnostics: Exception from container-launch. Container id: container_1496650120478_0011_02_000001 Exit code: 1 Exception message: /hadoop/yarn/local/usercache/abbass/appcache/application_1496650120478_0011/container_1496650120478_0011_02_000001/launch_container.sh: line 21: $PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-mapreduce-client/*:/usr/hdp/current/hadoop-mapreduce-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
java.lang.IllegalArgumentException: Unable to parse '/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework' as a URI, check the setting for mapreduce.application.framework.path at org.apache.hadoop.mapreduce.JobSubmitter.addMRFrameworkToDistributedCache(JobSubmitter.java:443) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:142) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561) at org.talend.hadoop.mapred.lib.MRJobClient.runJob(MRJobClient.java:46) at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.runMRJob(test_mr_hdp26.java:1556) at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.access$2(test_mr_hdp26.java:1546) at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26$1.run(test_mr_hdp26.java:1194) at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26$1.run(test_mr_hdp26.java:1) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.tRowGenerator_1Process(test_mr_hdp26.java:1044) at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.run(test_mr_hdp26.java:1524) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.runJobInTOS(test_mr_hdp26.java:1483) at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.main(test_mr_hdp26.java:1431) Caused by: java.net.URISyntaxException: Illegal character in path at index 11: /hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework at java.net.URI$Parser.fail(URI.java:2848) at java.net.URI$Parser.checkChars(URI.java:3021) at java.net.URI$Parser.parseHierarchical(URI.java:3105) at java.net.URI$Parser.parse(URI.java:3063) at java.net.URI.<init>(URI.java:588) at org.apache.hadoop.mapreduce.JobSubmitter.addMRFrameworkToDistributedCache(JobSubmitter.java:441) ... 27 more
These error messages can appear together or separately.
Environment:
-
Subscription-based Talend Studio solution with Big Data
-
Spark or MapReduce Jobs