{{JAVA_HOME}} error with upgrade to Spark 1.3.0

I'm trying to upgrade a Spark project, written in Scala, from Spark 1.2.1 to 1.3.0, so I changed my build.sbt like so:

-libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.1" % "provided"
+libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0" % "provided"

then I build an assembly jar and submit it:

HADOOP_CONF_DIR=/etc/hadoop/conf \
    spark-submit \
    --driver-class-path=/etc/hbase/conf \
    --conf spark.hadoop.validateOutputSpecs=false \
    --conf spark.yarn.jar=hdfs:/apps/local/spark-assembly-1.3.0-hadoop2.4.0.jar \
    --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
    --deploy-mode=cluster \
    --master=yarn \
    --class=TestObject \
    --num-executors=54 \
    target/scala-2.11/myapp-assembly-1.2.jar

The job fails to submit, with the following exception in the terminal:

15/03/19 10:30:07 INFO yarn.Client: 
15/03/19 10:20:03 INFO yarn.Client: 
     client token: N/A
     diagnostics: Application application_1420225286501_4698 failed 2 times due to AM 
     Container for appattempt_1420225286501_4698_000002 exited with  exitCode: 127 
     due to: Exception from container-launch: 
org.apache.hadoop.util.Shell$ExitCodeException: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

Finally, I go and check the YARN app master's web interface (since the job is there, I know it at least made it that far), and the only logs it shows are these:

    Log Type: stderr
    Log Length: 61
    /bin/bash: {{JAVA_HOME}}/bin/java: No such file or directory

    Log Type: stdout
    Log Length: 0

I'm not sure how to interpret that – is {{JAVA_HOME}} a literal (including the brackets) that's somehow making it into a script? Is this coming from the worker nodes or the driver? Anything I can do to experiment & troubleshoot?
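
(For reference, the same container logs can also be pulled with the YARN CLI, assuming log aggregation is enabled on the cluster; the application ID is the one from the diagnostics above:)

yarn logs -applicationId application_1420225286501_4698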

I do have JAVA_HOME set in the hadoop config files on all the nodes of the cluster:

% grep JAVA_HOME /etc/hadoop/conf/*.sh
/etc/hadoop/conf/hadoop-env.sh:export JAVA_HOME=/usr/jdk64/jdk1.6.0_31
/etc/hadoop/conf/yarn-env.sh:export JAVA_HOME=/usr/jdk64/jdk1.6.0_31

Has this behavior changed going from 1.2.1 to 1.3.0? With 1.2.1 and no other changes, the job completes fine.

[Note: I originally posted this on the Spark mailing list, I'll update both places if/when I find a solution.]


Have you tried setting JAVA_HOME in the /etc/hadoop/conf/yarn-env.sh file? It's possible that your JAVA_HOME environment variable is not available to the YARN containers that are running your job.

It has happened to me before that certain environment variables set in .bashrc on the nodes were not being picked up by the YARN workers spawned on the cluster.

There is a chance that the error is unrelated to the version upgrade but instead related to YARN environment configuration.
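
If editing yarn-env.sh on every node is awkward, one way to test this theory is to inject JAVA_HOME into the containers for a single application via Spark's YARN properties; a sketch, using the JDK path and class name from the question:

spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --conf spark.yarn.appMasterEnv.JAVA_HOME=/usr/jdk64/jdk1.6.0_31 \
    --conf spark.executorEnv.JAVA_HOME=/usr/jdk64/jdk1.6.0_31 \
    --class TestObject \
    target/scala-2.11/myapp-assembly-1.2.jar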


Okay, so I got some other people in the office to help work on this, and we figured out a solution. I'm not sure how much of this is specific to the file layouts of Hortonworks HDP 2.0.6 on CentOS, which is what we're running on our cluster.

The approach is to manually copy some directories from one of the cluster machines (or any machine that can successfully use the Hadoop client) to the local machine you submit from. Let's call that cluster machine $GOOD.

Set up Hadoop config files:

cd /etc
sudo mkdir hbase hadoop
sudo scp -r $GOOD:/etc/hbase/conf hbase
sudo scp -r $GOOD:/etc/hadoop/conf hadoop

Set up Hadoop libraries & executables:

mkdir ~/my-hadoop
scp -r $GOOD:/usr/lib/hadoop* ~/my-hadoop
cd /usr/lib
sudo ln -s ~/my-hadoop/* .
path+=(/usr/lib/hadoop*/bin)  # Add to $PATH (this syntax is for zsh)
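
If your local shell is bash rather than zsh, the equivalent PATH update would be roughly:

for d in /usr/lib/hadoop*/bin; do PATH="$PATH:$d"; done  # add each hadoop bin dir to $PATH
export PATH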

Set up the Spark libraries & executables:

cd ~/Downloads
wget http://apache.mirrors.lucidnetworks.net/spark/spark-1.4.1/spark-1.4.1-bin-without-hadoop.tgz
tar -zxvf spark-1.4.1-bin-without-hadoop.tgz
cd spark-1.4.1-bin-without-hadoop
path+=(`pwd`/bin)
hdfs dfs -copyFromLocal lib/spark-assembly-*.jar /apps/local/
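
A quick sanity check that the assembly landed where the spark.yarn.jar setting below will point:

hdfs dfs -ls /apps/local/spark-assembly-*.jar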

Set some environment variables:

export JAVA_HOME=$(/usr/libexec/java_home -v 1.7)
export HADOOP_CONF_DIR=/etc/hadoop/conf
export SPARK_DIST_CLASSPATH=$(hadoop --config $HADOOP_CONF_DIR classpath)
`grep 'export HADOOP_LIBEXEC_DIR' $HADOOP_CONF_DIR/yarn-env.sh`  # executes the "export HADOOP_LIBEXEC_DIR=..." line from yarn-env.sh
export SPOPTS="--driver-java-options=-Dorg.xerial.snappy.lib.name=libsnappyjava.jnilib"
export SPOPTS="$SPOPTS --conf spark.yarn.jar=hdfs:/apps/local/spark-assembly-1.4.1-hadoop2.2.0.jar"
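
Before launching anything, it's worth checking that these actually resolve to something usable:

$JAVA_HOME/bin/java -version                      # should print the JDK chosen above
echo $SPARK_DIST_CLASSPATH | tr ':' '\n' | head   # should list Hadoop jars/dirs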

Now the various spark shells can be run like so:

sparkR --master yarn $SPOPTS
spark-shell --master yarn $SPOPTS
pyspark --master yarn $SPOPTS

Some remarks:

  • The JAVA_HOME setting is the same as I've had all along; it's just included here for completeness. All the focus on JAVA_HOME turned out to be a red herring.
  • The --driver-java-options=-Dorg.xerial.snappy.lib.name=libsnappyjava.jnilib setting was necessary because I was getting java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path errors. The .jnilib file is the correct choice for OS X.
  • The --conf spark.yarn.jar piece is just to save time, avoiding re-copying the assembly file to the cluster every time you fire up a shell or submit a job (a complete submit command is sketched below).
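
A job submission against this setup then looks roughly like the following; the class, jar, and executor count are the ones from the question:

spark-submit --master yarn --deploy-mode cluster $SPOPTS \
    --class TestObject \
    --num-executors 54 \
    target/scala-2.11/myapp-assembly-1.2.jar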

Well, to start off, I would recommend you move to Java 7. However, that is not what you are looking for or need help with.

For setting JAVA_HOME, I would recommend setting it in your bashrc rather than in multiple files. Moreover, I would recommend installing Java via alternatives so that it is linked into /usr/bin.
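
On CentOS that is typically done with the alternatives tool, something like the following (the JDK path here is illustrative only):

sudo alternatives --install /usr/bin/java java /usr/java/jdk1.7.0_79/bin/java 200  # register an (illustrative) JDK path
sudo alternatives --config java                                                    # choose which java /usr/bin/java points to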
