Spark : check your cluster UI to ensure that workers are registered


Question


I have a simple program in Spark:

/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setMaster("spark://10.250.7.117:7077").setAppName("Simple Application").set("spark.cores.max","2")
    val sc = new SparkContext(conf)    
    val ratingsFile = sc.textFile("hdfs://hostname:8020/user/hdfs/mydata/movieLens/ds_small/ratings.csv")

    //print the first 10 records
    println("Getting the first 10 records: ")
    ratingsFile.take(10).foreach(println)

    //print the number of records in the movie ratings file
    println("The number of records in the movie list is: " + ratingsFile.count())
  }
}

When I try to run this program from the spark-shell (i.e. I log into the name node of the Cloudera installation and run the commands sequentially in the spark-shell):

val ratingsFile = sc.textFile("hdfs://hostname:8020/user/hdfs/mydata/movieLens/ds_small/ratings.csv")
println("Getting the first 10 records: ")
ratingsFile.take(10)    
println("The number of records in the movie list are : ")
ratingsFile.count() 

I get correct results, but when I try to run the program from Eclipse, no resources are assigned to the program, and all I see in the console log is:

WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Also, in the Spark UI, I see this:

(Screenshot: the job stays in the RUNNING state in the Spark UI)

Also, it should be noted that this version of Spark was installed with Cloudera (hence no worker nodes show up).

What should I do to make this work?

EDIT:

I checked the History Server and these jobs don't show up there (not even under incomplete applications).


Answer 1:


I have done configuration and performance tuning for many Spark clusters, and this is a very common/normal message to see when you are first prepping/configuring a cluster to handle your workloads.

This is unequivocally due to insufficient resources to launch the job. The job is requesting one of the following (a sketch of how to cap these requests is shown after the list):

  • more memory per worker than is allocated to it (1GB)
  • more CPUs than are available on the cluster
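
As a rough illustration (not part of the original answer), the resource requests can be capped in the SparkConf so the job fits what the cluster actually offers. The property names below are standard Spark settings; the values are assumptions you would adjust for your own cluster:

import org.apache.spark.{SparkConf, SparkContext}

// Hedged sketch: cap the job's resource requests so the scheduler can place it.
// The values are illustrative, not taken from the original answer.
val conf = new SparkConf()
  .setAppName("Simple Application")
  .set("spark.cores.max", "2")           // do not ask for more cores than the cluster has
  .set("spark.executor.memory", "512m")  // stay within the per-worker memory allocation (1GB here)
val sc = new SparkContext(conf)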



Answer 2:


Finally figured out what the answer is.

When deploying a Spark program on a YARN cluster, the master URL is just yarn.

So in the program, the Spark context configuration should just look like:

val conf = new SparkConf().setAppName("SimpleApp")

Then the Eclipse project should be built with Maven, the generated jar copied to the cluster, and the job launched with the following command:

spark-submit --master yarn --class "SimpleApp" Recommender_2-0.0.1-SNAPSHOT.jar

This means that running from Eclipse directly would not work.
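
Putting the pieces together, a minimal sketch of the adjusted SimpleApp could look like the following; the HDFS path and object name come from the question, while the printing details and sc.stop() are assumptions rather than part of the original answer:

/* SimpleApp.scala -- hedged sketch for spark-submit on YARN */
import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // The master URL is supplied by `spark-submit --master yarn`, not hard-coded here.
    val conf = new SparkConf().setAppName("SimpleApp")
    val sc = new SparkContext(conf)
    val ratingsFile = sc.textFile("hdfs://hostname:8020/user/hdfs/mydata/movieLens/ds_small/ratings.csv")

    // Print the first 10 records and the total record count.
    ratingsFile.take(10).foreach(println)
    println("The number of records in the movie list is: " + ratingsFile.count())

    sc.stop()
  }
}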




Answer 3:


You can check your cluster's worker node cores: your application cannot exceed them. For example, say you have two worker nodes with 4 cores each and two applications to run; then you can give each application 4 cores to run its job.

You can set it like this in the code:

SparkConf sparkConf = new SparkConf().setAppName("JianSheJieDuan")
                          .set("spark.cores.max", "4");

It works for me.




Answer 4:


There are also causes of this same error message other than those posted here.

For a Spark-on-Mesos cluster, make sure you have Java 8 or a newer Java version on the Mesos slaves.

For Spark standalone, make sure you have Java 8 (or newer) on the workers.




Answer 5:


You don't have any workers to execute the job. There are no cores available for the job to run on, which is why the job's state is still 'Waiting'.

If no workers are registered with Cloudera, how will the jobs execute?



Source: https://stackoverflow.com/questions/35662596/spark-check-your-cluster-ui-to-ensure-that-workers-are-registered
