What is the concept of application, job, stage and task in Spark?

你的背包 2020-12-13 00:08

Is my understanding right?

  1. Application: one spark-submit.

  2. Job: once a lazy evaluation is actually triggered (by an action), there is a job. (A minimal sketch illustrating points 1 and 2 follows this list.)

  3. Stage: It
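
To make points 1 and 2 concrete, here is a minimal sketch (the object name and dataset are made up for illustration): one spark-submit of this program is one application, and each action inside it triggers its own job.

```scala
import org.apache.spark.sql.SparkSession

object WordCountSketch {
  def main(args: Array[String]): Unit = {
    // One spark-submit of this program = one application.
    val spark = SparkSession.builder()
      .appName("WordCountSketch")
      .getOrCreate()

    val nums = spark.sparkContext.parallelize(1 to 1000, numSlices = 4)

    // Transformations are lazy: nothing runs yet.
    val squares = nums.map(n => n * n)

    // Each action triggers a separate job within this application.
    val total = squares.reduce(_ + _)   // job 1
    val first = squares.take(5)         // job 2

    println(s"total=$total, first=${first.mkString(",")}")
    spark.stop()
  }
}
```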

3 Answers
  •  半阙折子戏
    2020-12-13 01:03

    From 7-steps-for-a-developer-to-learn-apache-spark

    The anatomy of a Spark application usually comprises Spark operations, which can be either transformations or actions on your data sets using Spark’s RDD, DataFrame or Dataset APIs. For example, in your Spark app, if you invoke an action, such as collect() or take() on your DataFrame or Dataset, the action will create a job. A job will then be decomposed into single or multiple stages; stages are further divided into individual tasks; and tasks are units of execution that the Spark driver’s scheduler ships to Spark executors on the Spark worker nodes to execute in your cluster. Often multiple tasks will run in parallel on the same executor, each processing its unit of the partitioned dataset in its memory.
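
    As an illustrative sketch of that job/stage/task breakdown (not taken from the quoted article; names and partition counts are arbitrary): a shuffle-inducing wide transformation such as reduceByKey splits the job into stages, and within each stage one task runs per partition.

    ```scala
    import org.apache.spark.sql.SparkSession

    object StageTaskSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("StageTaskSketch").getOrCreate()
        val sc = spark.sparkContext

        // 3 input partitions -> 3 tasks in the first (map-side) stage.
        val words = sc.parallelize(Seq("a", "b", "a", "c", "b", "a"), numSlices = 3)
        val pairs = words.map(w => (w, 1))

        // reduceByKey needs a shuffle, so a new stage starts here.
        // 2 shuffle partitions -> 2 tasks in the second (reduce-side) stage.
        val counts = pairs.reduceByKey(_ + _, numPartitions = 2)

        // The collect() action triggers the job; the stages and their tasks
        // are visible in the Spark UI.
        counts.collect().foreach { case (w, c) => println(s"$w -> $c") }

        spark.stop()
      }
    }
    ```

    Running this should show a single job (from collect()) made of two stages: a 3-task map stage and a 2-task reduce stage separated by the shuffle.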
