
Spark | What is the master parameter when starting Spark?

Category: IT Articles • 2023-11-23 17:37:01

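The master parameter (--master) tells Spark where to run: it selects the deployment mode and, with it, the component that manages resource allocation.

First, a word on how Spark jobs are deployed. The Spark computing framework can be deployed in several ways, either on a single computer or on several (a cluster). The more computers there are, the larger the cluster and the more computing power is available.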

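1. local mode

local mode runs Spark on a single computer. The master can be set in the following ways:

local: runs with a single thread, with no parallelism
local[k]: runs with k worker threads in parallel; k is usually set to the number of cores on the machine. It can also be written as local[*], which sets the number of worker threads automatically from the number of logical cores.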

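# Note: --cluster is not an option of stock spark-submit; it presumably comes from the author's environment.
/bin/spark-submit \
--cluster cluster_name \
--master "local[*]"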
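
How the three local forms map to a thread count can be sketched in a few lines of Python. This is only an illustration of the rule described above, not Spark's own parsing code: the function name local_thread_count is made up for this example, and it deliberately handles only local, local[k] and local[*].

```python
import os
import re

# Accepts "local", "local[4]" or "local[*]" and nothing else.
_LOCAL_PATTERN = re.compile(r"local(?:\[(\*|\d+)\])?")


def local_thread_count(master: str) -> int:
    """Return the worker-thread count implied by a local master string.

    Hypothetical helper for illustration; it is not part of Spark.
    """
    match = _LOCAL_PATTERN.fullmatch(master)
    if match is None:
        raise ValueError(f"not a local master string: {master!r}")
    spec = match.group(1)
    if spec is None:
        return 1  # plain "local": one thread, no parallelism
    if spec == "*":
        return os.cpu_count() or 1  # "local[*]": one thread per logical core
    threads = int(spec)  # "local[k]": exactly k threads
    if threads < 1:
        raise ValueError("thread count must be at least 1")
    return threads


if __name__ == "__main__":
    for value in ("local", "local[4]", "local[*]"):
        print(value, "->", local_thread_count(value))
```

The result for local[*] depends on the machine the script runs on, which is the point of that form: the same command line adapts to whatever hardware it is given.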

2. cluster mode

cluster mode runs Spark on a cluster. It is subdivided into three modes, standalone, mesos and yarn, which differ in who manages resource scheduling.
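
Since the modes differ only in who schedules resources, the master string alone is enough to tell them apart. The sketch below shows one way to do that lookup; resource_manager is a hypothetical name chosen for this example, and the function covers only the master forms discussed in this article.

```python
def resource_manager(master: str) -> str:
    """Name the component that schedules resources for a given master string.

    Hypothetical helper for illustration; it is not part of Spark.
    """
    if master == "local" or master.startswith("local["):
        return "none (threads on the local machine)"
    if master.startswith("spark://"):
        return "standalone (Spark's own master)"
    if master.startswith("mesos://"):
        return "mesos"
    # "yarn-cluster" and "yarn-client" are the older spellings of yarn mode.
    if master in ("yarn", "yarn-cluster", "yarn-client"):
        return "yarn"
    raise ValueError(f"unrecognized master string: {master!r}")


if __name__ == "__main__":
    for value in ("local[*]", "spark://host:7077", "mesos://host:5050", "yarn"):
        print(value, "->", resource_manager(value))
```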

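2.1. standalone mode

Spark manages and schedules resources itself. It divides the machines in the cluster into a master machine and worker machines: there is usually a single master, and the workers carry out the computation.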

/bin/spark-submit \
--cluster cluster_name \
--master spark://host:port
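
A spark://host:port value is an ordinary URL, so its parts can be checked before a job is submitted. The following is a minimal sketch using Python's standard urllib; parse_standalone_master is a hypothetical helper, it assumes a single master address, and it falls back to 7077 (the standalone master's default port) when no port is given.

```python
from typing import Tuple
from urllib.parse import urlsplit

# 7077 is the default port a standalone master listens on.
DEFAULT_STANDALONE_PORT = 7077


def parse_standalone_master(master: str) -> Tuple[str, int]:
    """Split a single spark://host:port URL into (host, port).

    Hypothetical helper for illustration; it is not part of Spark.
    """
    parts = urlsplit(master)
    if parts.scheme != "spark" or not parts.hostname:
        raise ValueError(f"not a standalone master URL: {master!r}")
    # parts.port is None when the URL carries no explicit port.
    return parts.hostname, parts.port or DEFAULT_STANDALONE_PORT


if __name__ == "__main__":
    print(parse_standalone_master("spark://master-node:7077"))
```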

2.2. mesos mode

To have mesos manage resource scheduling, use mesos mode. The master is then specified as mesos://HOST:PORT.

/bin/spark-submit \
--cluster cluster_name \
--master mesos://host:port

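2.3. yarn mode

To have yarn manage resource scheduling, use yarn mode. Spark often has to share a cluster with mapreduce, so resource scheduling is left to Yarn, which is why most production environments use yarn mode. yarn mode is further divided into yarn cluster mode and yarn client mode: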

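yarn cluster: the mode commonly used in production. The Spark Driver runs inside the ApplicationMaster on the cluster, so all resource scheduling and computation happen on the cluster. The master is specified as yarn-cluster.
yarn client: the Spark Driver runs on the local machine that submits the job, while the ApplicationMaster (which in this mode only requests resources) and the computation run on the cluster. The master is specified as yarn-client.

The yarn-cluster and yarn-client values are deprecated since Spark 2.0; the equivalent is yarn as the master plus a --deploy-mode of cluster or client:

/bin/spark-submit \
--cluster cluster_name \
--master yarn \
--deploy-mode cluster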
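
The yarn-cluster and yarn-client strings each bundle two settings, the cluster manager and the deploy mode, which spark-submit also accepts separately as --master yarn and --deploy-mode. The sketch below translates one spelling into the other. The helper name yarn_submit_args is an assumption made for this example, and the client default mirrors spark-submit's own default deploy mode.

```python
from typing import List

# Older combined spellings and the deploy mode each one implies.
_LEGACY_YARN = {"yarn-cluster": "cluster", "yarn-client": "client"}


def yarn_submit_args(master: str, deploy_mode: str = "client") -> List[str]:
    """Build the --master/--deploy-mode arguments for a yarn submission.

    Hypothetical helper for illustration; it is not part of Spark.
    """
    if master in _LEGACY_YARN:
        # The combined spelling decides the deploy mode by itself.
        deploy_mode = _LEGACY_YARN[master]
    elif master != "yarn":
        raise ValueError(f"not a yarn master string: {master!r}")
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unknown deploy mode: {deploy_mode!r}")
    return ["--master", "yarn", "--deploy-mode", deploy_mode]


if __name__ == "__main__":
    print(" ".join(yarn_submit_args("yarn-cluster")))
    print(" ".join(yarn_submit_args("yarn", deploy_mode="client")))
```

Keeping the two settings separate makes it easier to switch between running the driver locally while debugging and running it on the cluster in production, without touching the master value.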