Spark-submit job on a Cloudera cluster cannot find dependent JARs

Problem description:

I am able to do a spark-submit to my Cloudera cluster. The job dies after a few minutes with exceptions complaining that it cannot find various classes. These are classes that are on the Spark dependency path. I keep adding the jars one at a time with the --jars command-line argument, and the YARN log keeps dumping out the next jar it can't find.


What setting allows the Spark/YARN job to find all the dependent jars?
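As a workaround for the one-jar-at-a-time cycle described above, all of the dependency jars can be passed in a single comma-separated `--jars` argument. The sketch below is not from the original post; the directory, class, and application jar names are hypothetical placeholders to adapt to your cluster:

```shell
# Sketch: join every jar under a (hypothetical) dependency directory into
# the single comma-separated list that spark-submit's --jars flag expects.
JAR_DIR=/opt/cloudera/parcels/CDH/lib/spark/lib
DEP_JARS=$(ls "$JAR_DIR"/*.jar | paste -sd, -)

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MyJob \
  --jars "$DEP_JARS" \
  my-job.jar
```

This avoids re-submitting once per missing class, since every jar in the directory is shipped to the cluster up front.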


I have already set the "spark.home" attribute to the correct path: /opt/cloudera/parcels/CDH/lib/spark

Found it!

Remove

.set("spark.driver.host", "driver computer ip address")


from your driver code.
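The fix can be sketched as a driver program that simply omits that setting. This is an illustrative example, not the original poster's code; the object and app names are hypothetical. On YARN the driver's address is managed by the cluster manager, so hard-coding spark.driver.host can leave executors unable to reach the driver:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MyJob {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("my-job")
      // Deliberately no .set("spark.driver.host", ...) here:
      // on YARN, let the cluster manager determine the driver address.
    val sc = new SparkContext(conf)
    // ... job logic ...
    sc.stop()
  }
}
```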