Task 0.0 in stage 1.0 (TID 1) had a not serializable result: org.apache.hadoop.hbase.client.Result
Problem: when reading HBase from Spark, the job fails because the HBase Result class is not serializable.

Error:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0 in stage 1.0 (TID 1) had a not serializable result: org.apache.hadoop.hbase.client.Result
Serialization stack:
	- object not serializable (class: org.apache.hadoop.hbase.client.Result, value: keyvalues={tjiloaB#3#20190520/wifiTargetCF:inNum/1581004136600/Put/vlen=4/seqid=0, tjiloaB#3#20190520/wifiTargetCF:outNum/1581004136600/Put/vlen=4/seqid=0})
	- element of array (index: 0)
	- array (class [Lorg.apache.hadoop.hbase.client.Result;, size 1)
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
	at scala.Option.foreach(Option.scala:236)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
	at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1328)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
	at org.apache.spark.rdd.RDD.take(RDD.scala:1302)
	at cn.com.winner.TestHbase2$.main(TestHbase2.scala:39)
	at cn.com.winner.TestHbase2.main(TestHbase2.scala)
23:27:47.914 [Thread-2] INFO  org.apache.spark.SparkContext - Invoking stop() from shutdown hook

Solution:

Set the serializer option on the SparkConf (the one used to build the SparkContext):

import org.apache.spark.SparkConf

val conf = new SparkConf().setMaster("local[2]").setAppName("testHbase")
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

Note that setting this property on the HBase configuration has no effect: spark.serializer is a Spark property that the SparkContext reads at startup, while the HBase Configuration object is only handed to the HBase InputFormat, so Spark never sees the key. For example, the following does nothing:

import org.apache.hadoop.hbase.HBaseConfiguration

val hconf = HBaseConfiguration.create()
hconf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer") // ignored by Spark