When writing data to HBase with saveAsNewAPIHadoopDataset, the following error occurs:
java.lang.IllegalArgumentException: Can not create a Path from a null string
	at org.apache.hadoop.fs.Path.checkPathArg(Path.java:123)
	at org.apache.hadoop.fs.Path.<init>(Path.java:135)
	at org.apache.hadoop.fs.Path.<init>(Path.java:89)
	at (:58)
	at (:132)
	at org.apache.spark.internal.io.SparkHadoopMapReduceWriter$.write(SparkHadoopMapReduceWriter.scala:101)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1085)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1085)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1085)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
	at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1084)
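As SPARK-21549 describes, the failure happens at job-commit time: Spark 2.2's commit protocol for saveAsNewAPIHadoopDataset builds a temporary staging Path from the Hadoop output-directory property even when the OutputFormat (here HBase's TableOutputFormat) never writes to a filesystem, so the property is unset and Hadoop's Path constructor receives a null string. A minimal sketch that reproduces just the Hadoop side of the error (the object name PathNullDemo and the "_temporary" child name are only illustrative):

import org.apache.hadoop.fs.Path

object PathNullDemo {
  def main(args: Array[String]): Unit = {
    // Simulates an unset "mapreduce.output.fileoutputformat.outputdir":
    // Configuration.get returns null, and new Path(null, child) fails with
    // "Can not create a Path from a null string".
    val outputDir: String = null
    val staging = new Path(outputDir, "_temporary")
    println(staging)
  }
}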
The code that writes to HBase is shown below (Scala, Spark 2.2):
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.{SparkConf, SparkContext}

/**
 * Description: Put data into HBase by a map-reduce job.
 *
 * Author : Adore Chen
 * Created: 2017-12-22
 */
object SparkMapJob {

  /**
   * insert 100,000 rows cost 21035 ms
   *
   * @param args
   */
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SparkPutByMap")
    val context = new SparkContext(conf)

    val hbaseConf = HBaseConfiguration.create()
    hbaseConf.set(TableOutputFormat.OUTPUT_TABLE, "test_table")
    // IMPORTANT: must set the output directory attribute to solve the problem (can't create path from null string)
    hbaseConf.set("mapreduce.output.fileoutputformat.outputdir", "/tmp")

    val job = Job.getInstance(hbaseConf)
    job.setOutputFormatClass(classOf[TableOutputFormat[ImmutableBytesWritable]])
    job.setOutputKeyClass(classOf[ImmutableBytesWritable])
    job.setOutputValueClass(classOf[Put])

    try {
      val rdd = context.makeRDD(1 to 100000)

      // column family
      val family = Bytes.toBytes("cf")
      // column counter --> ctr
      val column = Bytes.toBytes("ctr")

      rdd.map(value => {
          // row key and cell value are both the counter value
          val put = new Put(Bytes.toBytes(value))
          put.addColumn(family, column, Bytes.toBytes(value))
          (new ImmutableBytesWritable(), put)
        })
        .saveAsNewAPIHadoopDataset(job.getConfiguration)
    } finally {
      context.stop()
    }
  }
}
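To sanity-check that the 100,000 rows actually landed in HBase, they can be read back through TableInputFormat with newAPIHadoopRDD. A short sketch under the same assumptions as the writer above (table test_table; the app/object name VerifyTable is just a placeholder):

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.{SparkConf, SparkContext}

object VerifyTable {
  def main(args: Array[String]): Unit = {
    val context = new SparkContext(new SparkConf().setAppName("VerifyTable"))
    try {
      val readConf = HBaseConfiguration.create()
      readConf.set(TableInputFormat.INPUT_TABLE, "test_table")
      // Each element is (row key, full Result); counting them confirms the write.
      val rows = context.newAPIHadoopRDD(
        readConf,
        classOf[TableInputFormat],
        classOf[ImmutableBytesWritable],
        classOf[Result])
      println(s"rows in test_table: ${rows.count()}")
    } finally {
      context.stop()
    }
  }
}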
This is a Spark bug; the details are tracked at:
https://issues.apache.org/jira/browse/SPARK-21549
Solution: explicitly set the MapReduce output directory to any writable path before creating the Job, even though TableOutputFormat itself never writes files there:

// IMPORTANT: must set this attribute to solve the problem (can't create path from null string)
hbaseConf.set("mapreduce.output.fileoutputformat.outputdir", "/tmp")
References:
https://github.com/hortonworks-spark/shc/issues/15