Writing a Spark RDD to an HBase table using Scala


For example, the function literal below takes an Int as its argument and returns a Double:

val toDouble: Int => Double = a => a.toDouble

You can now call toDouble(2), which returns 2.0.
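
Because toDouble is a function value, you can pass it straight to a higher-order method such as map. A quick sketch (the RDD below is just for illustration, assuming a SparkContext named sc):

val doubled = sc.parallelize(Array(1, 2, 3)).map(toDouble)
// doubled.collect() => Array(1.0, 2.0, 3.0)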

In the same way, you can turn your convert method into a function literal, as shown below.

val convert: Int => (ImmutableBytesWritable, Put) = a => {
    val p = new Put(Bytes.toBytes(a))   // the Put carries the row key
    // on HBase 1.x+ use p.addColumn instead of the deprecated p.add
    p.add(Bytes.toBytes("columnfamily"), Bytes.toBytes("col_1"), Bytes.toBytes(a))
    (new ImmutableBytesWritable(a.toString.getBytes()), p)
}
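
Since convert is now a function value rather than a method, you can also pass it to map directly, without eta-expansion; a hypothetical one-liner (assuming an RDD[Int] named rdd):

val hbasePairs = rdd.map(convert)   // no `convert _` needed for a function value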

The thing you are doing wrong here is defining convert inside main. When convert is a local definition, the task closure captures the enclosing scope of main (including non-serializable objects such as the SparkContext), and Spark fails with a serialization error. If you write the code this way instead, it should work:

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Put
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapred.TableOutputFormat
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.hadoop.mapred.JobConf
    import org.apache.spark.rdd.PairRDDFunctions
    import org.apache.spark.{SparkConf, SparkContext}

    object HBaseWrite {
      def main(args: Array[String]): Unit = {
        val sparkConf = new SparkConf()
          .setAppName("HBaseWrite")
          .setMaster("local")
          .set("spark.driver.allowMultipleContexts", "true")
          .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        // Register Kryo classes before creating the SparkContext;
        // registering them afterwards has no effect.
        sparkConf.registerKryoClasses(Array(classOf[org.apache.hadoop.hbase.client.Result]))
        val sc = new SparkContext(sparkConf)

        val conf = HBaseConfiguration.create()
        val outputTable = "tablename"

        System.setProperty("user.name", "hdfs")
        System.setProperty("HADOOP_USER_NAME", "hdfs")
        conf.set("hbase.master", "localhost:60000")
        conf.setInt("timeout", 120000)
        conf.set("hbase.zookeeper.quorum", "localhost")
        conf.set("zookeeper.znode.parent", "/hbase-unsecure")
        conf.setInt("hbase.client.scanner.caching", 10000)

        // The old mapred API is used because saveAsHadoopDataset expects a JobConf.
        val jobConfig: JobConf = new JobConf(conf, this.getClass)
        jobConfig.setOutputFormat(classOf[TableOutputFormat])
        jobConfig.set(TableOutputFormat.OUTPUT_TABLE, outputTable)

        val x = 12
        val y = 15
        val z = 25
        val newarray = Array(x, y, z)
        val newrddtohbase = sc.parallelize(newarray)
        val convertFunc = convert _
        new PairRDDFunctions(newrddtohbase.map(convertFunc)).saveAsHadoopDataset(jobConfig)
        sc.stop()
      }

      // Defined on the object, not inside main, so Spark serializes only the
      // function itself and not the enclosing closure.
      def convert(a: Int): (ImmutableBytesWritable, Put) = {
        val p = new Put(Bytes.toBytes(a))   // the Put carries the row key
        p.add(Bytes.toBytes("columnfamily"), Bytes.toBytes("col_1"), Bytes.toBytes(a))
        (new ImmutableBytesWritable(a.toString.getBytes()), p)
      }
    }
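
For reference, a build definition along these lines pulls in everything the example uses; the versions below are illustrative assumptions, not from the original answer (the old mapred TableOutputFormat ships in hbase-server):

// build.sbt -- versions are assumptions, adjust to your cluster
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"   % "1.6.3" % "provided",
  "org.apache.hbase" %  "hbase-client" % "1.2.6",
  "org.apache.hbase" %  "hbase-server" % "1.2.6"
)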

P.S.: The code is not tested, but it should work!
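
If you want to verify the write, one option (not part of the original answer) is to read the table back with the newer mapreduce TableInputFormat before sc.stop() is called:

import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.mapreduce.TableInputFormat

val readConf = HBaseConfiguration.create()
readConf.set(TableInputFormat.INPUT_TABLE, outputTable)
val rows = sc.newAPIHadoopRDD(readConf, classOf[TableInputFormat],
  classOf[ImmutableBytesWritable], classOf[Result])
println(s"rows in $outputTable: ${rows.count()}")   // expect 3 for Array(12, 15, 25)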
