How to append keys to values in a {Key,Value} pair RDD and convert the result to an RDD? [duplicate]

Submitted by 青春壹個敷衍的年華 on 2019-12-08 11:42:20

Question


Suppose I have 2 files, file1 and file2, in the dataset directory:

val file = sc.wholeTextFiles("file:///root/data/dataset").map { case (x, y) => y + "," + x }

In the code above I am trying to get an RDD in which each element is a single string of the form value,key.

Suppose each file contains 2 records:

file1:

1,30,ssr

2,43,svr

And

file2:

1,30,psr

2,43,pvr

The desired RDD output is:

(1,30,ssr,file1),(2,43,svr,file1),(1,30,psr,file2),(2,43,pvr,file2)

Can this be achieved? If so, please help!


Answer 1:


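// wholeTextFiles yields one (filePath, fileContent) pair per file
val files = sc.wholeTextFiles("file:///root/data/dataset")

val yourNeededRdd = files
  .flatMap {
    case (filePath, fileContent) =>
      // keep only the file name, i.e. the last path segment
      val fileName = filePath.split('/').last
      // emit one "record,fileName" string per line of the file
      fileContent.split("\n").map(line => line + "," + fileName)
  }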
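
To check the per-file logic without a cluster, the same idea can be sketched on plain Scala collections. This is only an illustration: the helper name tagLines, the sample paths, and the extra handling of blank lines and Windows line endings are assumptions that are not part of the original answer.

```scala
// Hypothetical helper (not from the original answer): turns one
// (filePath, fileContent) pair into "record,fileName" strings.
def tagLines(filePath: String, fileContent: String): Seq[String] = {
  // The file name is the last segment of the path.
  val fileName = filePath.split('/').last
  fileContent
    .split("\\r?\\n")        // accept both \n and \r\n line endings
    .map(_.trim)
    .filter(_.nonEmpty)      // drop blank lines, e.g. after a trailing newline
    .map(line => s"$line,$fileName")
    .toSeq
}

// Local stand-in for the pairs an RDD would hold; the paths are assumed,
// the records are the sample data from the question.
val sample = Seq(
  "file:/root/data/dataset/file1" -> "1,30,ssr\n2,43,svr\n",
  "file:/root/data/dataset/file2" -> "1,30,psr\n2,43,pvr\n"
)

val tagged = sample.flatMap { case (path, content) => tagLines(path, content) }
tagged.foreach(println)
```

On a real RDD the helper would presumably be used the same way, for example sc.wholeTextFiles("file:///root/data/dataset").flatMap { case (p, c) => tagLines(p, c) }, which gives an RDD[String] with one element per record.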


Source: https://stackoverflow.com/questions/38898860/how-to-append-keys-to-values-for-key-value-pair-rdd-and-how-to-convert-it-to-a
