Question
Suppose I have two files, file1 and file2, in the dataset directory:
val file = sc.wholeTextFiles("file:///root/data/dataset").map((x,y) => y + "," + x)
In the code above I am trying to get an RDD in which each element is a single string combining the value and the key, in the order value,key.
Suppose the files are named file1 and file2 and each contains 2 records:
file1:
1,30,ssr
2,43,svr
And
file2:
1,30,psr
2,43,pvr
The desired RDD output is:
(1,30,ssr,file1),(2,43,svr,file1),(1,30,psr,file2),(2,43,pvr,file2)
Can this be achieved? If so, please help!
Answer 1:
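// wholeTextFiles yields one (filePath, fileContent) pair per file
val files = sc.wholeTextFiles("file:///root/data/dataset")
val yourNeededRdd = files
  .flatMap {
    case (filePath, fileContent) =>
      // keep only the last path segment, e.g. "file1"
      val fileName = filePath.split('/').last
      // emit one "line,fileName" string per record
      fileContent.split("\n").map(line => line + "," + fileName)
  }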
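To check the per-file logic without a Spark cluster, the same flatMap can be sketched on an ordinary Scala collection. This is only an illustration under a few assumptions: the object name AppendFileNameSketch, the helper appendFileName and the in-memory sample pairs are hypothetical stand-ins for what sc.wholeTextFiles would return, and the trim/filter step for blank lines and Windows line endings is an addition that the answer above does not have.

```scala
object AppendFileNameSketch {
  // Hypothetical stand-in for the (filePath, fileContent) pairs
  // that sc.wholeTextFiles would produce for the two sample files.
  val files: Seq[(String, String)] = Seq(
    ("file:/root/data/dataset/file1", "1,30,ssr\n2,43,svr\n"),
    ("file:/root/data/dataset/file2", "1,30,psr\n2,43,pvr\n")
  )

  // Turns one file into one "record,fileName" string per non-empty line.
  def appendFileName(filePath: String, fileContent: String): Seq[String] = {
    // Keep only the last path segment, e.g. "file1".
    val fileName = filePath.split('/').last
    fileContent
      .split("\n")
      .map(_.trim)          // assumption: drop stray '\r' and surrounding spaces
      .filter(_.nonEmpty)   // assumption: skip blank lines
      .map(line => line + "," + fileName)
      .toSeq
  }

  // Same shape as the RDD version: flatMap over (path, content) pairs.
  val result: Seq[String] =
    files.flatMap { case (path, content) => appendFileName(path, content) }

  def main(args: Array[String]): Unit =
    result.foreach(println)
}
```

Running main prints the four records from the question, each followed by its file name (1,30,ssr,file1 and so on). On a real RDD the body of appendFileName could be passed to flatMap in the same way, since the pattern match on the pair is identical.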
Source: https://stackoverflow.com/questions/38898860/how-to-append-keys-to-values-for-key-value-pair-rdd-and-how-to-convert-it-to-a