DataFrame sample in Apache Spark | Scala

北海茫月 2020-12-05 07:20

I'm trying to draw samples from two DataFrames while keeping the ratio of their row counts. For example:

df1.count() = 10
df2.count() = 1000

noOfSamples = 10
         


        
7 answers
  •  一生所求
    2020-12-05 07:44

    Maybe you want to try the code below:

    // Randomly split `data` into two DataFrames, weighted roughly 70% / 30%.
    val splits = data.randomSplit(Array(0.7, 0.3))
    val (trainingData, testData) = (splits(0), splits(1))
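
Note that `randomSplit` divides a single DataFrame by weight. To keep the count ratio across two DataFrames, one option is to apply the same sampling fraction to both. Below is a minimal sketch in plain Scala of that arithmetic only. The names `ProportionalSample`, `fraction`, and `expectedRows` are hypothetical and do not come from the thread, and reading `noOfSamples` as the total across both DataFrames is an assumption.

```scala
// Hypothetical helper, not from the thread: one shared sampling fraction
// keeps the expected sampled counts in the same ratio as the original counts.
object ProportionalSample {

  /** Fraction of rows to keep so that about `noOfSamples` rows remain in total.
    * Assumes `noOfSamples` is the total across all DataFrames. */
  def fraction(counts: Seq[Long], noOfSamples: Int): Double = {
    val total = counts.sum
    require(total > 0, "at least one DataFrame must be non-empty")
    // A fraction above 1.0 is not meaningful without replacement, so cap it.
    math.min(1.0, noOfSamples.toDouble / total)
  }

  /** Expected number of sampled rows per DataFrame at that shared fraction. */
  def expectedRows(counts: Seq[Long], noOfSamples: Int): Seq[Double] = {
    val f = fraction(counts, noOfSamples)
    counts.map(_ * f)
  }
}
```

With the counts from the question (10 and 1000 rows, 10 samples), the shared fraction is 10/1010, about 0.0099, so on average df1 contributes about 0.1 rows and df2 about 9.9. The fraction could then be passed to Spark's `sample(withReplacement = false, fraction)` on each DataFrame. `sample` does not guarantee an exact row count, so the resulting sizes are approximate.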
    
