DataFrame sample in Apache Spark | Scala

北海茫月 2020-12-05 07:20

I'm trying to draw samples from two DataFrames such that the ratio of their row counts is maintained. For example:

df1.count() = 10
df2.count() = 1000

noOfSamples = 10
         


        
7 answers
  •  陌清茗 (original poster)
    2020-12-05 07:57

    To answer whether the fraction can be greater than 1: yes, it can, as long as sampling is done with replacement (withReplacement = true). If a value greater than 1 is passed with withReplacement = false, the following exception is thrown:

    java.lang.IllegalArgumentException: requirement failed: Upper bound (2.0) must be <= 1.0.
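
The behaviour described above can be checked with a small sketch like the one below. It assumes a local SparkSession and a made-up 1000-row DataFrame; the object name `SampleFractionDemo`, the helper `samplingFails`, and the seed `42L` are illustrative and not part of the original question. The exact exception message, and whether it is raised when the plan is built or when an action runs, may differ between Spark versions, so the sketch only checks whether the call fails.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import scala.util.Try

object SampleFractionDemo {
  // Hypothetical helper: returns true if sampling `df` with these settings fails.
  // count() is inside the Try because, depending on the Spark version, the
  // fraction check may fire when sample() is called or only when an action runs.
  def samplingFails(df: DataFrame, withReplacement: Boolean, fraction: Double): Boolean =
    Try(df.sample(withReplacement, fraction, 42L).count()).isFailure

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for this demonstration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("sample-fraction-demo")
      .getOrCreate()

    // Made-up data: 1000 rows with a single "id" column.
    val df = spark.range(1000).toDF("id")

    // With replacement, a fraction above 1.0 is accepted.
    println(samplingFails(df, withReplacement = true, fraction = 2.0))

    // Without replacement, a fraction above 1.0 is rejected.
    println(samplingFails(df, withReplacement = false, fraction = 2.0))

    spark.stop()
  }
}
```

Note that even when the call succeeds, `fraction` is an expected proportion rather than an exact one, so the number of rows returned is only approximately `fraction * df.count()`.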
    
