Spark: Difference between numPartitions in read.jdbc(..numPartitions..) and repartition(..numPartitions..)
Question

I'm confused about the behaviour of the numPartitions parameter in the following methods:

DataFrameReader.jdbc
Dataset.repartition

The official docs of DataFrameReader.jdbc say the following regarding the numPartitions parameter:

numPartitions: the number of partitions. This, along with lowerBound (inclusive), upperBound (exclusive), form partition strides for generated WHERE clause expressions used to split the column columnName evenly.

And the official docs of Dataset.repartition say:

Returns a new