I have a CSV file with the following data:
1,2,5
2,4
2,3
I want to load it into a DataFrame whose schema is a single column of type array of strings.
The outpu
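Below is sample code in Java. Because the rows have different numbers of fields, read the file as plain text with the spark.read().text(String path) method, which loads each line as one string in a column named value, and then call the split function on that column.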
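import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.split;

public class SparkSample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession
                .builder()
                .appName("SparkSample")
                .master("local[*]")
                .getOrCreate();

        // Read the file as plain text: one row per line, in a single string column
        Dataset<Row> ds = spark.read().text("c://tmp//sample.csv").toDF("value");
        ds.show(false);

        // Split each line on commas; the result is one array-of-strings column
        Dataset<Row> ds1 = ds.select(split(ds.col("value"), ",")).toDF("new_value");
        ds1.show(false);
        ds1.printSchema();

        spark.stop();
    }
}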
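
To see what the split step does to each line without starting Spark, here is a minimal plain-Java sketch. It only mimics the per-row behaviour on the sample data from the question. The class name SplitSketch and the method splitLines are made up for this illustration and are not part of the Spark API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: plain Java, no Spark involved.
// SplitSketch and splitLines are hypothetical names chosen for this example.
class SplitSketch {

    // Splits every input line on commas, giving one list of strings per line.
    static List<List<String>> splitLines(List<String> lines) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : lines) {
            // Limit -1 keeps trailing empty fields; it makes no difference for the sample data.
            rows.add(Arrays.asList(line.split(",", -1)));
        }
        return rows;
    }

    public static void main(String[] args) {
        // The three lines from the question's file
        List<String> lines = Arrays.asList("1,2,5", "2,4", "2,3");
        for (List<String> row : splitLines(lines)) {
            System.out.println(row);
        }
    }
}
```

Running main prints [1, 2, 5], [2, 4] and [2, 3], one per line. Each row keeps its own length, which is why an array column suits this file better than a fixed set of CSV columns.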