Print Parquet schema using Spark Streaming
Question

The following is an excerpt of the Scala code I wrote to read Parquet files and print the schema and the first few records of each file. However, nothing is printed.

    val batchDuration = 2
    val inputDir = "file:///home/samplefiles"
    val conf = new SparkConf().setAppName("gpParquetStreaming").setMaster("local[*]")
    val sc = new SparkContext(conf)
    sc.hadoopConfiguration.set("spark.streaming.fileStream.minRememberDuration", "600000")
    val ssc = new StreamingContext(sc, Seconds(batchDuration))