I need to read Parquet files from multiple paths that are not parent or child directories of one another.
For example:
dir1 --- | ------- dir1_1
Both the parquetFile method of SQLContext and the parquet method of DataFrameReader take multiple paths (parquetFile has been deprecated since Spark 1.4 in favor of read.parquet). So either of these works:
df = sqlContext.parquetFile('/dir1/dir1_2', '/dir2/dir2_1')
or
df = sqlContext.read.parquet('/dir1/dir1_2', '/dir2/dir2_1')
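
If the paths live in a list rather than being typed out by hand, they can be unpacked into the same call. The following is only a minimal sketch: read_parquet_dirs is a hypothetical helper name, not part of Spark, and it assumes the object passed in behaves like a DataFrameReader (for example sqlContext.read).

```python
def read_parquet_dirs(reader, paths):
    """Read several unrelated Parquet directories into one DataFrame.

    `reader` is assumed to be a DataFrameReader such as sqlContext.read;
    `paths` is any iterable of directory strings. This helper is
    hypothetical and only wraps the existing parquet() call.
    """
    paths = list(paths)
    if not paths:
        raise ValueError("at least one path is required")
    # parquet() takes the paths as separate positional arguments,
    # so the list is unpacked with * instead of being passed as one value.
    return reader.parquet(*paths)
```

With the directories from the question, this would be called as df = read_parquet_dirs(sqlContext.read, ['/dir1/dir1_2', '/dir2/dir2_1']).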