I need to read a contiguous range of files in PySpark. The following works for me:
from pyspark.sql import SQLContext
file = "events.parquet/exportDay=2015090[1-7]"
df = sqlContext.read.parquet(file)
It uses shell globbing, I believe.
The post "How to read multiple text files into a single RDD?" seems to suggest that the following should also work:
"events.parquet/exportDay=2015090[89],events.parquet/exportDay=2015091[0-4]"
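As a fallback while testing that pattern, one option is to skip glob syntax entirely and build the partition paths in plain Python, since `read.parquet` accepts multiple path arguments. This is only a sketch under the assumptions from my question (the `events.parquet/exportDay=YYYYMMDD` layout, and an existing `sqlContext`); the Spark call itself is left commented out:

```python
# Build the paths for days 2015-09-08 through 2015-09-14 explicitly,
# matching the ranges 2015090[89] and 2015091[0-4] from the pattern above.
base = "events.parquet/exportDay="
paths = ["%s201509%02d" % (base, day) for day in range(8, 15)]

print(paths[0])   # events.parquet/exportDay=20150908
print(paths[-1])  # events.parquet/exportDay=20150914

# With a live SQLContext, all partitions would then be read into one DataFrame:
# df = sqlContext.read.parquet(*paths)
```

Hadoop-style globs also appear to support brace alternation (e.g. `{a,b}`), so a single pattern like `"events.parquet/exportDay=201509{0[89],1[0-4]}"` may work too, though I have not confirmed that.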