Question
I have this code, which is supposed to read a single column of data from a Parquet file stored on S3:
import pyarrow.parquet as pq
import s3fs
fs = s3fs.S3FileSystem()
data_set = pq.ParquetDataset(f"s3://{bucket}/{key}", filesystem=fs)
column_data = data_set.read(columns=[col_name])
and I get this exception:
  validate_schemas
    self.schema = self.pieces[0].get_metadata(open_file).schema
IndexError: list index out of range
I upgraded to the latest version of pyarrow, but it did not help.
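The IndexError is raised on self.pieces[0], which suggests that the dataset found no files under the given path. One way to check this is to list what the filesystem actually sees at that location. The following is a minimal sketch, not a fix: list_parquet_files is a hypothetical helper name, it assumes fs is an fsspec-style filesystem (such as s3fs.S3FileSystem) that provides find(path), and it assumes the data files carry a .parquet suffix.

```python
def list_parquet_files(fs, path):
    """Return the paths under `path` that end in ".parquet", as seen by `fs`.

    Hypothetical helper for diagnosis only. `fs` is assumed to be an
    fsspec-style filesystem object exposing find(path), which returns
    the file paths found below `path`.
    """
    # Assumption: data files are named with a ".parquet" suffix.
    return [p for p in fs.find(path) if p.endswith(".parquet")]


# Usage against S3 (needs s3fs and valid credentials), with bucket and key
# as in the question:
#   import s3fs
#   fs = s3fs.S3FileSystem()
#   print(list_parquet_files(fs, f"{bucket}/{key}"))
```

If the returned list is empty, the path passed to ParquetDataset probably does not point at any Parquet files, which would be consistent with the empty self.pieces list in the traceback.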
Source: https://stackoverflow.com/questions/52057964/error-opening-a-parquet-file-on-amazon-s3-using-pyarrow