I plan to:
data using pyarrow (new to it). The idea is to get better performance and memory utilisation (a