Saving in a file an array or DataFrame together with other information


误落风尘 2020-12-23 00:09

The statistical software Stata allows short text snippets to be saved within a dataset. This is accomplished using notes and/or characteristics.

This is a fea

6 Answers

暖寄归人 (OP)

2020-12-23 00:56

jpp's answer is pretty comprehensive; I just wanted to mention that as of pandas v0.22, Parquet is a very convenient and fast option with almost no drawbacks vs. CSV (except perhaps the coffee break).

read parquet: pandas.read_parquet

write parquet: DataFrame.to_parquet

At the time of writing, you'll also need to run:

pip install pyarrow

In terms of adding information, you have the metadata, which is attached to the data:

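import numpy as np
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

# Example data: 1000 rows, 10 columns of normally distributed values
df = pd.DataFrame(np.random.normal(size=(1000, 10)))

# Convert the DataFrame to an Arrow table
tab = pa.Table.from_pandas(df)

# Attach custom key/value metadata (this replaces any existing schema metadata)
tab = tab.replace_schema_metadata({'here': 'it is'})

# Write the table, including its metadata, to a Parquet file
pq.write_table(tab, 'where_is_it.parq')

# Read it back; the metadata comes back with the table
pq.read_table('where_is_it.parq')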

which then yields a table:

Pyarrow table
0: double
1: double
2: double
3: double
4: double
5: double
6: double
7: double
8: double
9: double
__index_level_0__: int64
metadata
--------
{b'here': b'it is'}

To get this back to pandas:

tab.to_pandas()
