First question here. I'll try to be concise.
I am generating multiple arrays containing feature information for a machine learning application. As the arrays do not
If you need to save your data in a structured way, consider the HDF5 file format (http://www.hdfgroup.org/HDF5/). It is flexible, easy to use, and efficient, and other software may already support it (HDFView, Mathematica, Matlab, Origin, ...). There is a simple Python binding called h5py.
You can store datasets in a filesystem-like structure and attach attributes to each dataset, much like entries in a dictionary. For example:
import numpy as np
import h5py
# some data
table1 = np.array([(1,1), (2,2), (3,3)], dtype=[('x', float), ('y', float)])
table2 = np.ones(shape=(3,3))
# save data to file
h5file = h5py.File("test.h5", "w")
h5file.create_dataset("Table1", data=table1)
h5file.create_dataset("Table2", data=table2, compression="gzip")  # explicit form of compression=True
# add attributes
h5file["Table2"].attrs["attribute1"] = "some info"
h5file["Table2"].attrs["attribute2"] = 42
h5file.close()
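Since the question is about several arrays of different shapes, here is a rough sketch of how the filesystem-like layout could be used to keep them together in one file. The file name, the group names ("features/train") and the attribute text are made up for illustration; only the h5py calls themselves are real.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file location, chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), "features.h5")

with h5py.File(path, "w") as f:
    # Slashes in the name work like directories; missing
    # intermediate groups are created automatically.
    f.create_dataset("features/train/a", data=np.zeros((4, 2)))
    f.create_dataset("features/train/b", data=np.arange(5))
    # Groups can carry attributes, just like datasets.
    f["features"].attrs["description"] = "example feature arrays"

with h5py.File(path, "r") as f:
    group = f["features/train"]
    names = sorted(group.keys())
    shapes = {name: group[name].shape for name in names}
    description = f["features"].attrs["description"]

print(names, shapes, description)
```

Using "with" closes the file automatically, so no explicit close() call is needed.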
Reading the data is just as simple; you can even load only a few elements from a large file if you want:
h5file = h5py.File("test.h5", "r")
# read from file (numpy-like behavior)
print(h5file["Table1"]["x"][:2])
# read everything into memory (real numpy array)
print(np.array(h5file["Table2"]))
# read attributes
print(h5file["Table2"].attrs["attribute1"])
h5file.close()
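To illustrate the partial loading mentioned above, here is a small sketch under assumed names (the file "big.h5" and the dataset "big" are hypothetical). Indexing a dataset with a slice should read only the requested part from disk and return it as a regular numpy array.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and dataset names, used only for this example.
path = os.path.join(tempfile.mkdtemp(), "big.h5")

with h5py.File(path, "w") as f:
    f.create_dataset("big", data=np.arange(1000).reshape(100, 10))

with h5py.File(path, "r") as f:
    dset = f["big"]        # a handle to the dataset; no data is loaded yet
    shape = dset.shape     # metadata is available without reading the data
    first_rows = dset[:3]  # reads rows 0..2 only
    column = dset[:, 0]    # reads the first column only

print(shape, first_rows.shape, column[:3])
```

The slices are ordinary numpy arrays, so they remain usable after the file has been closed.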
More features and possibilities are described in the documentation and on the project websites (the Quick Start Guide might be of interest).