Question
With Python, I am using genfromtxt (from numpy) to read a text file into an array:
y = np.genfromtxt("1400list.txt", dtype=[('mystring','S20'),('myfloat','float')])
Which works okay, except it doesn't seem to read my 2 columns into a 2D array. I am getting:
[('string001', 123.0),('string002', 456.0),('string002', 789.0)]
But I think I would like:
[['string001', 123.0],['string002', 456.0],['string002', 789.0]]
I basically want each piece of information as a separate element that I can then manipulate.
Answer 1:
What genfromtxt returns is called a structured array. It gives a 1d array of tuples, where each tuple has the dtype that you specified.
These are actually very useful once you learn how to use them. A regular 2d array has a single dtype for all of its elements, so it cannot hold floats and strings side by side, but a structured array can!
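To illustrate why a plain 2d array is not an option here, the following minimal sketch builds one from the sample values in the question. The variable name `mixed` is just for illustration. NumPy picks one common dtype, so the floats end up as strings:

```python
import numpy as np

# A regular ndarray has one dtype for every element, so mixing
# strings and floats coerces the floats to strings.
mixed = np.array([['string001', 123.0], ['string002', 456.0]])

print(mixed.shape)        # a genuine 2d array
print(mixed.dtype.kind)   # 'U' means unicode string
print(repr(mixed[0, 1]))  # the float has become the string '123.0'
```

Any arithmetic on the second column would then need an explicit conversion back to float, which is what the structured array avoids.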
For example:
import numpy as np
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO
s = """string001 123
string002 456
string002 789"""
f = StringIO(s)
y = np.genfromtxt(f, dtype=[('mystring', 'S20'), ('myfloat', float)])
This is what you have so far. Now you can access y in the following ways. You can use a field name to get a column as a 1d array:
>>> y['mystring']
array(['string001', 'string002', 'string002'],
dtype='|S20')
>>> y['myfloat']
array([ 123., 456., 789.])
Note that y['myfloat'] gives floats because of the dtype argument, even though in the file they are ints.
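Since each field behaves like an ordinary 1d array, the usual NumPy operations apply to it. The sketch below rebuilds the same structured array directly with `np.array` (an assumption made to keep the example self-contained, rather than reading a file) and shows a few manipulations. The names `doubled` and `big` are illustrative. On Python 3 the 'S20' field holds bytes, hence the `b'...'` literals.

```python
import numpy as np

# Rebuild a structured array like the one above (values from the question).
y = np.array(
    [(b'string001', 123.0), (b'string002', 456.0), (b'string002', 789.0)],
    dtype=[('mystring', 'S20'), ('myfloat', float)],
)

# Arithmetic on a numeric field works like on any 1d float array.
doubled = y['myfloat'] * 2

# A boolean mask built from one field selects whole rows.
big = y[y['myfloat'] > 200]

# Assigning to a field modifies the structured array in place.
y['myfloat'] += 1
```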
You can also use an integer to get a row as a tuple with the given dtype:
>>> y[1]
('string002', 456.0)
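If you really do want the nested-list form from the question, one possible approach is to convert the structured array after reading it. This is a sketch, again assuming the array is rebuilt inline instead of read from the file. `tolist()` returns a list of tuples, and wrapping each one in `list()` makes the rows mutable:

```python
import numpy as np

y = np.array(
    [(b'string001', 123.0), (b'string002', 456.0), (b'string002', 789.0)],
    dtype=[('mystring', 'S20'), ('myfloat', float)],
)

# tolist() yields a list of tuples with plain Python values;
# list(row) turns each tuple into a mutable list.
rows = [list(row) for row in y.tolist()]

# Each piece of information is now a separate, editable element.
rows[0][1] += 1.0
```

The trade-off is that `rows` is a plain Python list, so the vectorized field operations are no longer available.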
If you are doing a lot of manipulation of data structures like this, you might want to look into pandas.
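As a rough sketch of what that could look like, assuming the same whitespace-separated data with no header row (the column names are carried over from the dtype above):

```python
from io import StringIO

import pandas as pd

s = """string001 123
string002 456
string002 789"""

# sep=r"\s+" splits on runs of whitespace; header=None because the
# data has no header row, so the column names are supplied explicitly.
df = pd.read_csv(
    StringIO(s),
    sep=r"\s+",
    header=None,
    names=["mystring", "myfloat"],
    dtype={"myfloat": float},
)

# Each column keeps its own type, and rows convert to nested lists.
rows = df.values.tolist()
```

Here `df["myfloat"]` is a float column that supports vectorized arithmetic, while `rows` gives the list-of-lists layout asked for in the question.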
Source: https://stackoverflow.com/questions/12291665/using-genfromtxt-to-split-data