using genfromtxt to split data

Submitted by 限于喜欢 on 2019-12-11 01:26:37

Question


With Python, I am using genfromtxt (from numpy) to read a text file into an array:

y = np.genfromtxt("1400list.txt", dtype=[('mystring','S20'),('myfloat','float')])

Which works okay, except it doesn't seem to read my 2 columns into a 2D array. I am getting:

[('string001', 123.0),('string002', 456.0),('string002', 789.0)]

But I think I would like:

[['string001', 123.0],['string002', 456.0],['string002', 789.0]]

I basically want each piece of information as a separate element that I can then manipulate.


Answer 1:


What genfromtxt returns is called a structured array. It is a 1d array of records that display as tuples, and each record has the dtype that you specified.

These are actually very useful once you learn how to use them. A regular 2d array has a single dtype, so it cannot hold floats and strings side by side (unless you fall back to dtype=object), but a structured array can!

For example:

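import numpy as np
from io import StringIO  # Python 3; on Python 2 this was: from StringIO import StringIO

# Sample data standing in for the text file
s = """string001 123
       string002 456
       string002 789"""
f = StringIO(s)
# 'S20' stores byte strings, so on Python 3 the values display as b'string001'
y = np.genfromtxt(f, dtype=[('mystring', 'S20'), ('myfloat', float)])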

Which is what you have so far. Now you can access y in the following fashion. You can use a field name to get a column as a 1d array:

>>> y['mystring']
array(['string001', 'string002', 'string002'], 
  dtype='|S20')

>>> y['myfloat']
array([ 123.,  456.,  789.])

Note that y['myfloat'] gives floats because of the dtype argument, even though in the file they are ints.
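Since the goal is to manipulate each piece of information separately, here is a minimal sketch of what field access makes possible. It assumes Python 3 and swaps 'S20' for 'U20' so that the strings come back as str rather than bytes; that swap is an assumption of this sketch, not part of the original code.

```python
import numpy as np
from io import StringIO

# Same sample data as above. 'U20' (unicode) is an assumption made here so
# the strings are str rather than bytes on Python 3.
f = StringIO("string001 123\nstring002 456\nstring002 789")
y = np.genfromtxt(f, dtype=[('mystring', 'U20'), ('myfloat', float)])

# Arithmetic on a numeric field works like any 1d float array.
doubled = y['myfloat'] * 2

# A boolean mask built from one field selects whole records.
big = y[y['myfloat'] > 200]

# A field is a view, so assigning through it modifies y in place.
y['myfloat'][0] = 100.0

print(doubled)
print(big['mystring'])
print(y[0])
```

The mask example is the structured-array counterpart of filtering rows in a 2d array: the condition is computed on one column, but the result keeps both fields.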

Or, you can use an integer to get a row as a tuple with the given dtype:

>>> y[1]
('string002', 456.0)
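If the nested-list layout from the question is really what is needed, it can be built from the structured array. The following is a sketch under the same assumptions as before (Python 3, 'U20' instead of 'S20' so the strings are str); the names as_tuples and as_lists are only illustrative.

```python
import numpy as np
from io import StringIO

f = StringIO("string001 123\nstring002 456\nstring002 789")
y = np.genfromtxt(f, dtype=[('mystring', 'U20'), ('myfloat', float)])

# tolist() converts each record to a plain Python tuple ...
as_tuples = y.tolist()
# ... and wrapping each tuple in list() gives the nested-list layout.
as_lists = [list(row) for row in as_tuples]

print(as_lists)
```

The result is an ordinary Python list of lists, so it no longer supports the field-name indexing shown above.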

If you are doing a lot of manipulation of data structures like this, you might want to look into pandas.
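As a rough idea of what the pandas route could look like, here is a sketch that reads the same whitespace-separated data into a DataFrame. The column names are carried over from the dtype above, and the in-memory StringIO stands in for the file; both are assumptions of this example.

```python
import pandas as pd
from io import StringIO

f = StringIO("string001 123\nstring002 456\nstring002 789")

# sep=r"\s+" splits on runs of whitespace; header=None because the data
# has no header row, so the column names are supplied explicitly.
df = pd.read_csv(
    f,
    sep=r"\s+",
    header=None,
    names=["mystring", "myfloat"],
    dtype={"myfloat": float},
)

print(df["myfloat"])         # one column
print(df.loc[1])             # one row
print(df.values.tolist())    # nested lists, as asked for in the question
```

A DataFrame keeps a separate dtype per column, much like a structured array, while also offering row and column selection by label.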



Source: https://stackoverflow.com/questions/12291665/using-genfromtxt-to-split-data
