NumPy: reading a CSV file into a NumPy array

若如初见. Submitted on 2021-01-29 18:41:13

Question


I am new to Python and am using NumPy to read a CSV file into an array. I tried two methods:

Approach 1

train = np.asarray(np.genfromtxt(open("/Users/mac/train.csv","rb"),delimiter=","))

Approach 2

with open('/Users/mac/train.csv') as csvfile:
    rows = csv.reader(csvfile)
    for row in rows:
        newrow = np.array(row).astype(int)
        train.append(newrow)

What is the difference between these two approaches, and which one is recommended?

Speed is not a concern because my data is small; I am more interested in how the resulting data types differ.


Answer 1:


You can also use pandas; it is simpler to use.

import pandas as pd
import numpy as np

dataset = pd.read_csv('file.csv')
# get all headers in csv
values = list(dataset.columns.values)

# get the labels, assuming the last column holds the labels in the csv
y = dataset[values[-1:]]
y = np.array(y, dtype='float32')
X = dataset[values[0:-1]]
X = np.array(X, dtype='float32')



Answer 2:


So what is the difference in the result?

genfromtxt is the numpy csv reader. It returns an array. No need for an extra asarray.
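
A minimal sketch of what that means for the result, assuming a small, hypothetical all-numeric CSV held in memory (io.StringIO stands in for the file on disk, and the sample values are made up). It suggests that genfromtxt already returns an ndarray, that the default dtype is float64, and that wrapping the result in asarray hands back the same object:

```python
import io
import numpy as np

# Hypothetical sample data standing in for train.csv (no header row).
csv_text = "1,2,3\n4,5,6\n"

train = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

print(type(train))   # <class 'numpy.ndarray'>
print(train.dtype)   # float64, the default when dtype is not given
print(train.shape)   # (2, 3)

# asarray does not copy an existing ndarray, so the extra wrapper is a no-op.
same = np.asarray(train) is train

# Passing dtype=int asks genfromtxt for integers instead of floats.
train_int = np.genfromtxt(io.StringIO(csv_text), delimiter=",", dtype=int)
```

If integer data is what you expect, passing dtype explicitly is probably the simplest way to make the first approach match that expectation.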

The second expression is incomplete; it looks like it would produce a list of arrays, one for each line of the file. It uses the generic Python csv reader, which doesn't do much other than read a line and split it into strings.
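
Here is one way the second approach could be completed. This is a sketch only, assuming a small hypothetical all-integer CSV held in memory and assuming that train was meant to start as an empty list; it uses the built-in int as the target type. The loop yields a Python list of 1-D integer arrays, and an extra np.array call is needed to get a single 2-D array:

```python
import csv
import io
import numpy as np

# Hypothetical sample data standing in for train.csv (no header row).
csv_text = "1,2,3\n4,5,6\n"

train = []  # assumed starting point; the question never defines it
for row in csv.reader(io.StringIO(csv_text)):
    # csv.reader yields each line as a list of strings, e.g. ['1', '2', '3'].
    train.append(np.array(row).astype(int))

print(type(train))     # <class 'list'>
print(train[0].dtype)  # platform integer type, typically int64

# Rows of equal length stack into one 2-D array of shape (2, 3).
stacked = np.array(train)
```

So the practical difference in the result is the container (a list of arrays versus one array) and the element type (integers here versus floats by default with genfromtxt).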



Source: https://stackoverflow.com/questions/52252496/numpy-reading-a-csv-file-to-an-numpy-array
