NumPy recarray writes byte literal tags to my CSV file?

Submitted by 二次信任 on 2019-12-13 08:50:04

Question


I used the following test code:

import numpy as np
import csv

data = np.zeros((3,),dtype=("S24,int,float"))
with open("testtest.csv", 'w', newline='') as f:
    writer = csv.writer(f,delimiter=',')
    for row in data:
        writer.writerow(row)

The data in the resulting CSV file has b'' tags (byte literal tags) around the string components of the record array. What is the proper way to write these record arrays to CSV, and what is the best way to avoid byte literal tags in the file?


Answer 1:


I think you are working with Python 3, which uses Unicode as the default string type. Byte strings are then displayed with a special b'' marking.
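As a minimal sketch of where the marker comes from: csv.writer converts non-string values with str(), and str() of a bytes object in Python 3 returns its repr, b'' prefix included. The in-memory buffer below is only for illustration and stands in for the file.

```python
import csv
import io

# str() of a bytes object is its repr, so the b'' marker is part of the text
as_text = str(b'xxx')
decoded = b'xxx'.decode()   # decoding yields a plain str without the marker

# csv.writer applies str() to every non-string field before writing it
buf = io.StringIO()
csv.writer(buf).writerow([b'xxx', 0, 0.0])
line = buf.getvalue()

print(as_text)
print(decoded)
print(repr(line))
```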

If I generate the data with unicode instead of bytes, this works:

In [654]: data1 = np.zeros((3,),dtype=("U24,int,float"))
In [655]: data1['f0']='xxx'  # more interesting string field
In [656]: with open('test.csv','w') as f:
    writer=csv.writer(f,delimiter=',')
    for row in data1:
        writer.writerow(row)
In [658]: cat test.csv
xxx,0,0.0
xxx,0,0.0
xxx,0,0.0

np.savetxt does the same thing:

In [668]: np.savetxt('test.csv',data1,fmt='%s',delimiter=',')
In [669]: cat test.csv
xxx,0,0.0
xxx,0,0.0
xxx,0,0.0

The question is whether this can be worked around while keeping the S24 field, for example by opening the file in 'wb' mode.

I explored this issue earlier in "Trying to strip b' ' from my Numpy array": https://stackoverflow.com/a/27513196/901925

It looks like the solutions are either to decode the byte field or to write to a byte file directly. Since your array has a mix of string and numeric fields, the decode solution is a bit more tedious. The whole array can be converted with astype:

data1 = data.astype('U24,i,f') # convert bytestring field to unicode
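The astype call above spells out every field by hand, and 'i' and 'f' are the 32-bit integer and float types, which may be narrower than the original fields. A possible generalization, sketched here under the assumption that the byte fields hold ASCII text (other bytes may raise a decoding error), is a helper that swaps only the 'S' fields for 'U' fields of the same length and leaves the rest untouched. The name bytes_fields_to_unicode is hypothetical, not a NumPy function.

```python
import numpy as np

def bytes_fields_to_unicode(arr):
    """Return a copy of a structured array with every 'S' field cast to 'U'.

    Hypothetical helper for illustration; assumes the byte fields are ASCII.
    """
    new_descr = []
    for name in arr.dtype.names:
        field_dtype = arr.dtype[name]
        if field_dtype.kind == 'S':
            # An S<n> field holds up to n bytes, so U<n> is wide enough
            field_dtype = np.dtype('U%d' % field_dtype.itemsize)
        new_descr.append((name, field_dtype))

    out = np.empty(arr.shape, dtype=new_descr)
    for name in arr.dtype.names:
        # Cast field by field; numeric fields keep their original dtype
        out[name] = arr[name].astype(out.dtype[name])
    return out

data = np.zeros((3,), dtype="S24,i8,f8")
data['f0'] = b'xxx'
data1 = bytes_fields_to_unicode(data)
print(data1.dtype)
print(data1[0])
```

The converted array can then be passed to csv.writer or np.savetxt in the same way as an array created with a U24 field.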

A helper function can be used to decode byte strings on the fly:

In [147]: fn = lambda row: [j.decode() if isinstance(j,bytes) else j for j in row]
In [148]: with open('test.csv','w') as f:
    writer=csv.writer(f,delimiter=',')
    for row in data:
        writer.writerow(fn(row))
In [149]: cat test.csv
xxx,0,0.0
yyy,0,0.0
zzz,0,0.0
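The other option mentioned above, writing to a byte file directly, could look roughly like the following. This is a sketch rather than a tested recipe for every case: it bypasses the csv module, so there is no quoting or escaping, and it relies on bytes %-formatting, which needs Python 3.5 or later. The temporary path and the '%g' float format are arbitrary choices for the example.

```python
import os
import tempfile
import numpy as np

data = np.zeros((3,), dtype="S24,i8,f8")
data['f0'] = [b'xxx', b'yyy', b'zzz']

# Temporary location so the example does not overwrite anything
path = os.path.join(tempfile.mkdtemp(), 'test_bytes.csv')

with open(path, 'wb') as f:
    # tolist() yields plain Python tuples of (bytes, int, float)
    for row in data.tolist():
        # Format the row as bytes, so no str() conversion and no b'' marker
        f.write(b'%s,%d,%g\n' % row)

with open(path, 'rb') as f:
    content = f.read()
print(content.decode())
```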



Answer 2:


Do you need all three of those dtypes? Consider using numpy.savetxt() on a NumPy array of floats or integers.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html

import numpy as np

data = np.zeros((3,3))
filename = 'foo'
np.savetxt(filename + ".csv", data, fmt='%1.6e', delimiter=",")
# fmt='%1.6e' controls how the numbers are written to the text file.
# E.g. use fmt='%d' for integers
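If the string column does have to stay, np.savetxt can also take one format per field of a structured array. The sketch below assumes the string field is stored as Unicode (U24), because a bytes field formatted with '%s' would still show the b'' marker. It writes to an in-memory buffer purely for illustration.

```python
import io
import numpy as np

data1 = np.zeros((3,), dtype="U24,i8,f8")
data1['f0'] = 'xxx'

buf = io.StringIO()
# One format per field: string, integer, float in scientific notation
np.savetxt(buf, data1, fmt=['%s', '%d', '%1.6e'], delimiter=',')
text = buf.getvalue()
print(text)
```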


Source: https://stackoverflow.com/questions/32660815/numpy-recarray-writes-byte-literals-tags-to-my-csv-file
