'b' character added when using numpy loadtxt

佐手、 submitted on 2019-12-03 00:48:53

np.loadtxt and np.genfromtxt operate in byte mode, and bytes is the default string type in Python 2. Python 3 strings are unicode, so it marks bytestrings with a b prefix.
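As a small illustration of that distinction (the variable names here are made up for the example, they are not from the question), this sketch shows how a bytes value differs from a str in Python 3, and why passing bytes through str() produces text that literally contains the b and the quotes:

```python
# A bytes object, like the ones a byte-mode reader hands back.
raw = b"    .--``--."

# Decoding turns it into an ordinary (unicode) str.
text = raw.decode("ascii")

# str() on a bytes object returns its repr, so the b and the quotes
# become part of the string itself.
mangled = str(raw)

print(repr(raw))
print(repr(text))
print(repr(mangled))
```

That last case appears to be what happens in the loadtxt call with dtype=str shown below.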

I tried some variations in a Python 3 IPython session:

In [508]: np.loadtxt('stack33655641.txt',dtype=bytes,delimiter='\n')[0]
Out[508]: b'    .--``--.'
In [509]: np.loadtxt('stack33655641.txt',dtype=str,delimiter='\n')[0]
Out[509]: "b'    .--``--.'"
...
In [511]: np.genfromtxt('stack33655641.txt',dtype=str,delimiter='\n')[0]
Out[511]: '.--``--.'
In [512]: np.genfromtxt('stack33655641.txt',dtype=None,delimiter='\n')[0]
Out[512]: b'.--``--.'
In [513]: np.genfromtxt('stack33655641.txt',dtype=bytes,delimiter='\n')[0]
Out[513]: b'.--``--.'

genfromtxt with dtype=str gives the cleanest display - except it strips blanks. I may have to use a converter to turn that off. These functions are meant to read csv data where (white)spaces are separators, not part of the data.

loadtxt and genfromtxt are overkill for simple text like this. A plain file read does nicely:

In [527]: with open('stack33655641.txt') as f:a=f.read()
In [528]: print(a)
    .--``--.
.--`        `--.
|              |
|              |
`--.        .--`
    `--..--`

In [530]: a=a.splitlines()
In [531]: a
Out[531]: 
['    .--``--.',
 '.--`        `--.',
 '|              |',
 '|              |',
 '`--.        .--`',
 '    `--..--`']

(My text editor is set to strip trailing blanks, hence the ragged lines.)
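If an array is still wanted after a plain read, the list of lines can be handed to np.array. This is only a sketch: the temporary file and the two sample lines stand in for the real file and are assumptions for the demo.

```python
import os
import tempfile

import numpy as np

# Two sample lines standing in for the real file (assumed content).
art = "    .--``--.\n.--`        `--.\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "art.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(art)

    # Plain text-mode read: Python decodes to str, so no b prefix appears.
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()

# A unicode string array; leading blanks are kept.
arr = np.array(lines)
print(arr)
```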


@DSM's suggestion:

In [556]: a=np.loadtxt('stack33655641.txt',dtype=bytes,delimiter='\n').astype(str)
In [557]: a
Out[557]: 
array(['    .--``--.', '.--`        `--.', '|              |',
       '|              |', '`--.        .--`', '    `--..--`'], 
      dtype='<U16')
In [558]: a.tolist()
Out[558]: 
['    .--``--.',
 '.--`        `--.',
 '|              |',
 '|              |',
 '`--.        .--`',
 '    `--..--`']
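The same conversion can be tried without a file. In this sketch the array contents are made up; it shows .astype(str) next to np.char.decode, which takes an explicit encoding and may be the safer choice when the data is not plain ASCII:

```python
import numpy as np

# A bytes array like the one loadtxt returns with dtype=bytes (sample data).
raw = np.array([b"    .--``--.", b"`--.        .--`"])

# astype(str) converts the bytes dtype to a unicode dtype.
as_str = raw.astype(str)

# np.char.decode does the same element-wise with a named encoding.
decoded = np.char.decode(raw, "utf-8")

print(as_str.tolist())
print(decoded.tolist())
```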

You can use np.genfromtxt('your-file', dtype='U').

Kamalesh

Python 3 works with Unicode. I had the same issue when using loadtxt with dtype='S'. With dtype='U' (a Unicode string) in either numpy.loadtxt or numpy.genfromtxt, the output comes without the b prefix.

a=numpy.loadtxt('filename',dtype={'names':('col1','col2','col3'),'formats':('U10','U10','i4')},delimiter=',')

print(a)
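A self-contained version of the same idea, for trying it without a file on disk. The in-memory sample data is invented for the demo, and it assumes a reasonably recent NumPy in which loadtxt accepts a text stream:

```python
import io

import numpy as np

# In-memory stand-in for 'filename' (made-up sample rows).
sample = io.StringIO("alpha,beta,1\ngamma,delta,2\n")

a = np.loadtxt(
    sample,
    dtype={"names": ("col1", "col2", "col3"),
           "formats": ("U10", "U10", "i4")},
    delimiter=",",
)

# The U10 columns hold ordinary str values, with no b prefix.
print(a["col1"])
print(a["col3"])
```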

This is probably not the most 'pythonic' or best solution, but definitely gets the job done using numpy.loadtxt in python3. I am aware that it is a "dirty" solution, but it works for me.

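import numpy as np

def loadstr(filename):
    # Load everything as strings; ndmin=2 keeps the result 2D
    # even when the file has a single row or column.
    dat = np.loadtxt(filename, dtype=str, ndmin=2)
    for i in range(dat.shape[0]):
        for j in range(dat.shape[1]):
            mystring = dat[i, j]
            # Strip the literal "b'" prefix and "'" suffix only when present,
            # so already-clean strings are left untouched.
            if mystring.startswith("b'") and mystring.endswith("'"):
                dat[i, j] = mystring[2:-1]

    return dat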

data = loadstr("somefile.txt")

This will import a 2D array from a text file via numpy, strip off the "b'" and "'" from the beginning and end of each string, and return a stripped string array named "data".

Are there better ways? Probably.

Does this work? Yup. I use it enough that I've got this function in my own Python module.

I had the same issue, and for me the simplest way turned out to be the csv library. You get your desired output with:

import csv

def loadFromCsv(filename):
    # Avoid shadowing the built-in name `list`
    with open(filename, 'r') as f:
        rows = [elem for elem in csv.reader(f, delimiter='\n')]
    return rows

a=loadFromCsv('tile')
print(a)