Question
I have a simple thing to do, read some vectors and write them in a file.
The vectors are 1024 dimensional.
for emb in src:
    print(len(emb[0].detach().cpu().numpy()))  # --> prints 1024!
    f.write(np.array2string(emb[0].detach().cpu().numpy(), separator=', ') + " \n")
My file looks like this:
[-0.18077464, -0.02889516, 0.33970496, ..., -0.28685367, 0.00343359,
-0.00380083]
[-0.04554039, 0.0891239 , 0.0457519 , ..., -0.02622034, 0.04410202,
-0.03626832]
[ 0.2415923 , 0.36748591, -0.10974079, ..., 0.06169772, 0.0134424 ,
0.01647076]
[ 0.019123 , 0.00409475, 0.03623311, ..., -0.13063622, 0.02434589,
0.00400023]
[ 0.01281842, 0.00028924, 0.03185712, ..., -0.062907 , 0.02143336,
-0.00206215]
[ 0.01748654, 0.00136842, -0.01337154, ..., -0.04148545, 0.00875527,
-0.03006736]
So I just can't access my vectors: each 1024-dimensional vector is written as only 6 or 7 values with "..." in between :(
How can I write vectors to my file correctly?
Cheers :)
Answer 1:
The normal way of writing a 2d array to a text file (so it can be read back) is with np.savetxt:
In [309]: src = np.random.rand(6,4)
In [310]: src
Out[310]:
array([[0.78756364, 0.11385762, 0.16631052, 0.10987765],
[0.59954504, 0.80417064, 0.22461205, 0.47827772],
[0.10993457, 0.11650874, 0.55887911, 0.71854456],
[0.53572426, 0.55055622, 0.25423811, 0.46038837],
[0.05418115, 0.50696182, 0.31515915, 0.65310375],
[0.81168653, 0.81063907, 0.95371101, 0.11875685]])
Write:
In [311]: np.savetxt('test.txt', src, fmt='%10.6f',delimiter=',')
In [312]: cat test.txt
0.787564, 0.113858, 0.166311, 0.109878
0.599545, 0.804171, 0.224612, 0.478278
0.109935, 0.116509, 0.558879, 0.718545
0.535724, 0.550556, 0.254238, 0.460388
0.054181, 0.506962, 0.315159, 0.653104
0.811687, 0.810639, 0.953711, 0.118757
Test the loading:
In [314]: np.genfromtxt('test.txt',delimiter=',')
Out[314]:
array([[0.787564, 0.113858, 0.166311, 0.109878],
[0.599545, 0.804171, 0.224612, 0.478278],
[0.109935, 0.116509, 0.558879, 0.718545],
[0.535724, 0.550556, 0.254238, 0.460388],
[0.054181, 0.506962, 0.315159, 0.653104],
[0.811687, 0.810639, 0.953711, 0.118757]])
savetxt does a formatted write, row by row, roughly like:
In [315]: fmt = ','.join(['%10.6f']*4)
In [316]: fmt
Out[316]: '%10.6f,%10.6f,%10.6f,%10.6f'
In [317]: for row in src:
     ...:     print(fmt%tuple(row))  # f.write(...)
     ...:
0.787564, 0.113858, 0.166311, 0.109878
0.599545, 0.804171, 0.224612, 0.478278
0.109935, 0.116509, 0.558879, 0.718545
0.535724, 0.550556, 0.254238, 0.460388
0.054181, 0.506962, 0.315159, 0.653104
0.811687, 0.810639, 0.953711, 0.118757
In fact I can wrap that in file write:
In [318]: with open('test1.txt','w') as f:
     ...:     for row in src:
     ...:         print(fmt%tuple(row), file=f)
     ...:
In [319]: cat test1.txt
0.787564, 0.113858, 0.166311, 0.109878
0.599545, 0.804171, 0.224612, 0.478278
...
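Applied to the 1024-dimensional case from the question, the same idea might look like the sketch below. The random vectors are only a stand-in for the real embeddings (with PyTorch tensors each row would come from emb[0].detach().cpu().numpy()), and the file name embeddings.txt and the '%.8e' format are assumptions chosen for illustration.

```python
import os
import tempfile

import numpy as np

# Stand-in for the embeddings: 6 vectors of 1024 float32 values.
# With PyTorch tensors, each row would be emb[0].detach().cpu().numpy().
rng = np.random.default_rng(0)
vectors = [rng.standard_normal(1024).astype(np.float32) for _ in range(6)]

# Stack into one (6, 1024) array so savetxt writes one vector per line.
stacked = np.stack(vectors)

# Hypothetical output path in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "embeddings.txt")
np.savetxt(path, stacked, fmt="%.8e", delimiter=",")

# Read the file back and check that no values were dropped.
loaded = np.loadtxt(path, delimiter=",")
print(loaded.shape)
```

Because savetxt formats every element explicitly, the print threshold that causes the "..." in the question never comes into play.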
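Answer 2:
The vectors are still 1024-dimensional; the string only shows a summarized view, because NumPy abbreviates arrays with more than 1000 elements by default.
You can get the whole array by raising the threshold in the print options:
import sys
import numpy as np
# threshold=np.nan worked in old NumPy versions, but recent ones reject it with a ValueError
np.set_printoptions(threshold=sys.maxsize)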
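If changing the global print options is undesirable, np.array2string also accepts threshold and max_line_width arguments for a single call. The following is a minimal sketch under that approach; the linspace vector is an assumed stand-in for one embedding, and the parsing step is only there to confirm that all values survive.

```python
import sys

import numpy as np

# Stand-in for one 1024-dimensional embedding.
vec = np.linspace(-1.0, 1.0, 1024, dtype=np.float32)

# Default call: arrays with more than 1000 elements are summarized with "...".
short = np.array2string(vec, separator=", ")

# threshold=sys.maxsize disables summarization for this call only;
# max_line_width=sys.maxsize keeps the whole vector on a single line.
full = np.array2string(
    vec, separator=", ", threshold=sys.maxsize, max_line_width=sys.maxsize
)

# Parse the line back to confirm all 1024 values are present.
parsed = np.array([float(x) for x in full.strip("[]").split(",")])
print("..." in short, "..." in full, parsed.size)
```

Writing full + "\n" in the loop from the question would then give one complete vector per line, although a text format from np.savetxt is usually easier to read back.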
Source: https://stackoverflow.com/questions/53453828/numpy-array2string-just-writing-in-string