pretty printing numpy ndarrays using unicode characters

Submitted by 心已入冬 on 2019-11-27 06:16:06

Question


I have recently noticed that Python's printing of NumPy ndarrays is not consistent. For example, it prints a horizontal 1D array horizontally:

import numpy as np
A1=np.array([1,2,3])
print(A1)
#--> [1 2 3]

but what I consider a horizontal 1D array with redundant brackets is printed vertically:

A2=np.array([[1],[2],[3]])
print(A2)
#--> [[1]
#     [2]
#     [3]]

what I consider a vertical 1D array is printed horizontally:

A3=np.array([[1,2,3]])
print(A3)
#--> [[1 2 3]]

and a 2D array:

B=np.array([[11,12,13],[21,22,23],[31,32,33]])
print(B)
# --> [[11 12 13]
#      [21 22 23]
#      [31 32 33]]

where the first dimension is now vertical. It gets even worse for higher dimensions as all of them are printed vertically:

C=np.array([[[111,112],[121,122]],[[211,212],[221,222]]])
print(C)
#--> [[[111 112]
#      [121 122]]
#
#     [[211 212]
#      [221 222]]]

In my opinion, consistent behavior would be to print the even dimensions horizontally and the odd ones vertically. With Unicode characters it would be possible to format this nicely. I was wondering whether it is possible to write a function that prints the above arrays as:

A1 --> [1 2 3]
A2 --> ┌┌─┐┌─┐┌─┐┐
       │ 1  2  3 │
       └└─┘└─┘└─┘┘
A3 --> ┌┌─┐┐ # \u250c\u2500\u2510 
       │ 1 │ # \u2502
       │ 2 │
       │ 3 │
       └└─┘┘ # \u2514\u2500\u2518 
B -->  ┌┌──┐┌──┐┌──┐┐ 
       │ 11  21  31 │
       │ 12  22  32 │
       │ 13  23  33 │
       └└──┘└──┘└──┘┘ 

C -->  ┌┌─────────┐┌─────────┐┐
       │ [111 112]  [211 212] │
       │ [121 122]  [221 222] │
       └└─────────┘└─────────┘┘ 

I found this gist which takes care of the different number of digits. I tried to prototype a recursive function to implement the above concept:

def npprint(A):
    assert isinstance(A, np.ndarray), "input of npprint must be array like"
    if A.ndim==1 :
        print(A)
    else:
        for i in range(A.shape[1]):
            npprint(A[:,i])

It kind of works for A1, A2, A3 and B, but not for C. I would appreciate help with how npprint should be written to produce the above output for NumPy ndarrays of arbitrary dimension.

P.S.1. In a Jupyter environment one can use the LaTeX mathtools commands \underbracket and \overbracket in Markdown. Sympy's pretty printing functionality is also a great starting point. It can use ASCII, Unicode, LaTeX...

P.S.2. I have been told that there is indeed a consistency in the way ndarrays are printed. However, IMHO it is kind of weird and non-intuitive. A flexible pretty printing function could help a lot to display ndarrays in different forms.

P.S.3. The Sympy developers have already considered both points I have mentioned here. Their Matrix module is pretty consistent (A1 and A2 are the same), and they also have a pprint function which does roughly what I expect from npprint here.

P.S.4. For those who want to follow up on this idea, I have integrated everything here in this Jupyter Notebook


Answer 1:


It was quite a revelation to me to understand that NumPy arrays are nothing like the MATLAB matrices or multidimensional mathematical arrays I had in mind. They behave more like homogeneous, uniformly nested Python lists. I also understood that the innermost pair of square brackets (the last axis of the array) is printed horizontally, the next axis out is printed vertically, and the one after that vertically with a blank line between blocks...
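To make the nesting concrete, here is a minimal sketch that inspects the text NumPy produces for the arrays from the question. The helper name layout is made up for this illustration; it only counts leading brackets and non-empty lines in str(a), and it assumes small arrays that NumPy neither wraps nor summarizes.

```python
import numpy as np

def layout(a):
    """Return (leading '[' count, number of non-empty printed lines) for str(a).

    Hypothetical helper for illustration only; assumes a small array whose
    rows fit on one line each.
    """
    text = str(a)
    lines = [ln for ln in text.splitlines() if ln.strip()]
    depth = len(text) - len(text.lstrip('['))
    return depth, len(lines)

A1 = np.array([1, 2, 3])            # shape (3,)
A2 = np.array([[1], [2], [3]])      # shape (3, 1)
A3 = np.array([[1, 2, 3]])          # shape (1, 3)
C = np.arange(8).reshape(2, 2, 2)   # shape (2, 2, 2)

for name, a in [("A1", A1), ("A2", A2), ("A3", A3), ("C", C)]:
    print(name, a.shape, layout(a))
```

For these inputs the bracket depth equals a.ndim and the number of printed lines equals the product of all axis lengths except the last one, which is the sense in which the last axis is the one laid out horizontally.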

Anyway, I think having a pprint function (inspired by Sympy's naming convention) could help a lot. So I'm going to put a very bad implementation here, hoping it will inspire more advanced Pythoners to come up with better solutions:

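def pprint(A):
    # 1D arrays are printed as-is by NumPy.
    if A.ndim == 1:
        print(A)
    else:
        # Frame width: the longest row as NumPy itself would print it.
        w = max(len(str(s)) for s in A)
        print('\u250c' + '\u2500' * w + '\u2510')
        for AA in A:
            print(' [', end='')
            # Left-align every entry except the last to its column width,
            # followed by a one-space separator.
            for i, AAA in enumerate(AA[:-1]):
                w1 = max(len(str(s)) for s in A[:, i])
                print(str(AAA).ljust(w1 + 1), end='')
            # The last column gets no trailing separator.
            w1 = max(len(str(s)) for s in A[:, -1])
            print(str(AA[-1]).ljust(w1), end='')
            print(']')
        print('\u2514' + '\u2500' * w + '\u2518')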

and the result is somewhat acceptable for 1D and 2D arrays:

B1=np.array([[111,122,133],[21,22,23],[31,32,33]])
pprint(B1)

#┌─────────────┐
# [111 122 133]
# [21  22  23 ]
# [31  32  33 ]
#└─────────────┘

This is indeed very bad code; it only works for integers. Hopefully others will come up with better solutions.

P.S.1. Eric Wieser has already implemented a very nice HTML prototype for IPython/Jupyter, which can be seen here:

You may follow the discussion on numpy mailing list here.

P.S.2. I also posted this idea here on Reddit.

P.S.3. I spent some time extending the code to 3D arrays:

def ndtotext(A, w=None, h=None):
    if A.ndim == 1:
        if w is None:
            return str(A)
        # Left-align each entry to its column width w[i]; one space between columns.
        s = '['
        for i, AA in enumerate(A[:-1]):
            s += str(AA).ljust(w[i]) + ' '
        s += str(A[-1]).ljust(w[-1]) + '] '
    elif A.ndim == 2:
        # Per-column widths, then the inner width of the frame.
        w1 = [max(len(str(s)) for s in A[:, i]) for i in range(A.shape[1])]
        w0 = sum(w1) + len(w1) + 1
        s = '\u250c' + '\u2500' * w0 + '\u2510' + '\n'
        for AA in A:
            s += ' ' + ndtotext(AA, w=w1) + '\n'
        s += '\u2514' + '\u2500' * w0 + '\u2518'
    elif A.ndim == 3:
        # Render each 2D slice, then glue the slices side by side
        # between a left and a right outer bracket.
        h = A.shape[1]
        s1 = '\u250c' + '\n' + ('\u2502' + '\n') * h + '\u2514' + '\n'
        s2 = '\u2510' + '\n' + ('\u2502' + '\n') * h + '\u2518' + '\n'
        strings = [ndtotext(a) + '\n' for a in A]
        strings.append(s2)
        strings.insert(0, s1)
        s = '\n'.join(''.join(pair) for pair in zip(*map(str.splitlines, strings)))
    else:
        raise ValueError("ndtotext only supports 1D, 2D and 3D arrays")
    return s

and as an example:

shape = 4, 3, 6
B2=np.arange(np.prod(shape)).reshape(shape)
print(B2)
print(ndtotext(B2))        


[[[ 0  1  2  3  4  5]
  [ 6  7  8  9 10 11]
  [12 13 14 15 16 17]]

 [[18 19 20 21 22 23]
  [24 25 26 27 28 29]
  [30 31 32 33 34 35]]

 [[36 37 38 39 40 41]
  [42 43 44 45 46 47]
  [48 49 50 51 52 53]]

 [[54 55 56 57 58 59]
  [60 61 62 63 64 65]
  [66 67 68 69 70 71]]]
┌┌───────────────────┐┌───────────────────┐┌───────────────────┐┌───────────────────┐┐
│ [0  1  2  3  4  5 ]  [18 19 20 21 22 23]  [36 37 38 39 40 41]  [54 55 56 57 58 59] │
│ [6  7  8  9  10 11]  [24 25 26 27 28 29]  [42 43 44 45 46 47]  [60 61 62 63 64 65] │
│ [12 13 14 15 16 17]  [30 31 32 33 34 35]  [48 49 50 51 52 53]  [66 67 68 69 70 71] │
└└───────────────────┘└───────────────────┘└───────────────────┘└───────────────────┘┘
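The ndtotext function above stops at three dimensions. As a rough sketch of how the idea from the question (alternate horizontal and vertical layout, axis by axis) might be extended to any number of dimensions, the following builds rectangular blocks of text recursively. The names frame and to_block are made up for this illustration, every level gets its own box (so the result is not identical to the mock-ups in the question), and entries are simply converted with str.

```python
import numpy as np

def frame(lines):
    """Draw a box of Unicode line characters around equal-width text lines."""
    w = max(len(ln) for ln in lines)
    top = '\u250c' + '\u2500' * w + '\u2510'
    bottom = '\u2514' + '\u2500' * w + '\u2518'
    body = ['\u2502' + ln.ljust(w) + '\u2502' for ln in lines]
    return [top] + body + [bottom]

def to_block(a, horizontal=True):
    """Return a list of equal-width strings representing array a.

    Assumption of this sketch: the outermost axis is laid out horizontally
    and the direction flips at every nesting level.
    """
    a = np.asarray(a)
    if a.ndim == 0:
        return [str(a)]            # a single scalar entry
    if a.size == 0:
        return frame([''])         # empty array: an empty box
    blocks = [to_block(sub, not horizontal) for sub in a]
    if horizontal:
        # Pad shorter blocks with blank lines, then join them side by side.
        h = max(len(b) for b in blocks)
        padded = [b + [' ' * len(b[0])] * (h - len(b)) for b in blocks]
        lines = [' '.join(row) for row in zip(*padded)]
    else:
        # Right-align all lines to a common width and stack them.
        w = max(len(b[0]) for b in blocks)
        lines = [ln.rjust(w) for b in blocks for ln in b]
    return frame(lines)

C = np.array([[[111, 112], [121, 122]], [[211, 212], [221, 222]]])
print('\n'.join(to_block(C)))
```

With this convention a 1D array is drawn horizontally, the rows of a 2D array become columns, and the innermost axis of a 3D array is horizontal again, which follows the layout asked for in the question. Passing horizontal=False flips the whole scheme.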



Answer 2:


In each of these cases, each instance of your final dimension is printed on a single line. There's nothing inconsistent here.

Try various forms of:

a = np.random.rand(5, 4, 3)
print(a)

Change the number of dimensions in a (e.g. by adding more integers separated by commas). You'll find that each time you print a, each row in the printed object will have k values, where k is the last integer in a's shape.
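A quick way to check this claim is to count the values on each printed line. The helper values_per_line is a made-up name for this sketch, and it assumes arrays small enough that NumPy neither wraps long rows nor abbreviates the output with "...".

```python
import numpy as np

def values_per_line(a):
    """Return the set of value counts found on the non-empty lines of str(a).

    Hypothetical helper for illustration; brackets are stripped and the
    remaining whitespace-separated tokens are counted.
    """
    counts = set()
    for ln in str(a).splitlines():
        tokens = ln.replace('[', ' ').replace(']', ' ').split()
        if tokens:
            counts.add(len(tokens))
    return counts

for shape in [(3,), (3, 1), (1, 3), (5, 4, 3), (2, 3, 4, 5)]:
    a = np.arange(np.prod(shape)).reshape(shape)
    print(shape, values_per_line(a))
```

For each of these shapes the set contains a single number, the last entry of the shape. Very wide rows or very large arrays would break this simple count, because NumPy then wraps lines or summarizes the output.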



Source: https://stackoverflow.com/questions/53126305/pretty-printing-numpy-ndarrays-using-unicode-characters
