Python numpy: Convert string into numpy array

傲寒 2020-12-19 11:38

I have the following string that I have put together:

v1fColor = '2,4,14,5,0,0,0,0,0,0,0,0,0,0,12,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,15,6,0,0,0,0,1,0,0,0,0,0,0,0,0,0


        
4 Answers
  • 2020-12-19 12:02

    You can do this:

    lst = v1fColor.split(',')  #create a list of strings, splitting on the commas.
    v1fColor = NP.array( lst, dtype=NP.uint8 ) #numpy converts the strings.  Nifty!
    

    or more concisely:

    v1fColor = NP.array( v1fColor.split(','), dtype=NP.uint8 )
    

    Note that it is a little more customary to do:

    import numpy as np
    

    rather than import numpy as NP.

    EDIT

    Just today I learned about the function numpy.fromstring, which can also be used to solve this problem:

    NP.fromstring( "1,2,3" , sep="," , dtype=NP.uint8 )
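
    As a quick check, the two approaches above can be compared on a short string. This is only a sketch: the string below is a made-up stand-in (the full v1fColor value is not reproduced here), and it assumes a NumPy version in which text-mode numpy.fromstring (a non-empty sep) is available.

```python
import numpy as np

# Short made-up stand-in; not the full v1fColor string from the question.
text = "2,4,14,5,0,0,12,4"

# Approach 1: split on commas, then let NumPy convert the substrings.
via_split = np.array(text.split(","), dtype=np.uint8)

# Approach 2: let NumPy parse the text directly (text mode, since sep is non-empty).
via_fromstring = np.fromstring(text, dtype=np.uint8, sep=",")

print(via_split)
print(via_fromstring)
print(np.array_equal(via_split, via_fromstring))
```

    Both calls should produce the same uint8 array, so the choice between them is mostly a matter of taste.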
    
  • 2020-12-19 12:05

    You can do this without using Python string methods -- try numpy.fromstring:

    >>> numpy.fromstring(v1fColor, dtype='uint8', sep=',')
    array([ 2,  4, 14,  5,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 12,  4,  0,
            0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 15,  6,  0,  0,
            0,  0,  1,  0,  0,  0,  0,  0,  0,  0,  0,  0, 20,  9,  0,  0,  0,
            2,  2,  0,  0,  0,  0,  0,  0,  0,  0,  0, 13,  6,  0,  0,  0,  1,
            0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 10,  8,  0,  0,  0,  1,  2,
            0,  0,  0,  0,  0,  0,  0,  0,  0, 17, 17,  0,  0,  0,  3,  6,  0,
            0,  0,  0,  0,  0,  0,  0,  0,  7,  5,  0,  0,  0,  2,  0,  0,  0,
            0,  0,  0,  0,  0,  0,  0,  4,  3,  0,  0,  0,  1,  1,  0,  0,  0,
            0,  0,  0,  0,  0,  0,  6,  6,  0,  0,  0,  2,  3], dtype=uint8)
    
  • 2020-12-19 12:09

    I am writing this answer for future reference. I am not sure what the correct solution is in this case, but I think what @David Robinson initially posted was the correct answer, for one reason: cosine similarity values cannot be greater than one, yet when I use the NP.array(v1fColor.split(","), dtype=NP.uint8) option I get strange values above 1.0 for the cosine similarity between two vectors.

    So I wrote a simple sample code to try out:

    import numpy as np
    import numpy.linalg as LA
    
    def testFunction():
        value1 = '2,3,0,80,125,15,5,0,0,0,0,0,0,0,0,0,0,0,0,0,2,4,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0'
        value2 = '2,137,0,4,96,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0'
        cx = lambda a, b : round(np.inner(a, b)/(LA.norm(a)*LA.norm(b)), 3)
        #v1fColor = np.array(list(map(int, value1.split(','))))
        #v2fColor = np.array(list(map(int, value2.split(','))))
        v1fColor = np.array( value1.split(','), dtype=np.uint8 )
        v2fColor = np.array( value2.split(','), dtype=np.uint8 )
        print(v1fColor)
        print(v2fColor)
        cosineValue = cx(v1fColor, v2fColor)
        print(cosineValue)
    
    if __name__ == '__main__':
        testFunction()
    

    If you run this code, you should get the following output: [image]

    Now let's uncomment the two commented-out lines and run the code with David's initial solution:

    v1fColor = np.array(list(map(int, value1.split(','))))
    v2fColor = np.array(list(map(int, value2.split(','))))
    

    Keep in mind that, as shown above, the cosine similarity value came out above 1.0. When we instead use the map function with the int cast, we get the following value, which is the correct one:

    [image]

    Luckily, I was plotting the values I was initially getting, and some of the cosine values came out above 1.0. I took the outputs of these vectors, typed them manually into the Python console, passed them through my lambda function, and got the correct answer, so I was very confused. Then I wrote the test script to see what was going on, and I am glad I caught this issue. I am not enough of a Python expert to say exactly why the two methods give two different answers, so I leave that to either @David Robinson or @mgilson.
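
    One likely explanation, offered here as an assumption because the original output is not available: with dtype=uint8, integer arithmetic on the arrays stays in 8 bits, so products and sums silently wrap around modulo 256. np.inner of two uint8 arrays then returns a wrapped uint8 value, and the exact cosine figure this produces can differ between NumPy versions. The sketch below uses short made-up vectors (not the ones above) and a hypothetical helper named cosine to show the wraparound and one way to avoid it.

```python
import numpy as np

# Made-up short vectors for illustration; not the data from the answer above.
a8 = np.array("2,3,0,80,125".split(","), dtype=np.uint8)
b8 = np.array("2,137,0,4,96".split(","), dtype=np.uint8)

# The inner product of two uint8 arrays is itself uint8, so it wraps modulo 256.
wrapped = np.inner(a8, b8)

# Casting to a wider integer type first gives the true inner product.
exact = np.inner(a8.astype(np.int64), b8.astype(np.int64))

def cosine(u, v):
    """Hypothetical helper: cosine similarity computed in floating point."""
    u = np.asarray(u, dtype=np.float64)
    v = np.asarray(v, dtype=np.float64)
    return float(np.inner(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

print(wrapped.dtype, int(wrapped))   # uint8, and a value below 256
print(int(exact))                    # the true inner product
print(round(cosine(a8, b8), 3))      # stays within [-1, 1]
```

    Under this reading, the map(int, ...) version works because it builds an array with a wider default integer type, so the inner product no longer overflows. Passing a wider dtype to np.array (for example np.int64) should have the same effect.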

  • 2020-12-19 12:11

    You have to split the string by its commas first:

    NP.array(v1fColor.split(","), dtype=NP.uint8)
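
    A small sketch of why the split is needed, using a short made-up string in place of v1fColor. It assumes that NumPy raises an error when asked to convert the whole comma-separated string to a single integer; the exact exception type is treated as an assumption, so both ValueError and TypeError are caught.

```python
import numpy as np

text = "1,2,3"  # short made-up stand-in for v1fColor

# Passing the whole string asks NumPy to parse "1,2,3" as one integer.
try:
    np.array(text, dtype=np.uint8)
    whole_string_ok = True
except (ValueError, TypeError):
    whole_string_ok = False

# After splitting, each element is a separate numeric string that converts cleanly.
parts = text.split(",")
arr = np.array(parts, dtype=np.uint8)

print(whole_string_ok)
print(parts, arr)
```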
    