How can I serialize a numpy array while preserving matrix dimensions?

青春惊慌失措 2020-12-04 13:14

numpy.array.tostring doesn't seem to preserve information about matrix dimensions (see this question), requiring the user to issue a call to numpy.array.
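To make the problem concrete, here is a minimal sketch of the shape loss described above. It uses `tobytes()`, the current name for `tostring()` (which is deprecated in recent NumPy releases); the array `a` and its values are made up for illustration.

```python
import numpy as np

# A small 2x3 example array (illustrative values only).
a = np.arange(6, dtype=np.int32).reshape(2, 3)

# The raw byte string carries the data but neither the dtype nor the shape.
raw = a.tobytes()

# Reading it back needs the dtype supplied by hand and yields a flat array.
flat = np.frombuffer(raw, dtype=np.int32)

# The original dimensions have to be restored with an explicit reshape.
restored = flat.reshape(a.shape)
```

Here `flat` is one-dimensional with 6 elements, and only the explicit `reshape` brings back the 2x3 layout.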

7 Answers
  •  盖世英雄少女心
    2020-12-04 13:44

    EDIT: As noted in the comments on the question, this solution handles "normal" numpy arrays (floats, ints, bools, ...) and not multi-type structured arrays.

    Solution for serializing a numpy array of any dimensions and data types

    As far as I know, you cannot simply serialize a numpy array of arbitrary data type and dimension directly to JSON. You can, however, store its data type, shape, and raw contents in a list and then serialize that list using JSON.

    Imports needed:

    import base64
    import json
    import numpy
    

    For encoding you could use (nparray is some numpy array of any data type and any dimensionality):

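    # tobytes() yields the raw buffer; the base64 bytes are decoded to str so json can store them
    encStr = json.dumps([str(nparray.dtype), base64.b64encode(nparray.tobytes()).decode('ascii'), nparray.shape])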
    

    After this you get a JSON dump (string) of your data, containing a list with its data type, the array's contents base64-encoded, and its shape.

    For decoding, the following does the work (encStr is the encoded JSON string, loaded from somewhere):

    # get the encoded json dump
    enc = json.loads(encStr)
    
    # build the numpy data type
    dataType = numpy.dtype(enc[0])
    
    # decode the base64 encoded numpy array data and create a new numpy array with this data & type
    dataArray = numpy.frombuffer(base64.b64decode(enc[1]), dataType)
    
    # restore the original shape; reshape returns a new array, so the result must be kept
    if len(enc) > 2:
        dataArray = dataArray.reshape(enc[2])
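The encoding and decoding steps can be wrapped into a round-trip pair. This is a minimal sketch, assuming Python 3 (where `base64.b64encode` returns bytes and `base64.decodestring` is no longer available) and plain numeric or boolean dtypes; the function names `encode_array` and `decode_array` are hypothetical and not part of numpy.

```python
import base64
import json

import numpy as np


def encode_array(arr):
    """Return a JSON string holding [dtype name, base64 data, shape]."""
    arr = np.asarray(arr)
    # tobytes() copies the data out in C order, so non-contiguous views work too.
    data = base64.b64encode(arr.tobytes()).decode("ascii")
    return json.dumps([str(arr.dtype), data, arr.shape])


def decode_array(enc_str):
    """Rebuild the array from a string produced by encode_array."""
    dtype_name, data, shape = json.loads(enc_str)
    flat = np.frombuffer(base64.b64decode(data), dtype=np.dtype(dtype_name))
    # frombuffer returns a read-only view of the bytes; copy() makes it writable.
    return flat.reshape(shape).copy()


# Round trip on an illustrative 3x4 float32 array.
original = np.arange(12, dtype=np.float32).reshape(3, 4)
restored = decode_array(encode_array(original))
```

After the round trip, `restored` has the same dtype, shape, and values as `original`. One caveat under these assumptions: `str(arr.dtype)` does not record byte order for native-endian types, so exchanging data between machines of different endianness would need extra care.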
    

    JSON dumps are efficient and portable, but relying on plain JSON alone leads to unexpected results if you want to store and load numpy arrays of any type and any dimension.
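One likely example of such an unexpected result, shown here as an assumption about what is meant: dumping `tolist()` output keeps the nesting, but the data type is not recorded. The array values below are made up for illustration.

```python
import json

import numpy as np

# An illustrative 2x2 array with a non-default dtype.
a = np.array([[1, 2], [3, 4]], dtype=np.uint8)

# Plain JSON keeps the nested structure...
plain = json.dumps(a.tolist())

# ...but on loading, numpy picks its default integer type instead of uint8.
back = np.array(json.loads(plain))
dtype_preserved = back.dtype == a.dtype
```

The shape survives through the nested lists, while `dtype_preserved` ends up `False`, which is why the data type is stored explicitly alongside the data.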

    This solution stores and loads numpy arrays regardless of type or dimension and restores them correctly (data type, shape, ...).

    I tried several solutions myself months ago and this was the only efficient, versatile solution I came across.
