Fastest way to store a numpy array in redis

坚强是说给别人听的谎言 submitted on 2019-12-01 06:30:45

I don't know if it is fastest, but you could try something like this...

Storing a Numpy array to Redis goes like this - see function toRedis():

  • get the shape of the Numpy array and pack it into a fixed-size 8-byte header
  • append the Numpy array's raw bytes to that header
  • store the result under the supplied key

Retrieving a Numpy array goes like this - see function fromRedis():

  • retrieve from Redis the encoded string corresponding to supplied key
  • extract the shape of the Numpy array from the string
  • extract data and repopulate Numpy array, reshape to original shape

#!/usr/bin/env python3

import struct
import redis
import numpy as np

def toRedis(r,a,n):
   """Store given 2-D Numpy array 'a' in Redis under key 'n'"""
   h, w = a.shape                    # only works for 2-D arrays
   shape = struct.pack('>II',h,w)    # 8-byte header: two big-endian unsigned 32-bit ints
   encoded = shape + a.tobytes()

   # Store encoded data in Redis
   r.set(n,encoded)
   return

def fromRedis(r,n):
   """Retrieve Numpy array from Redis key 'n'"""
   encoded = r.get(n)
   h, w = struct.unpack('>II',encoded[:8])
   # dtype is not stored, so it must match the dtype that was written (uint16 here)
   a = np.frombuffer(encoded, dtype=np.uint16, offset=8).reshape(h,w)
   return a

# Create 80x80 numpy array to store
a0 = np.arange(6400,dtype=np.uint16).reshape(80,80) 

# Redis connection
r = redis.Redis(host='localhost', port=6379, db=0)

# Store array a0 in Redis under name 'a0array'
toRedis(r,a0,'a0array')

# Retrieve from Redis
a1 = fromRedis(r,'a0array')

np.testing.assert_array_equal(a0,a1)
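
To see what actually ends up in the Redis value, the byte layout can be inspected without a server. This is only an illustrative sketch that repeats the packing step from toRedis() on the same 80x80 test array; the names header and payload are mine, not part of the code above.

```python
import struct
import numpy as np

a = np.arange(6400, dtype=np.uint16).reshape(80, 80)

# Header: height and width as two big-endian unsigned 32-bit integers
header = struct.pack('>II', *a.shape)
payload = header + a.tobytes()

print(header)        # b'\x00\x00\x00P\x00\x00\x00P'  (80 == 0x50 == ord('P'))
print(len(payload))  # 12808 = 8 header bytes + 80*80*2 data bytes
```

So the stored value is just the 8-byte shape header followed by the raw array data, which is why fromRedis() reads the array starting at offset 8.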

You could add more flexibility by encoding the dtype of the Numpy array along with the shape. I didn't do that because it may be the case that you already know all your arrays are of one specific type and then the code would just be bigger and harder to read for no reason.
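
If you do want that flexibility, one possible approach is sketched below. It is not part of the code above: encode_array() and decode_array() are hypothetical helpers, and the header layout (dtype-string length, number of dimensions, dtype string, then the dimensions) is just one choice I am assuming for illustration. It is meant for simple numeric dtypes, not structured ones.

```python
import struct
import numpy as np

def encode_array(a):
    """Serialise an array of any shape and simple dtype to bytes."""
    dtype = a.dtype.str.encode('ascii')        # e.g. b'<u2' for little-endian uint16
    # Two 1-byte fields: length of the dtype string, number of dimensions
    header = struct.pack('>BB', len(dtype), a.ndim)
    # One big-endian unsigned 32-bit int per dimension
    dims = struct.pack('>%dI' % a.ndim, *a.shape)
    return header + dtype + dims + a.tobytes()

def decode_array(encoded):
    """Rebuild the array from bytes produced by encode_array()."""
    dlen, ndim = struct.unpack_from('>BB', encoded, 0)
    offset = 2
    dtype = np.dtype(encoded[offset:offset + dlen].decode('ascii'))
    offset += dlen
    shape = struct.unpack_from('>%dI' % ndim, encoded, offset)
    offset += 4 * ndim
    # frombuffer does not copy, so the returned array is read-only
    return np.frombuffer(encoded, dtype=dtype, offset=offset).reshape(shape)

x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
y = decode_array(encode_array(x))
np.testing.assert_array_equal(x, y)
```

The bytes returned by encode_array() could be passed to r.set() and the value from r.get() to decode_array(), in the same way as toRedis() and fromRedis().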

Rough benchmark on modern iMac:

80x80 Numpy array of np.uint16   => 58 microseconds to write
200x200 Numpy array of np.uint16 => 88 microseconds to write
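
If you want to take a similar measurement yourself, something along these lines could work. It is a rough sketch, not the script that produced the figures above: FakeStore is a hypothetical in-memory stand-in with a set() method so the example runs without a server, which means it only times the encoding step. Pass a real redis.Redis connection instead to include the network round trip.

```python
import struct
import timeit
import numpy as np

class FakeStore:
    """Hypothetical in-memory stand-in exposing set(), so no server is needed."""
    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

def time_write(store, a, key='bench', repeats=1000):
    """Return the mean time in seconds of one encode-and-set call."""
    def write():
        h, w = a.shape
        store.set(key, struct.pack('>II', h, w) + a.tobytes())
    return timeit.timeit(write, number=repeats) / repeats

store = FakeStore()   # assumption: replace with redis.Redis(...) to time real writes
for side in (80, 200):
    a = np.zeros((side, side), dtype=np.uint16)
    print(f'{side}x{side}: {time_write(store, a) * 1e6:.1f} microseconds per write')
```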

Keywords: Python, Numpy, Redis, array, serialise, serialize, key, incr, unique

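The tobytes() function returns the raw bytes of the array, and Redis can store those as-is because its string values are binary-safe. If you need the value to be ASCII text instead (for example, to embed it in JSON), you can use the base64 package. Note that base64 does not decrease the storage written to the redis server: it makes the value roughly a third larger.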

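import base64
import numpy as np

def encode_vector(ar):
    # b64encode replaces base64.encodestring, which was removed in Python 3.9
    return base64.b64encode(ar.tobytes()).decode('ascii')

def decode_vector(ar):
    # Accepts the base64 text as str or bytes; returns a flat, read-only uint16 array
    return np.frombuffer(base64.b64decode(ar), dtype='uint16')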
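
To check how base64 affects the size of the value, the two encodings can be compared directly. This is a small sketch using the same 80x80 uint16 test array as the first answer; the variable names are only for illustration.

```python
import base64
import numpy as np

a = np.arange(6400, dtype=np.uint16).reshape(80, 80)

raw = a.tobytes()             # what tobytes() alone would store
b64 = base64.b64encode(raw)   # base64 text of the same data

print(len(raw))   # 12800
print(len(b64))   # 17068

# Round trip: decode the base64 text and restore the original shape
restored = np.frombuffer(base64.b64decode(b64), dtype=np.uint16).reshape(a.shape)
np.testing.assert_array_equal(a, restored)
```

Base64 emits 4 output characters for every 3 input bytes, so the encoded value is about 33% larger than the raw bytes.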