I am trying to find the values stored under a list of keys that match a pattern in Redis. I tried using SCAN
so that later on I can use MGET
to get all the values. The problem is:
SCAN 0 MATCH "foo:bar:*" COUNT 1000
does not return any keys, whereas
SCAN 0 MATCH "foo:bar:*" COUNT 10000
returns the desired keys. How do I force SCAN
to look through all the existing keys? Do I have to look into Lua for this?
With the command below you scan about the first 1000 keys, starting from cursor 0:
SCAN 0 MATCH "foo:bar:*" COUNT 1000
In the result you get a new cursor, which you pass to the next call:
SCAN YOUR_NEW_CURSOR MATCH "foo:bar:*" COUNT 1000
This scans the next 1000 keys. When you increase COUNT from 1000 to 10000, each call scans more keys, so in your case it finds more matches.
To scan the entire keyspace, keep calling SCAN until the cursor returned in the response is zero (i.e. the scan is complete).
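The loop described above can be sketched in Python. This is only a minimal sketch, assuming a client object whose scan(cursor=..., match=..., count=...) method returns a (cursor, keys) tuple, as redis-py clients do; scan_all is a hypothetical helper name, not part of any library.

```python
def scan_all(client, pattern, count=1000):
    """Collect every key matching `pattern` by following the SCAN cursor.

    `client` is assumed to expose scan(cursor=, match=, count=) and to
    return (next_cursor, keys), as redis-py clients do.
    `scan_all` itself is a hypothetical helper.
    """
    keys = set()  # SCAN may return a key more than once, so deduplicate
    cursor = 0
    while True:
        cursor, batch = client.scan(cursor=cursor, match=pattern, count=count)
        keys.update(batch)  # a batch may be empty; that does not mean the end
        if cursor == 0:     # a zero cursor means the full iteration is done
            return keys
```

With a real connection this would be called as scan_all(redis_server, 'foo:bar:*'), where redis_server is whatever client you already have.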
Use the INFO command to get the number of keys; the keyspace section contains a line like
db0:keys=YOUR_AMOUNT_OF_KEYS,expires=0,avg_ttl=0
Then call
SCAN 0 MATCH "foo:bar:*" COUNT YOUR_AMOUNT_OF_KEYS
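If you want to pull that number out programmatically, the keyspace line can be parsed with a few lines of Python. This is a sketch under the assumption that the line has exactly the db0:keys=...,expires=...,avg_ttl=... shape shown above; parse_keyspace_line is a hypothetical helper. Note that COUNT is only a hint to the server, so it is still worth checking that the returned cursor is 0, and a very large COUNT makes a single call do much more work.

```python
def parse_keyspace_line(line):
    """Parse 'db0:keys=12,expires=0,avg_ttl=0' into ('db0', {'keys': 12, ...}).

    Hypothetical helper; assumes every field is an integer `name=value` pair.
    """
    db, _, fields = line.strip().partition(":")
    stats = {}
    for field in fields.split(","):
        name, _, value = field.partition("=")
        stats[name] = int(value)
    return db, stats


db, stats = parse_keyspace_line("db0:keys=1500,expires=0,avg_ttl=0")
key_count = stats["keys"]  # value to pass as COUNT in the SCAN call
```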
Just going to put this here for anyone interested in how to do it using the Python redis
library:
import redis

redis_server = redis.StrictRedis(host=settings.redis_ip, port=6379, db=0)
mid_results = []
cur, results = redis_server.scan(0, 'foo:bar:*', 1000)
mid_results += results
while cur != 0:
    cur, results = redis_server.scan(cur, 'foo:bar:*', 1000)
    mid_results += results
final_uniq_results = set(mid_results)
It took me a few days to figure this out, but basically each scan
will return a tuple.
Examples:
(cursor, results_list)
(5433L, [... keys here ...])
(3244L, [... keys here, maybe ...])
(6543L, [... keys here, duplicates maybe too ...])
(0L, [... last items here ...])
- Keep scanning with the returned cursor until it returns to 0.
- There is a guarantee it will return to 0.
- This holds even if a scan returns an empty results_list between scans.
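Since scans can hand back duplicates, it helps to deduplicate the keys before fetching their values with MGET, as the question intended. Here is a rough sketch, assuming a client whose mget(keys) returns the values in the same order as the keys (redis-py behaves this way); mget_in_batches and the batch size of 500 are my own choices for illustration, not something from the library.

```python
def mget_in_batches(client, keys, batch_size=500):
    """Fetch values for `keys` with MGET, a chunk at a time.

    `client` is assumed to expose mget(list_of_keys) returning values in
    key order. Hypothetical helper; batch_size is an arbitrary default.
    """
    keys = list(dict.fromkeys(keys))  # drop duplicates, keep first-seen order
    values = {}
    for start in range(0, len(keys), batch_size):
        chunk = keys[start:start + batch_size]
        values.update(zip(chunk, client.mget(chunk)))
    return values
```

Batching keeps each MGET call small, so no single request has to carry every key at once.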
I had a hard time figuring out what the cursor number was and why I would randomly get an empty list or repeated items, even though I knew I had just put items in.
After reading:
It made more sense, but there is still some deep programming magic and some compromises involved in iterating the sets.