I'd like a unique dict (key/value) database to be accessible from multiple Python scripts running at the same time. If script1.py updates
It sounds like what you really need is a database of some kind.
If Redis won't work on Windows, then I would look at MongoDB.
https://docs.mongodb.com/manual/tutorial/install-mongodb-on-windows/
MongoDB works great with Python and can function similarly to Redis. Here are the install docs for PyMongo: http://api.mongodb.com/python/current/installation.html?_ga=2.78008212.1422709185.1517530606-587126476.1517530605
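As a rough illustration, here is a minimal sketch of using PyMongo as a shared key/value store; it assumes a mongod instance on the default localhost:27017, and the database/collection names ("mydb", "kv") and the key are just example values:

# Minimal sketch, assuming a local mongod on the default port and
# hypothetical names "mydb" / "kv" for the database and collection.
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
kv = client.mydb.kv

# script1.py could upsert a value...
kv.update_one({"_id": "some_key"}, {"$set": {"value": 42}}, upsert=True)

# ...and script2.py, running at the same time, reads it back as a plain dict.
doc = kv.find_one({"_id": "some_key"})
print(doc["value"] if doc else None)  # 42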
Also, many people have brought up SQLite. I think you were concerned that it only allows one writer at a time, but that is not really something you need to worry about: if there are two writers, the second is simply blocked until the first has finished. This is probably fine for your situation.
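To make the blocking concrete, here is a minimal sketch with the standard sqlite3 module; the file name "kv.db" is just an example, and the timeout parameter is how long a connection waits for another writer's lock before raising an error:

# Minimal sketch of two scripts writing to the same SQLite file.
# "kv.db" is an example file name; timeout is how many seconds a
# connection will wait for another writer's lock before giving up.
import sqlite3

conn = sqlite3.connect("kv.db", timeout=30)
conn.execute("create table if not exists kv (key text primary key, value text)")

# If another script currently holds the write lock, this call simply
# waits (up to 30 seconds) instead of failing immediately.
conn.execute("replace into kv (key, value) values (?, ?)", ("some_key", "hello"))
conn.commit()

print(conn.execute("select value from kv where key=?", ("some_key",)).fetchone())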
You can use a plain Python dictionary for this purpose.
Create a generic class or script, say G, that initializes a dictionary and then runs script1.py and script2.py, passing that dictionary to both. In Python a dictionary is passed by reference, so a single dictionary object ends up being shared: both scripts can modify it, and each sees the other's changes. I am assuming script1.py and script2.py are class based. This does not guarantee persistence of the data; for persistence, you can write the dictionary to a database every x interval.
script1.py
class SCRIPT1:
    def __init__(self, dictionary):
        self.dictionary = dictionary
        self.dictionary.update({"a": "a"})
        print("SCRIPT1 : ", self.dictionary)

    def update(self):
        self.dictionary.update({"c": "c"})
script2.py
class SCRIPT2:
    def __init__(self, dictionary):
        self.dictionary = dictionary
        self.dictionary.update({"b": "b"})
        print("SCRIPT 2 : ", self.dictionary)
main_script.py
import script1
import script2
x = {}
obj1 = script1.SCRIPT1(x) # output: SCRIPT1 : {'a': 'a'}
obj2 = script2.SCRIPT2(x) # output: SCRIPT 2 : {'a': 'a', 'b': 'b'}
obj1.update()
print("SCRIPT 1 dict: ", obj1.dictionary) # output: SCRIPT 1 dict: {'c': 'c', 'a': 'a', 'b': 'b'}
print("SCRIPT 2 dict: ", obj2.dictionary) # output: SCRIPT 2 dict: {'c': 'c', 'a': 'a', 'b': 'b'}
Also create an empty __init__.py file in the directory where you will run the scripts.
Another option is Redis.
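A minimal sketch with the redis-py client, assuming a Redis server is reachable on the default localhost:6379 (the key here is just an example):

# Minimal sketch, assuming a Redis server on the default localhost:6379.
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

# script1.py sets a value...
r.set("some_key", "hello")

# ...and script2.py can read it back (values come back as bytes).
print(r.get("some_key"))  # b'hello'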
CodernityDB could be worth exploring, using the server version.
http://labs.codernity.com/codernitydb/
Server version: http://labs.codernity.com/codernitydb/server.html
You could use a document-based database manager. It may be too heavy for your system, but concurrent access is typically one of the main reasons database management systems and the APIs to connect to them exist.
I have used MongoDB with Python and it works fine. The Python API documentation is quite good, and each document (element of the database) is a dictionary that can be loaded into Python as such.
Most embedded datastores other than SQLite are not optimized for concurrent access. I was also curious about SQLite's concurrent performance, so I ran a benchmark:
import time
import sqlite3
import os
import random
import sys
import multiprocessing


class Store():
    def __init__(self, filename='kv.db'):
        self.conn = sqlite3.connect(filename, timeout=60)
        self.conn.execute('pragma journal_mode=wal')
        self.conn.execute('create table if not exists "kv" (key integer primary key, value integer) without rowid')
        self.conn.commit()

    def get(self, key):
        row = self.conn.execute('select value from "kv" where key=?', (key,)).fetchone()
        if row:
            return row[0]

    def set(self, key, value):
        self.conn.execute('replace into "kv" (key, value) values (?,?)', (key, value))
        self.conn.commit()


def worker(n):
    d = [random.randint(0, 1 << 31) for _ in range(n)]
    s = Store()
    for i in d:
        s.set(i, i)
    random.shuffle(d)
    for i in d:
        s.get(i)


def test(c):
    n = 5000
    start = time.time()
    ps = []
    for _ in range(c):
        p = multiprocessing.Process(target=worker, args=(n,))
        p.start()
        ps.append(p)
    while any(p.is_alive() for p in ps):
        time.sleep(0.01)
    cost = time.time() - start
    print(f'{c:<10d}\t{cost:<7.2f}\t{n/cost:<20.2f}\t{n*c/cost:<14.2f}')


def main():
    print('concurrency\ttime(s)\tper process TPS(r/s)\ttotal TPS(r/s)')
    for c in range(1, 9):
        test(c)


if __name__ == '__main__':
    main()
Results on my 4-core macOS box (SSD volume):
concurrency time(s) per process TPS(r/s) total TPS(r/s)
1 0.65 7638.43 7638.43
2 1.30 3854.69 7709.38
3 1.83 2729.32 8187.97
4 2.43 2055.25 8221.01
5 3.07 1629.35 8146.74
6 3.87 1290.63 7743.78
7 4.80 1041.73 7292.13
8 5.37 931.27 7450.15
Results on an 8-core Windows Server 2012 cloud server (SSD volume):
concurrency time(s) per process TPS(r/s) total TPS(r/s)
1 4.12 1212.14 1212.14
2 7.87 634.93 1269.87
3 14.06 355.56 1066.69
4 15.84 315.59 1262.35
5 20.19 247.68 1238.41
6 24.52 203.96 1223.73
7 29.94 167.02 1169.12
8 34.98 142.92 1143.39
It turns out that overall throughput is roughly consistent regardless of concurrency, and that SQLite is slower on Windows than on macOS. I hope this is helpful.
Since the SQLite write lock is per database file, you can get more TPS by partitioning the data across multiple database files:
class MultiDBStore():
    def __init__(self, buckets=5):
        self.buckets = buckets
        self.conns = []
        for n in range(buckets):
            conn = sqlite3.connect(f'kv_{n}.db', timeout=60)
            conn.execute('pragma journal_mode=wal')
            conn.execute('create table if not exists "kv" (key integer primary key, value integer) without rowid')
            conn.commit()
            self.conns.append(conn)

    def _get_conn(self, key):
        assert isinstance(key, int)
        return self.conns[key % self.buckets]

    def get(self, key):
        row = self._get_conn(key).execute('select value from "kv" where key=?', (key,)).fetchone()
        if row:
            return row[0]

    def set(self, key, value):
        conn = self._get_conn(key)
        conn.execute('replace into "kv" (key, value) values (?,?)', (key, value))
        conn.commit()
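To reuse the benchmark above with partitioning, the only change needed is which store the worker constructs; a minimal sketch (buckets=20 matches the run below, but the exact count is a tuning choice):

# Sketch: the only change to the earlier benchmark is which store the worker builds.
# buckets=20 is just an example value, matching the 20-partition run below.
def worker(n):
    d = [random.randint(0, 1 << 31) for _ in range(n)]
    s = MultiDBStore(buckets=20)
    for i in d:
        s.set(i, i)
    random.shuffle(d)
    for i in d:
        s.get(i)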
Results on my Mac with 20 partitions:
concurrency time(s) per process TPS(r/s) total TPS(r/s)
1 2.07 4837.17 4837.17
2 2.51 3980.58 7961.17
3 3.28 3047.68 9143.03
4 4.02 2486.76 9947.04
5 4.44 2249.94 11249.71
6 4.76 2101.26 12607.58
7 5.25 1903.69 13325.82
8 5.71 1752.46 14019.70
Total TPS is higher than with a single database file.
Before there was Redis there was Memcached (which works on Windows). Here is a tutorial: https://realpython.com/blog/python/python-memcache-efficient-caching/
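A minimal sketch using the pymemcache client (one of several Python memcached clients; python-memcached would look similar), assuming a memcached server is running on the default localhost:11211 and using an example key:

# Minimal sketch, assuming memcached is running on the default localhost:11211.
from pymemcache.client.base import Client

client = Client(("localhost", 11211))

# script1.py stores a value...
client.set("some_key", "hello")

# ...and script2.py reads it back (values come back as bytes).
print(client.get("some_key"))  # b'hello'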