pickle

Python: storing big data structures

北慕城南 submitted on 2019-12-20 04:23:01
Question: I'm currently doing a project in Python that uses relatively big dictionaries (around 800 MB). I tried to store one of these dictionaries using pickle, but got a MemoryError. What is the proper way to save this kind of file in Python? Should I use a database? Answer 1: The standard-library shelve module provides a dict-like interface for persistent objects. It works with many database backends and is not limited by RAM. The advantage of using shelve over working with a database directly is that…
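A minimal sketch of the shelve approach the answer describes (the file name and keys here are made up for illustration):

```python
import shelve, tempfile, os

path = os.path.join(tempfile.mkdtemp(), "big_dict")

# Open (or create) a disk-backed, dict-like store; entries are written to
# disk, so the whole structure never has to fit in RAM at once.
with shelve.open(path) as db:
    # Keys must be strings; values can be any picklable object.
    for i in range(1000):
        db[str(i)] = {"value": i * i}

# Reopen later and read back individual entries without loading everything.
with shelve.open(path) as db:
    result = db["42"]["value"]
print(result)  # -> 1764
```

Because each key is read and written independently, an 800 MB dictionary can be migrated into the shelf one entry at a time.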

Model persistence in Scikit-Learn?

荒凉一梦 submitted on 2019-12-20 02:28:25
Question: I am trying to save and load a scikit-learn model, but I am facing issues when the save and load happen on different Python versions. Here is what I have tried: using pickle to save a model in Python 3 and deserialize it in Python 2. This works for some models, like LR and SVM, but it fails for KNN. >>> pickle.load(open("inPy3.pkl", 'rb')) #KNN model ValueError: non-string names in Numpy dtype unpickling Also, I tried to serialize and deserialize in JSON using jsonpickle but am getting the following…
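One cross-version pitfall worth ruling out first is the pickle protocol itself: Python 3 defaults to a protocol Python 2 cannot read at all. A sketch (using a plain dict as a stand-in for a fitted estimator; it does not fix numpy dtype mismatches like the KNN error above):

```python
import pickle, tempfile, os

# Stand-in for a fitted model; a real estimator would go here.
model_like = {"coef": [0.5, -1.2], "intercept": 0.1}

path = os.path.join(tempfile.mkdtemp(), "inPy3.pkl")

# Protocol 2 is the newest pickle protocol that Python 2 can read;
# writing with the Python 3 default produces files Python 2 rejects.
with open(path, "wb") as f:
    pickle.dump(model_like, f, protocol=2)

with open(path, "rb") as f:
    restored = pickle.load(f)
```

When the protocol is right but the error persists, the incompatibility is in the pickled object's internals (e.g. numpy dtypes), and a version-neutral export of the model's parameters is usually the safer route.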

Pickling a class definition

流过昼夜 submitted on 2019-12-19 16:48:13
Question: Is there a way to pickle a class definition? What I'd like to do is pickle the definition (which may be created dynamically) and then send it over a TCP connection, so that an instance can be created on the other end. I understand that there may be dependencies, like modules and global variables that the class relies on. I'd like to bundle these in the pickling process as well, but I'm not concerned about automatically detecting the dependencies, because it's okay if the onus is on the user to…
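Standard pickle stores classes by reference (module plus qualified name), so the definition itself never travels. One alternative, sketched below with an illustrative class name, is to ship the class's source text and exec it on the receiving end (only safe when the sender is trusted):

```python
import pickle

class_source = """
class Greeter:
    def __init__(self, name):
        self.name = name
    def greet(self):
        return "hello, " + self.name
"""

# Sender side: the payload is just a pickled string, which would be
# written to the TCP socket.
payload = pickle.dumps(class_source)

# Receiver side: rebuild the class in a fresh namespace and instantiate it.
namespace = {}
exec(pickle.loads(payload), namespace)
obj = namespace["Greeter"]("world")
print(obj.greet())  # -> "hello, world"
```

Dependencies (modules, globals) would have to be bundled into the source string the same way, which matches the question's assumption that the user takes responsibility for listing them.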

How do I prevent memory leak when I load large pickle files in a for loop?

醉酒当歌 submitted on 2019-12-19 13:58:54
Question: I have 50 pickle files that are 0.5 GB each. Each pickle file comprises a list of custom class objects. I have no trouble loading the files individually using the following function: def loadPickle(fp): with open(fp, 'rb') as fh: listOfObj = pickle.load(fh) return listOfObj However, when I try to load the files iteratively, I get a memory leak. l = ['filepath1', 'filepath2', 'filepath3', 'filepath4'] for fp in l: x = loadPickle(fp) print( 'loaded {0}'.format(fp) ) My memory overflows…
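A common mitigation, sketched here with small stand-in files instead of the 0.5 GB originals, is to drop the previous reference explicitly and force a collection pass before loading the next file:

```python
import gc, pickle, tempfile, os

# Create a few small stand-in pickle files for the demo.
d = tempfile.mkdtemp()
paths = []
for i in range(3):
    p = os.path.join(d, "part%d.pkl" % i)
    with open(p, "wb") as fh:
        pickle.dump(list(range(i * 10, i * 10 + 10)), fh)
    paths.append(p)

totals = []
for fp in paths:
    with open(fp, "rb") as fh:
        x = pickle.load(fh)
    totals.append(sum(x))
    del x          # drop the only reference to the previous list
    gc.collect()   # encourage CPython to release the memory now
```

Without the `del`, the previous half-gigabyte list stays alive while the next one is being loaded, so peak usage is roughly doubled even when nothing truly leaks.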

namespace on python pickle

自古美人都是妖i submitted on 2019-12-19 06:26:07
Question: I got an error when I used pickle with unittest. I wrote three program files: (1) a class to be pickled, (2) a class which uses the class in #1, and (3) a unit test for the class in #2. The actual code is as follows: #1. ClassToPickle.py import pickle class ClassToPickle(object): def __init__(self, x): self.x = x if __name__=="__main__": p = ClassToPickle(10) pickle.dump(p, open('10.pickle', 'w')) #2. SomeClass.py from ClassToPickle import ClassToPickle import pickle class SomeClass(object):
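Two things typically go wrong in this setup: the file must be opened in binary mode ('wb', not 'w'), and a pickle records the class's *module path*, so an object pickled from a script run directly is tagged with that script's module name and cannot be found when unittest runs under a different __main__. A sketch of the module-path behavior:

```python
import pickle

class ClassToPickle(object):
    def __init__(self, x):
        self.x = x

p = ClassToPickle(10)
data = pickle.dumps(p)

# The pickle stores the qualified class name, including the module it
# was defined in (for a directly-run script that is "__main__"). An
# unpickler in another program must be able to import that exact path,
# which is why the class should live in an importable module like
# ClassToPickle.py rather than in the script's __main__.
assert b"ClassToPickle" in data

restored = pickle.loads(data)
print(restored.x)  # -> 10
```

The round trip above succeeds only because the dump and load happen in the same process; across processes, both sides need `from ClassToPickle import ClassToPickle`.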

Pickle dump with progress bar

笑着哭i submitted on 2019-12-19 05:59:17
Question: I have a really big JSON object that I want to dump into a pickle file. Is there a way to display a progress bar while using pickle.dump? Answer 1: The only way that I know of is to define __getstate__/__setstate__ methods that return "sub-objects" which can refresh the GUI when they get pickled/unpickled. For example, if your object is a list, you could use something like this: import pickle class SubList: on_pickling = None def __init__(self, sublist): print('SubList', sublist) self.data = sublist def _
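A different sketch of the same idea, without touching the pickled object itself: wrap the output file in a proxy whose write method reports bytes written (the class name and callback are illustrative):

```python
import pickle, io

class ProgressFile:
    """File-like wrapper that reports bytes written during pickle.dump."""
    def __init__(self, fh, callback):
        self.fh = fh
        self.callback = callback
        self.written = 0

    def write(self, data):
        self.fh.write(data)
        self.written += len(data)
        self.callback(self.written)  # e.g. update a progress bar here

big_obj = {"items": list(range(10000))}

buf = io.BytesIO()
progress = []
pf = ProgressFile(buf, lambda n: progress.append(n))
pickle.dump(big_obj, pf)

restored = pickle.loads(buf.getvalue())
```

This reports progress in bytes rather than objects, so a real bar needs an estimate of the total size (e.g. from a previous dump), but it requires no changes to the data being pickled.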

Store object using Python pickle, and load it into different namespace

你离开我真会死。 submitted on 2019-12-19 05:47:47
Question: I'd like to pass object state between two Python programs (one is my own code running standalone, one is a Pyramid view) and different namespaces. Somewhat related questions exist, but I can't quite follow through with them for my scenario. My own code defines a global class (i.e., in the __main__ namespace) of somewhat complex structure: # An instance of this is a colorful mess of nested lists and sets and dicts. class MyClass : def __init__(self) : data = set() more = dict() ... def
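One standard-library technique for loading a pickle whose class lived in another program's namespace is a custom Unpickler that remaps the recorded module path onto a local class. A sketch, simulating the "other program" with a throwaway module (module and class names are illustrative):

```python
import pickle, io, sys, types

# Simulate a class defined in some other program's namespace.
old_mod = types.ModuleType("standalone_script")
exec("class MyClass:\n    pass\n", old_mod.__dict__)
sys.modules["standalone_script"] = old_mod
data = pickle.dumps(old_mod.MyClass())
del sys.modules["standalone_script"]  # the other namespace is gone now

# The receiving program's own, equivalent class definition.
class MyClass:
    pass

class RenamingUnpickler(pickle.Unpickler):
    """Redirect lookups of the old module path to the local class."""
    def find_class(self, module, name):
        if module == "standalone_script" and name == "MyClass":
            return MyClass
        return super().find_class(module, name)

obj = RenamingUnpickler(io.BytesIO(data)).load()
```

The instance's attribute dict transfers unchanged; only the class lookup is redirected, so both sides must agree on what the attributes mean.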

Frequently Updating Stored Data for a Numerical Experiment using Python [closed]

心不动则不痛 submitted on 2019-12-19 05:11:29
Question: Closed. This question is opinion-based and is not currently accepting answers. Closed 5 years ago. I am running a numerical experiment that requires many iterations. After each iteration, I would like to store the data in a pickle file or pickle-like file, in case the program times out or a data structure becomes tapped. What is the best way to proceed? Here is the skeleton…
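One simple pattern for this, sketched below with made-up per-iteration results: append one self-contained pickle record per iteration, so a crash or timeout loses at most the iteration in progress.

```python
import pickle, tempfile, os

path = os.path.join(tempfile.mkdtemp(), "results.pkl")

# Append one pickle per iteration; each dump is a complete record, so
# everything written before a crash remains readable.
for step in range(5):
    result = {"step": step, "value": step * 2}
    with open(path, "ab") as fh:          # "ab": append, never overwrite
        pickle.dump(result, fh)

# Recovery: read records back until EOF.
results = []
with open(path, "rb") as fh:
    while True:
        try:
            results.append(pickle.load(fh))
        except EOFError:
            break
```

Opening and closing the file each iteration flushes every record to disk immediately; for very frequent iterations, batching several results per dump trades durability for speed.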