Say I have the following variables and its corresponding values which represents a record.
name = \'abc\'
age = 23
we
Given http://wiki.python.org/moin/TimeComplexity how about this:
AGE, NAME, etc. AGE, NAME) be possible values for given column (35 or "m"). VALUES = [ [35, "m"], ...]AGE, NAME) be lists of indices from the VALUES list.VALUES so that you know that first column is age and second is sex (you could avoid that and use dictionaries, but they introduce large memory footrpint and with over 100K objects this may or not be a problem).Then the retrieve function could look like this:
def retrieve(column_name, column_value):
if column_name == "age":
return [VALUES[index] for index in AGE[column_value]]
elif ...: # repeat for other "columns"
Then, this is what you get
VALUES = [[35, "m"], [20, "f"]]
AGE = {35:[0], 20:[1]}
SEX = {"m":[0], "f":[1]}
KEYS = ["age", "sex"]
retrieve("age", 35)
# [[35, 'm']]
If you want a dictionary, you can do the following:
[dict(zip(KEYS, values)) for values in retrieve("age", 35)]
# [{'age': 35, 'sex': 'm'}]
but again, dictionaries are a little heavy on the memory side, so if you can go with lists of values it might be better.
Both dictionary and list retrieval are O(1) on average - worst case for dictionary is O(n) - so this should be pretty fast. Maintaining that will be a little bit of pain, but not so much. To "write", you'd just have to append to the VALUES list and then append the index in VALUES to each of the dictionaries.
Of course, then best would be to benchmark your actual implementation and look for potential improvements, but hopefully this make sense and will get you going :)
EDIT:
Please note that as @moooeeeep said, this will only work if your values are hashable and therefore can be used as dictionary keys.