Python list of dictionaries search

-上瘾入骨i 2020-11-22 09:41

Assume I have this:

[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]

and by searchin

21 answers
  •  北荒 (original poster)
    2020-11-22 10:09

    Have you tried the pandas package? It is well suited to this kind of search, and its column comparisons are vectorized, so they scale well to large inputs.

    import pandas as pd
    
    listOfDicts = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7}
    ]
    
    # Create a data frame, keys are used as column headers.
    # Dict items with the same key are entered into the same respective column.
    df = pd.DataFrame(listOfDicts)
    
    # The pandas dataframe allows you to pick out specific values like so:
    
    df2 = df[ (df['name'] == 'Pam') & (df['age'] == 7) ]
    
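    # Alternate attribute syntax, same result. This only works when the column
    # name is a valid Python identifier that does not clash with a DataFrame attribute.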
    
    df2 = df[ (df.name == 'Pam') & (df.age == 7) ]
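
    For comparison, the same two-condition lookup can be sketched in plain Python without pandas. The helper name find_matches is hypothetical (it is not part of the original answer or of any library), and the sketch assumes every value is compared with ==:

```python
# Hypothetical helper (not from the original answer): filter a list of
# dicts on several key/value pairs without pandas.
def find_matches(records, **criteria):
    """Return every dict whose values equal all of the given criteria."""
    return [
        record for record in records
        # A missing key yields None from .get(), so it only "matches"
        # if the requested value is itself None.
        if all(record.get(key) == value for key, value in criteria.items())
    ]


list_of_dicts = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

# All records where name == "Pam" and age == 7.
matches = find_matches(list_of_dicts, name="Pam", age=7)
print(matches)  # [{'name': 'Pam', 'age': 7}]

# next() with a default returns the first match, or None if nothing matches.
first_pam = next((d for d in list_of_dicts if d["name"] == "Pam"), None)
print(first_pam)  # {'name': 'Pam', 'age': 7}
```

    The list version returns every match, like the pandas filter; the next() version stops at the first hit, which is enough when names are unique.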
    

    I've added a small benchmark below. It shows pandas running faster on a large input (100,000 entries), while the plain list comprehension is faster on a tiny list, where pandas' fixed per-call overhead dominates:

    setup_large = 'dicts = [];\
    [dicts.extend(({ "name": "Tom", "age": 10 },{ "name": "Mark", "age": 5 },\
    { "name": "Pam", "age": 7 },{ "name": "Dick", "age": 12 })) for _ in range(25000)];\
    from operator import itemgetter;import pandas as pd;\
    df = pd.DataFrame(dicts);'
    
    setup_small = 'dicts = [];\
    dicts.extend(({ "name": "Tom", "age": 10 },{ "name": "Mark", "age": 5 },\
    { "name": "Pam", "age": 7 },{ "name": "Dick", "age": 12 }));\
    from operator import itemgetter;import pandas as pd;\
    df = pd.DataFrame(dicts);'
    
    method1 = '[item for item in dicts if item["name"] == "Pam"]'
    method2 = 'df[df["name"] == "Pam"]'
    
    import timeit
    t = timeit.Timer(method1, setup_small)
    print('Small Method LC: ' + str(t.timeit(100)))
    t = timeit.Timer(method2, setup_small)
    print('Small Method Pandas: ' + str(t.timeit(100)))
    
    t = timeit.Timer(method1, setup_large)
    print('Large Method LC: ' + str(t.timeit(100)))
    t = timeit.Timer(method2, setup_large)
    print('Large Method Pandas: ' + str(t.timeit(100)))
    
    # Output (total seconds for 100 runs of each statement):
    # Small Method LC: 0.000191926956177
    # Small Method Pandas: 0.044392824173
    # Large Method LC: 1.98827004433
    # Large Method Pandas: 0.324505090714
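
    If the same key is searched many times, another option worth considering is to build a dictionary index once and look names up in it. This is a minimal sketch, not something the benchmark above measured; index_by_name is an illustrative name, and the data only mirrors the shape of setup_large:

```python
from collections import defaultdict

# 100,000 entries with the same shape as setup_large. Note that list
# repetition reuses the same four dict objects, which is fine for a lookup demo.
dicts = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
    {"name": "Dick", "age": 12},
] * 25000

# Build the index once: name -> list of records with that name.
index_by_name = defaultdict(list)
for record in dicts:
    index_by_name[record["name"]].append(record)

# Each later lookup is a single hash-table access instead of a full scan.
# .get() avoids creating an empty entry for names that are absent.
pams = index_by_name.get("Pam", [])
print(len(pams))  # 25000
```

    Building the index costs one pass over the list, so it only pays off when several lookups follow; for a one-off search the list comprehension or the pandas filter is simpler.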
    
