python-collections

Is the defaultdict in Python's collections module really faster than using setdefault?

若如初见. Submitted on 2019-12-04 08:20:51
Question: I've seen other Python programmers use defaultdict from the collections module for the following use case:
    from collections import defaultdict
    s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
    def main():
        d = defaultdict(list)
        for k, v in s:
            d[k].append(v)
I've typically approached this problem by using setdefault instead:
    def main():
        d = {}
        for k, v in s:
            d.setdefault(k, []).append(v)
The docs do in fact claim that using defaultdict is faster, but I've seen the

Why doesn't OrderedDict use super?

淺唱寂寞╮ Submitted on 2019-12-03 16:46:05
We can create an OrderedCounter trivially by using multiple inheritance:
    >>> from collections import Counter, OrderedDict
    >>> class OrderedCounter(Counter, OrderedDict):
    ...     pass
    ...
    >>> OrderedCounter('Mississippi').items()
    [('M', 1), ('i', 4), ('s', 4), ('p', 2)]
Correct me if I'm wrong, but this crucially relies on the fact that Counter uses super:
    class Counter(dict):
        def __init__(*args, **kwds):
            ...
            super(Counter, self).__init__()
            ...
That is, the magic trick works because
    >>> OrderedCounter.__mro__
    (__main__.OrderedCounter, collections.Counter, collections.OrderedDict, dict, object)
The
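For readers on Python 3, the session above can be reproduced roughly as follows. This is a minimal sketch, not part of the original question: items() returns a view in Python 3, so it is wrapped in list() here, and the MRO is printed by class name only.

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    """Counter that also inherits OrderedDict's ordering behaviour."""
    pass


oc = OrderedCounter('Mississippi')

# items() is a view in Python 3, so materialise it to inspect the order.
items = list(oc.items())
print(items)

# The method resolution order places OrderedDict after Counter, so a
# cooperative super() call inside Counter reaches OrderedDict next.
mro_names = [cls.__name__ for cls in OrderedCounter.__mro__]
print(mro_names)
```

The cooperative call only works because Counter delegates through super rather than calling dict directly, which is the point the question is making.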

Is the defaultdict in Python's collections module really faster than using setdefault?

ぃ、小莉子 Submitted on 2019-12-03 00:22:34
I've seen other Python programmers use defaultdict from the collections module for the following use case:
    from collections import defaultdict
    s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
    def main():
        d = defaultdict(list)
        for k, v in s:
            d[k].append(v)
I've typically approached this problem by using setdefault instead:
    def main():
        d = {}
        for k, v in s:
            d.setdefault(k, []).append(v)
The docs do in fact claim that using defaultdict is faster, but I've seen the opposite to be true when testing myself:
    $ python -mtimeit -s "from withsetdefault import main; s = [('yellow
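One way to repeat the comparison without separate module files is to time both variants from a single script. This is only a sketch: the function names with_defaultdict and with_setdefault and the iteration count are illustrative choices, not taken from the question, and the timings will vary by machine and Python version.

```python
import timeit
from collections import defaultdict

s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]


def with_defaultdict(pairs):
    # Missing keys are created by calling list() inside __missing__.
    d = defaultdict(list)
    for k, v in pairs:
        d[k].append(v)
    return d


def with_setdefault(pairs):
    # A new empty list is built on every iteration, even when the key exists.
    d = {}
    for k, v in pairs:
        d.setdefault(k, []).append(v)
    return d


# Time each variant on the same input; the count is arbitrary.
for fn in (with_defaultdict, with_setdefault):
    elapsed = timeit.timeit(lambda: fn(s), number=10_000)
    print(f"{fn.__name__}: {elapsed:.4f}s")
```

With only five pairs, the cost of constructing the container itself can dominate the measurement, so it is worth rerunning with a much longer input list before drawing conclusions.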

What is the most efficient way to sum a dict with multiple keys by one key?

天大地大妈咪最大 Submitted on 2019-12-02 14:01:40
Question: I have the following dict structure.
    product1 = {'product_tmpl_id': product_id,
                'qty': product_uom_qty,
                'price': price_unit,
                'subtotal': price_subtotal,
                'total': price_total,
                }
And then a list of products; each item in the list is a dict with the above structure:
    list_ = [product1,product2,product3,.....]
I need to sum the items in the list, grouped by the key product_tmpl_id ... I'm using dictcollections but it only sums the qty key. I need to sum every key except product_tmpl_id, which is the criteria

What is the most efficient way to sum a dict with multiple keys by one key?

落爺英雄遲暮 Submitted on 2019-12-02 07:18:02
I have the following dict structure.
    product1 = {'product_tmpl_id': product_id,
                'qty': product_uom_qty,
                'price': price_unit,
                'subtotal': price_subtotal,
                'total': price_total,
                }
And then a list of products; each item in the list is a dict with the above structure:
    list_ = [product1,product2,product3,.....]
I need to sum the items in the list, grouped by the key product_tmpl_id ... I'm using dictcollections but it only sums the qty key. I need to sum every key except product_tmpl_id, which is the criteria to group by:
    c = defaultdict(float)
    for d in list_:
        c[d['product_tmpl_id']] += d['qty']
    c = [{'product
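A possible way to extend the qty-only loop to every field is a nested defaultdict keyed first by product_tmpl_id and then by field name. The sketch below is an assumption about what the asker wants: the helper name sum_by_key and the sample rows are hypothetical, and all non-grouping values are assumed to be numeric.

```python
from collections import defaultdict


def sum_by_key(rows, group_key="product_tmpl_id"):
    """Sum every field except group_key, grouped by the value of group_key."""
    totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        bucket = totals[row[group_key]]
        for field, value in row.items():
            if field != group_key:
                bucket[field] += value
    # Flatten back into a list of dicts shaped like the input rows.
    return [{group_key: key, **fields} for key, fields in totals.items()]


# Hypothetical sample data with the same keys as the question's product dict.
rows = [
    {"product_tmpl_id": 1, "qty": 2, "price": 10.0, "subtotal": 20.0, "total": 24.0},
    {"product_tmpl_id": 1, "qty": 1, "price": 10.0, "subtotal": 10.0, "total": 12.0},
    {"product_tmpl_id": 2, "qty": 5, "price": 3.0, "subtotal": 15.0, "total": 18.0},
]
result = sum_by_key(rows)
print(result)
```

This makes a single pass over the list. Note that it sums price along with the other fields, as the question requests, even though a summed unit price may not be meaningful for reporting.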

collections.Counter: most_common INCLUDING equal counts

房东的猫 Submitted on 2019-12-01 21:12:06
In collections.Counter, the method most_common(n) returns only the n most frequent items in a list. I need exactly that, but I need to include the equal counts as well.
    from collections import Counter
    test = Counter(["A","A","A","B","B","C","C","D","D","E","F","G","H"])
    -->Counter({'A': 3, 'C': 2, 'B': 2, 'D': 2, 'E': 1, 'G': 1, 'F': 1, 'H': 1})
    test.most_common(2)
    -->[('A', 3), ('C', 2)]
I would need [('A', 3), ('B', 2), ('C', 2), ('D', 2)] since they have the same count as n=2 for this case. My real data is on DNA code and could be quite large. I need it to be somewhat efficient.
You can do
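The truncated answer is not recoverable, but one straightforward approach is to take the count of the n-th ranked item as a cutoff and keep everything at or above it. The helper name most_common_with_ties is hypothetical, and this is a sketch rather than the answer from the original thread.

```python
from collections import Counter


def most_common_with_ties(counter, n):
    """Return the n most common items plus any items tied with the n-th."""
    ranked = counter.most_common()  # all items, highest count first
    if n <= 0 or not ranked:
        return []
    if n >= len(ranked):
        return ranked
    cutoff = ranked[n - 1][1]  # count of the n-th ranked item
    return [(item, count) for item, count in ranked if count >= cutoff]


test = Counter(["A", "A", "A", "B", "B", "C", "C", "D", "D", "E", "F", "G", "H"])
top = most_common_with_ties(test, 2)
print(top)
```

This sorts every distinct item, so it costs O(k log k) for k distinct keys. For very large counters, a heap-based selection of the cutoff followed by one filtering pass may be cheaper.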

Sort Counter by frequency, then alphabetically in Python

浪尽此生 Submitted on 2019-12-01 01:17:59
I am trying to use Counter to sort letters by occurrence and put any that have the same frequency into alphabetical order, but I can't get access to the values of the dictionary that it produces.
    letter_count = collections.Counter("alphabet")
    print(letter_count)
produces:
    Counter({'a': 2, 'l': 1, 't': 1, 'p': 1, 'h': 1, 'e': 1, 'b': 1})
How can I get it ordered by frequency, then by alphabetical order, so everything that shows up only once is in alphabetical order?
It sounds like your question is how to sort the entire list by frequency, then break ties alphabetically. You can sort the entire
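The answer is cut off, but sorting by frequency with an alphabetical tie-break is commonly done with a tuple sort key: negate the count so that higher counts come first, then fall back to the letter. A minimal sketch along those lines:

```python
from collections import Counter

letter_count = Counter("alphabet")

# Sort key: descending count (negated), then ascending letter for ties.
ordered = sorted(letter_count.items(), key=lambda pair: (-pair[1], pair[0]))
print(ordered)
# [('a', 2), ('b', 1), ('e', 1), ('h', 1), ('l', 1), ('p', 1), ('t', 1)]
```

Iterating over letter_count.items() is also what gives access to the values the question says it could not reach.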

Is collections.defaultdict thread-safe?

Deadly Submitted on 2019-11-29 05:46:44
I have not worked with threading in Python at all and am asking this question as a complete stranger. I am wondering if defaultdict is thread-safe. Let me explain: I have
    d = defaultdict(list)
which creates a list for missing keys by default. Let's say I have multiple threads started doing this at the same time:
    d['key'].append('value')
At the end, I'm supposed to end up with ['value', 'value']. However, if the defaultdict is not thread-safe, and thread 1 yields to thread 2 after checking if 'key' in dict and before d['key'] = default_factory(), it will cause interleaving, and the other

pandas.DataFrame.from_dict not preserving order using OrderedDict

可紊 Submitted on 2019-11-29 05:33:34
I want to import OData XML datafeeds from the Dutch Bureau of Statistics (CBS) into our database. Using lxml and pandas I thought this should be straightforward. By using OrderedDict I want to preserve the order of the columns for readability, but somehow I can't get it right.
    from collections import OrderedDict
    from lxml import etree
    import requests
    import pandas as pd
    # CBS URLs
    base_url = 'http://opendata.cbs.nl/ODataFeed/odata'
    datasets = ['/37296ned', '/82245NED']
    feed = requests.get(base_url + datasets[1] + '/TypedDataSet')
    root = etree.fromstring(feed.content)
    # all record entries start at

Sort Counter by frequency, then alphabetically in Python

陌路散爱 Submitted on 2019-11-28 01:52:46
Question: I am trying to use Counter to sort letters by occurrence and put any that have the same frequency into alphabetical order, but I can't get access to the values of the dictionary that it produces.
    letter_count = collections.Counter("alphabet")
    print(letter_count)
produces:
    Counter({'a': 2, 'l': 1, 't': 1, 'p': 1, 'h': 1, 'e': 1, 'b': 1})
How can I get it ordered by frequency, then by alphabetical order, so everything that shows up only once is in alphabetical order?
Answer 1: It sounds like your