Python : List of dict, if exists increment a dict value, if not append a new dict

悲哀的现实 2020-12-04 05:27

I would like to do something like this.

list_of_urls = ['http://www.google.fr/', 'http://www.google.fr/',
                'http://www.google.cn/', 'http


        
6 answers
  •  离开以前
    2020-12-04 05:54

    To do it exactly your way? You could use the for...else construct:

    for url in list_of_urls:
        for url_dict in urls:
            if url_dict['url'] == url:
                url_dict['nbr'] += 1
                break
        else:
            urls.append(dict(url=url, nbr=1))
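
    The else clause of a for loop runs only when the loop finishes without hitting break, which is what makes the snippet above work. A minimal self-contained sketch of the same idea follows; the function name count_into_list and the sample data are assumptions made for illustration, not part of the question.

```python
def count_into_list(records, new_urls):
    """Increment 'nbr' for known URLs, append a new record otherwise."""
    for url in new_urls:
        for record in records:
            if record["url"] == url:
                record["nbr"] += 1
                break  # found a match, so the else clause is skipped
        else:
            # Reached only when the inner loop ended without a break,
            # i.e. no existing record matched this URL.
            records.append({"url": url, "nbr": 1})
    return records


# Hypothetical sample data for illustration.
urls = [{"url": "http://www.google.fr/", "nbr": 1}]
list_of_urls = [
    "http://www.google.fr/",
    "http://www.google.fr/",
    "http://www.google.cn/",
]
count_into_list(urls, list_of_urls)
print(urls)
```

    Note that every new URL triggers a scan of all existing records, so the cost grows with the length of the list.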
    

    But it is quite inelegant. Do you really have to store the visited urls as a LIST? If you store them in a dict, indexed by url string, for example, it would be way cleaner:

    urls = {'http://www.google.fr/': dict(url='http://www.google.fr/', nbr=1)}
    
    for url in list_of_urls:
        if url in urls:
            urls[url]['nbr'] += 1
        else:
            urls[url] = dict(url=url, nbr=1)
    

    A few things to note in that second example:

    • See how using a dict for urls removes the need to go through the whole urls list when testing for a single url. A dict lookup takes roughly constant time on average, whereas the list scan grows with the number of stored urls, so this approach will be faster.
    • Using dict(...) with keyword arguments instead of braces makes your code slightly shorter, since the keys need no quotes.
    • Using list_of_urls, urls and url as variable names makes the code quite hard to parse. It's better to find something clearer, such as urls_to_visit, urls_already_visited and current_url. I know, it's longer. But it's clearer.
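
    Putting those notes together, the dict-based loop might look like the sketch below with the clearer names. The sample URLs are assumed for illustration, and the final conversion back to a list is an optional extra step in case the rest of your code still expects a list of dicts.

```python
# Hypothetical input for illustration.
urls_to_visit = [
    "http://www.google.fr/",
    "http://www.google.fr/",
    "http://www.google.cn/",
]

# Records keyed by URL string, so membership tests need no scan.
urls_already_visited = {}

for current_url in urls_to_visit:
    if current_url in urls_already_visited:
        urls_already_visited[current_url]["nbr"] += 1
    else:
        urls_already_visited[current_url] = dict(url=current_url, nbr=1)

# Optional: recover a list of dicts. Dicts keep insertion order
# in Python 3.7+, so records appear in first-seen order.
as_list = list(urls_already_visited.values())
print(as_list)
```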

    And of course I'm assuming that dict(url='http://www.google.fr/', nbr=1) is a simplification of your own data structure, because otherwise, urls could simply be:

    urls = {'http://www.google.fr/': 1}
    
    for url in list_of_urls:
        if url in urls:
            urls[url] += 1
        else:
            urls[url] = 1
    

    Which can get very elegant with a defaultdict:

    import collections

    urls = collections.defaultdict(int)  # missing keys start at 0
    for url in list_of_urls:
        urls[url] += 1
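
    If plain counts are all you need, collections.Counter from the standard library is another option. This is a different tool from the defaultdict above, shown here only as a sketch with assumed sample data; the last line rebuilds the list-of-dicts shape from the question in case you still need it.

```python
from collections import Counter

# Hypothetical input for illustration.
list_of_urls = [
    "http://www.google.fr/",
    "http://www.google.fr/",
    "http://www.google.cn/",
]

# Counter tallies each distinct item in a single pass.
counts = Counter(list_of_urls)

# Unknown keys report 0 instead of raising KeyError.
print(counts["http://www.google.fr/"], counts["http://example.invalid/"])

# Rebuild the original list-of-dicts layout if required.
urls = [{"url": url, "nbr": nbr} for url, nbr in counts.items()]
print(urls)
```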
    
