Removing duplicates from a list of lists

萌比男神i 2020-11-22 10:37

I have a list of lists in Python:

k = [[1, 2], [4], [5, 6, 2], [1, 2], [3], [4]]

And I want to remove the duplicate elements from it. Was if it ...

12 answers
  •  刺人心 (OP)
    2020-11-22 10:57

    Even your "long" list is pretty short. Also, did you choose the lists to match the actual data? Performance will vary with what the data actually look like. For example, you have a short list repeated over and over to make a longer list. With only a handful of distinct sublists, each membership test scans a short, constant-size result list, so the quadratic solution is linear in your benchmarks, but not in reality.
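To make the benchmark pitfall concrete, here is a rough sketch that counts how much scanning the membership-test loop can do on two inputs of the same length. The helper name `loop_in_scan_cost` and both test inputs are made up for illustration; they are not the benchmark data from the question.

```python
def loop_in_scan_cost(lists):
    """Run the membership-test dedupe and return an upper bound on the
    number of sublist comparisons made by the `in` checks."""
    result = []
    scanned = 0
    for item in lists:
        # `item not in result` compares against at most len(result) entries.
        scanned += len(result)
        if item not in result:
            result.append(item)
    return scanned

# Hypothetical input 1: a short list repeated many times (4 distinct sublists).
repeated = [[1, 2], [4], [5, 6, 2], [1, 2], [3], [4]] * 300

# Hypothetical input 2: same length, but every sublist is distinct.
distinct = [[i] for i in range(len(repeated))]

cost_repeated = loop_in_scan_cost(repeated)
cost_distinct = loop_in_scan_cost(distinct)
print(cost_repeated, cost_distinct)
```

On the repeated input the result list never grows past four entries, so the bound grows linearly with the input length. On the all-distinct input it grows with the square of the length.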

    For actually-large lists, the set code is your best bet: it's linear (although space-hungry). The sort and groupby methods are O(n log n), and the "loop in" method is obviously quadratic, so you know how these will scale as n gets really big. If this is the real size of the data you are analyzing, then who cares? It's tiny.
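For reference, a minimal sketch of the three families of approaches compared above. The function names are my own, and the exact code being benchmarked in the question may differ.

```python
from itertools import groupby

k = [[1, 2], [4], [5, 6, 2], [1, 2], [3], [4]]

def dedupe_set(lists):
    # Linear on average. Lists are unhashable, so each one is converted
    # to a tuple first. The order of the result is arbitrary.
    return [list(t) for t in set(tuple(i) for i in lists)]

def dedupe_sort_groupby(lists):
    # O(n log n): sorting places equal sublists next to each other,
    # then groupby yields one key per run of equal sublists.
    return [key for key, _ in groupby(sorted(lists))]

def dedupe_loop_in(lists):
    # Quadratic in the worst case: every `in` test scans the result list.
    # Preserves first-seen order.
    result = []
    for item in lists:
        if item not in result:
            result.append(item)
    return result

print(sorted(dedupe_set(k)))
print(dedupe_sort_groupby(k))
print(dedupe_loop_in(k))
```

Only the loop version keeps the original order. The set version needs hashable elements inside the sublists, and the sort version needs sublists that can be compared with each other.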

    Incidentally, I'm seeing a noticeable speedup if I don't form an intermediate list to make the set, that is to say if I replace

    kt = [tuple(i) for i in k]
    skt = set(kt)
    

    with

    skt = set(tuple(i) for i in k)
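If you want to check this on your own machine, something like the following `timeit` sketch would do. The random test data, the seed, and the names `via_list` and `via_generator` are assumptions for illustration. The timings depend on the Python version and on the data, so treat the numbers as indicative only.

```python
import random
import timeit

# Hypothetical test data: 10,000 short sublists of small integers.
random.seed(0)
k = [[random.randrange(100) for _ in range(random.randint(1, 3))]
     for _ in range(10_000)]

def via_list(lists):
    # Builds an intermediate list of tuples, then the set.
    kt = [tuple(i) for i in lists]
    return set(kt)

def via_generator(lists):
    # Feeds the tuples to set() one at a time, with no intermediate list.
    return set(tuple(i) for i in lists)

t_list = timeit.timeit(lambda: via_list(k), number=20)
t_gen = timeit.timeit(lambda: via_generator(k), number=20)
print(f"intermediate list: {t_list:.4f}s  generator: {t_gen:.4f}s")
```

Both functions return the same set, so the only thing being compared is the cost of the intermediate list.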
    

    The real solution may depend on more information: Are you sure that a list of lists is really the representation you need?
