Sorting consecutive pairs of items in a Python list

Submitted by ﹥>﹥吖頭↗ on 2020-01-24 02:08:15

Question


The data that I have is actually contained in a pandas DataFrame (in one column), but for the sake of this post we extract it to get to the nub of the problem.

Suppose we have a dataframe df with a column col1 which we store as a list: L = df.col1.tolist(). Now, I have about 2000 of these columns/lists and on average they have a length of about 300-400. So there is no massive need for performance here.

Back to our MWE list, it is structured with items like this (ish):

L = [1,2,2,1,3,3,4,4,5,5,6,6,1,2,1,2,7,7,8,8]

The items in the list should be arranged as consecutive pairs (but, for data-collection reasons, they are not). So here is the sorted list we are aiming for:

L = [1,1,2,2,3,3,4,4,5,5,6,6,1,1,2,2,7,7,8,8]

I have added these as tuples just for clarity:

L = [(1,1),(2,2),(3,3),(4,4),(5,5),(6,6),(1,1),(2,2),(7,7),(8,8)]

This is the problem: the columns contain almost sequential pairs of items (the numbers in the above example), but some of them are out of order and have to be moved back to their partner (see above).

A few things to observe:

  • The above list contains numbers; in actuality, we are dealing with strings.
  • The data typically lives in a column of a pandas DataFrame (not sure if this helps, but it may).
  • Performance is not really a problem, since each list only needs to be sorted once.
  • The out-of-order pattern is not consistent and things move around a lot in each column; what is important is that each item is mapped back to its partner.

I am looking for a method that can sort these lists/columns into the required pair-sequential order. Thanks!


Answer 1:


OK, since you can guarantee that the items are always paired, I'd just keep a running count per item. You basically need to generate a list of the elements in the order in which the first item of each pair is encountered (that is, when the count is zero); when the second item of the pair turns up, reset the count for that item. Then just "explode" this list of first elements into a list of pairs. Quick and dirty:

In [1]: L = [1,2,2,1,3,3,4,4,5,5,6,6,1,2,1,2,7,7,8,8]

In [2]: from collections import Counter

In [3]: counts = Counter()

In [4]: order = []

In [5]: for x in L:
   ...:     if counts[x] == 0:
   ...:         # first item of a pair: record it
   ...:         order.append(x)
   ...:         counts[x] = 1
   ...:     else:
   ...:         # second item of the pair: reset for the next pair
   ...:         counts[x] = 0
   ...:

In [6]: order
Out[6]: [1, 2, 3, 4, 5, 6, 1, 2, 7, 8]

In [7]: result = []

In [8]: for x in order:
   ...:     result.append(x)
   ...:     result.append(x)
   ...:

In [9]: result
Out[9]: [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 1, 1, 2, 2, 7, 7, 8, 8]

Of course, you should make a function to do this.
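
As a minimal sketch of what such a function might look like (the name pair_sort and the ValueError check for unpaired items are my own assumptions, not part of the session above), the two loops can be folded into one by emitting both members of a pair as soon as the first one is seen:

```python
from collections import Counter


def pair_sort(items):
    """Regroup a list so that each item sits next to its partner.

    Assumes every item occurs an even number of times (always paired)
    and raises ValueError otherwise. Pairs are emitted in the order in
    which the first member of each pair is encountered.
    """
    waiting = Counter()  # 1 while an item is waiting for its partner
    result = []
    for x in items:
        if waiting[x] == 0:
            # First member of a pair: emit the whole pair now.
            result.extend((x, x))
            waiting[x] = 1
        else:
            # Second member: the pair is complete, so reset.
            waiting[x] = 0
    unpaired = [x for x, n in waiting.items() if n]
    if unpaired:
        raise ValueError(f"items without a partner: {unpaired}")
    return result


L = [1, 2, 2, 1, 3, 3, 4, 4, 5, 5, 6, 6, 1, 2, 1, 2, 7, 7, 8, 8]
print(pair_sort(L))
# [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 1, 1, 2, 2, 7, 7, 8, 8]
print(pair_sort(["a", "b", "b", "a"]))
# ['a', 'a', 'b', 'b']
```

Since the output has the same length as the input, it should be possible to write it back to the column with something like df["col1"] = pair_sort(df.col1.tolist()), assuming the column really is fully paired.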



Source: https://stackoverflow.com/questions/58077677/sorting-consecutive-pairs-of-items-in-a-python-list
