How to extract numeric ranges from 2 columns containig numeric sequences and print the range from both columns (different increment values)?

ε祈祈猫儿з 提交于 2019-12-20 03:32:14

问题


I'm curently learnig python and pandas (this question is based on a pevious post but with an additional query); at the moment have the 2 columns containing numeric sequences (ascending and/or descending) as described below:

Col 1: (col1 numeric incrememt and/or decrement = 1)

    1 
    2
    3
    5
    7
    8
    9

Col 2: (Col2 numeric increment and/or decrement = 4)

 113
 109
 105
 90
 94
 98
 102

Need to extract the numeric ranges from both columns and print them according to the sequence break occurance on any of those 2 columns and the result should be as follow:

 1,3,105,113
 5,5,90,90
 7,9,94,102

Already received a very useful way to do it using python's pandas library by @MaxU where it generates the numeric ranges based on the breaks detected on both columns using a criteria of col1 and col2 = increase and/or decreases by 1.

How can I extract numeric ranges from 2 columns and print the range from both columns as tuples?

The unique difference on this case is that the increment/decrement criteria applied for both columns are different for each one of them.


回答1:


Try this:

In [42]: df
Out[42]:
   Col1  Col2
0     1   113
1     2   109
2     3   105
3     5    90
4     7    94
5     8    98
6     9   102

In [43]: df.groupby(df.diff().abs().ne([1,4]).any(1).cumsum()).agg(['min','max'])
Out[43]:
  Col1     Col2
   min max  min  max
1    1   3  105  113
2    5   5   90   90
3    7   9   94  102

Explanation: our goal is to group those rows with the increment/decrement [1,4] for Col1, Col2 correspondingly:

In [44]: df.diff().abs()
Out[44]:
   Col1  Col2
0   NaN   NaN
1   1.0   4.0
2   1.0   4.0
3   2.0  15.0
4   2.0   4.0
5   1.0   4.0
6   1.0   4.0

In [45]: df.diff().abs().ne([1,4])
Out[45]:
    Col1   Col2
0   True   True
1  False  False
2  False  False
3   True   True
4   True  False
5  False  False
6  False  False

In [46]: df.diff().abs().ne([1,4]).any(1)
Out[46]:
0     True
1    False
2    False
3     True
4     True
5    False
6    False
dtype: bool

In [47]: df.diff().abs().ne([1,4]).any(1).cumsum()
Out[47]:
0    1
1    1
2    1
3    2
4    3
5    3
6    3
dtype: int32


来源:https://stackoverflow.com/questions/40607709/how-to-extract-numeric-ranges-from-2-columns-containig-numeric-sequences-and-pri

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!