How would I be able to remove opposite values (e.g. refunds) in panel data?

Submitted by 我是研究僧i on 2020-01-25 08:00:07

Question


Given the following data:

id  datee       price  quant  discrete_x
 1  2018-12-19      4  -3000  A
 1  2018-12-04      4   3000  A
 1  2018-12-21      4   3000  B
 1  2018-12-20      3   2000  A
...

Desired output:

id  datee       price  quant  discrete_x
 1  2018-12-21      4   3000  B
 1  2018-12-20      3   2000  A
...

In this case, it is quite clear that the quantity (quant) of 3000 bought on 2018-12-04 is refunded on 2018-12-19, and then 3000 is bought again on 2018-12-21. I would like to remove the two rows that cancel each other out. Assuming that id and quant match, and that a refund happens only once and only after a purchase with the same quant, how can I remove all such pairs for each id value?

I have been considering two ideas so far, but I am stuck on both: 1) within each group (after group_by and arrange), check the later dates to see whether any quant is the exact opposite value; 2) a for loop nested within a for loop.

I feel that the nested for loop is the better option, but I am not sure how I would also match on discrete_x.

How would you approach this? Would you use a nested for loop?


Answer 1:


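Hope this solution works for your problem, assuming your data frame is named df.

# Use the absolute value of quant so a purchase and its refund look identical
df$quant <- abs(df$quant)
# Keep only the first row of each (id, quant) combination
df1 <- df[!duplicated(df[c("id", "quant")]), ]

Note that this keeps one row per (id, quant) combination and discards the sign of quant, so it does not drop both rows of a purchase/refund pair.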
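
If both rows of a purchase/refund pair should be dropped, as in the desired output, one option is to match each refund to an earlier purchase explicitly. The following is a minimal base R sketch under the question's stated assumptions (a refund has a negative quant and follows a purchase with the same id and the opposite quant). The name remove_refund_pairs is a hypothetical helper, and the sample data frame is reconstructed from the rows shown in the question.

```r
# Sample data reconstructed from the question
df <- data.frame(
  id = c(1, 1, 1, 1),
  datee = as.Date(c("2018-12-19", "2018-12-04", "2018-12-21", "2018-12-20")),
  price = c(4, 4, 4, 3),
  quant = c(-3000, 3000, 3000, 2000),
  discrete_x = c("A", "A", "B", "A"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from any package): drops each refund row together
# with the most recent earlier purchase that has the same id and opposite quant.
remove_refund_pairs <- function(df) {
  df <- df[order(df$id, df$datee), ]   # sort so later rows are later dates
  drop <- rep(FALSE, nrow(df))         # rows marked for removal
  for (i in which(df$quant < 0)) {     # loop over refund rows only
    # Candidate purchases: same id, opposite quant, earlier date, not yet matched.
    # To also match on discrete_x, add: df$discrete_x == df$discrete_x[i]
    cand <- which(df$id == df$id[i] &
                    df$quant == -df$quant[i] &
                    df$datee < df$datee[i] &
                    !drop)
    if (length(cand) > 0) {
      j <- max(cand)                   # latest earlier purchase (rows are date-sorted)
      drop[c(i, j)] <- TRUE
    }
  }
  df[!drop, ]
}

result <- remove_refund_pairs(df)
print(result)
```

This uses a single loop over the refund rows plus a vectorised lookup rather than a loop within a loop. Unmatched refunds are left in place, which may or may not be what you want.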




Answer 2:


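This is not an elegant implementation, but I think it should work. After grouping by id and arranging by date, we can create a helper column to filter on. Note that it only catches a refund that immediately follows its purchase in date order within an id.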

library(dplyr)
library(tidyr)

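df %>%
  group_by(id) %>%
  arrange(datee) %>%
  # flag a row whose quant is cancelled out by the next row of the same id
  mutate(f = lead(quant) + quant == 0,
         # also flag the row right after a flagged row (the other half of the pair)
         f = ifelse(f, f, lag(f)),
         # rows at the edges of a group can end up NA; treat them as not flagged
         f = tidyr::replace_na(f, FALSE)) %>%
  filter(!f) %>%
  select(-f)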

#> # A tibble: 2 x 6
#> # Groups:   id [1]
#>      id datee      price quant discrete_x    
#>   <dbl> <date>     <dbl> <dbl> <chr>
#> 1     1 2018-12-20     3  2000 A
#> 2     1 2018-12-21     4  3000 B
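
The question also asks how to match on discrete_x. One possible variant, sketched below, groups by both id and discrete_x and only treats a positive quant directly followed by its exact opposite as a cancelled pair, so a refund followed by a fresh purchase is not flagged. The column names cancelled and refund are hypothetical helpers, and the sample data frame is reconstructed from the question; this is an illustration under those assumptions, not a general solution.

```r
library(dplyr)

# Sample data reconstructed from the question
df <- data.frame(
  id = c(1, 1, 1, 1),
  datee = as.Date(c("2018-12-19", "2018-12-04", "2018-12-21", "2018-12-20")),
  price = c(4, 4, 4, 3),
  quant = c(-3000, 3000, 3000, 2000),
  discrete_x = c("A", "A", "B", "A"),
  stringsAsFactors = FALSE
)

cleaned <- df %>%
  group_by(id, discrete_x) %>%
  arrange(datee, .by_group = TRUE) %>%
  mutate(
    # a purchase whose very next row (same id and discrete_x) is its exact refund
    cancelled = coalesce(quant > 0 & lead(quant) == -quant, FALSE),
    # the refund row itself: the row right after a cancelled purchase
    refund = coalesce(lag(cancelled), FALSE)
  ) %>%
  ungroup() %>%
  filter(!cancelled, !refund) %>%
  select(-cancelled, -refund)

print(cleaned)
```

On the sample data this keeps the 2000 purchase (A) and the later 3000 purchase (B). Like the pipeline above, it still assumes the refund is the next row after its purchase within the group.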


Source: https://stackoverflow.com/questions/59753716/how-would-i-be-able-to-remove-opposite-values-e-g-refunds-in-panel-data
