How to extract certain rows

喜夏-厌秋 submitted on 2019-12-11 08:13:16

Question


As you can see, I have Price and Day columns below:

 Price  Day
    2   1
    5   2
    8   3
    11  4
    14  5
    17  6
    20  7
    23  8
    26  9
    29  10
    32  11
    35  12
    38  13
    41  14
    44  15
    47  16
    50  17
    53  18
    56  19
    59  20

I then want the output below

  Difference    Day
    12  5
    15  10
    15  15
    15  20

So now I have the difference in prices every 5 days. It basically subtracts the 1st day's price from the 5th day's, then the 5th day's from the 10th day's, and so on. I already wrote code that separates my data into 5-day intervals, but I want code that subtracts the 1st day from the 5th day, the 5th day from the 10th day, etc. So the code should look something like this:

difference<-tapply(Price[,1],Day, ____________)

So Price[,1] is my Price data, while "Day" is the variable I created to separate my Day data into 5-day intervals. I'm thinking that in the blank I could put a function or another variable that subtracts the 1st day's price from the 5th day's, then the 5th day's from the 10th day's, and so on. You don't have to help me separate my days into intervals, just with the "difference" part. Thanks, guys.


Answer 1:


Here's one option, assuming your data.frame is called "SODF":

within(SODF[c(1, seq(5, nrow(SODF), 5)), ], { 
  Price <- diff(c(0, Price)) 
})[-1, ]
#    Price Day
# 5     12   5
# 10    15  10
# 15    15  15
# 20    15  20

The first step is basic subsetting. According to your description and expected answer, you want the first row, and then every fifth row starting from row 5:

> SODF[c(1, seq(5, nrow(SODF), 5)), ]
   Price Day
1      2   1
5     14   5
10    29  10
15    44  15
20    59  20

From there, you can use diff on the "Price" column, but since diff returns a vector one element shorter than its input, you need to "pad" the input vector, which I did with diff(c(0, Price)).

# Correct values, but the number of rows needs to be 5
> diff(SODF[c(1, seq(5, nrow(SODF), 5)), "Price"])
[1] 12 15 15 15

Then, the [-1, ] at the end just deletes the extraneous row.
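For a run of the steps above that needs nothing else, here is a minimal sketch. It rebuilds the question's data as `SODF` (the name this answer assumes) and keeps the intermediate subset in `keep`, a name introduced only for illustration.

```r
# Rebuild the question's data; "SODF" is the name assumed in this answer
SODF <- data.frame(Price = seq(2, 59, by = 3), Day = 1:20)

# Step 1: keep row 1 plus every 5th row ("keep" is an illustrative name)
keep <- SODF[c(1, seq(5, nrow(SODF), 5)), ]

# Step 2: pad with 0 so diff() returns one value per row of keep
keep$Price <- diff(c(0, keep$Price))

# Step 3: drop the first row, whose value is only an artifact of the padding
result <- keep[-1, ]
print(result)
```

Splitting the steps this way trades the compactness of within() for intermediate objects you can inspect one at a time.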

Update

As @geektrader points out in the comments (thanks!), as an alternative to using:

SODF[c(1, seq(5, nrow(SODF), 5)), ]

as your input data.frame, you may consider using the following instead:

rbind(SODF[1, ], SODF[SODF$Day %% 5 == 0, ])

The difference in the two approaches is that the first approach simply subsets by row number, while the second approach subsets according to the value in the "Day" column, extracting rows where "Day" is a multiple of 5. This second approach might be useful, for instance, when there are missing rows in the dataset.
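To make the difference concrete, here is a small sketch that removes one row from the question's data and applies both subsetting styles. The gap at Day 3 and the names `SODF_gap`, `by_position`, and `by_value` are assumptions made purely for illustration.

```r
# Question's data with the Day 3 row removed to simulate a gap
# ("SODF_gap" is a hypothetical name used only for this illustration)
SODF <- data.frame(Price = seq(2, 59, by = 3), Day = 1:20)
SODF_gap <- SODF[SODF$Day != 3, ]

# Position-based: the 5th, 10th, 15th rows no longer hold Days 5, 10, 15
by_position <- SODF_gap[c(1, seq(5, nrow(SODF_gap), 5)), ]

# Value-based: still picks the rows whose Day is a multiple of 5
by_value <- rbind(SODF_gap[1, ], SODF_gap[SODF_gap$Day %% 5 == 0, ])

print(by_position$Day)
print(by_value$Day)
```

With one row missing, the position-based subset drifts to Days 6, 11, and 16, while the value-based subset still lands on Days 5, 10, 15, and 20.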




Answer 2:


Ananda's is a nice approach (I always forget about within myself). Here's another approach:

dat2 <- dat[seq(5, nrow(dat), by=5), ]
data.frame(Difference=diff(c(dat[1,1], dat2[, 1])), Day=dat2[, 2])
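To try this end to end, here is a minimal sketch that assumes `dat` holds the question's data with the columns in the order Price, Day. It refers to the columns by name instead of by position, which is a stylistic variation and not part of the original answer; `out` is an illustrative name.

```r
# Assumed input: the question's data, with columns in the order Price, Day
dat <- data.frame(Price = seq(2, 59, by = 3), Day = 1:20)

# Every 5th row, starting at row 5
dat2 <- dat[seq(5, nrow(dat), by = 5), ]

# Prepend the first day's price so the first difference is day 5 minus day 1
out <- data.frame(Difference = diff(c(dat$Price[1], dat2$Price)), Day = dat2$Day)
print(out)
```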



Answer 3:


Here is a solution if you have a matrix as input.

The function below, given a matrix m, a column col_id, and a numeric interval interv, takes the value in the col_id column at every interv-th row and subtracts from it the value at the previously selected row (interv rows earlier in the same column, except that the first difference is taken against row 1).

The results are stored in a new column called diff, appended as the last column of the returned matrix, which keeps only the selected rows.

In short, the approach is very similar to that used by @Ananda Mahto.

So, this is the function:

subtract_column <- function(m, col_id, interv) {
  select <- c(1, seq(interv, nrow(m), interv))
  cbind(m[select[-1], ], diff = diff(m[select, col_id]))
}

Example:

# this emulates your data as a matrix
price_vect <- c(2,5,8,11,14,17,20,23,26,29,32,35,38,41,44,47,50,53,56,59)
day_vect <- 1:20
matr <- do.call(cbind, list(price = price_vect, day = day_vect))
# and this calls the function above and does the job:
# for every 5th row, subtracts the previously selected value from the current one in column `price` of matrix `matr`
subtract_column(matr, 'price', 5)

Output:

     price day diff
[1,]    14   5   12
[2,]    29  10   15
[3,]    44  15   15
[4,]    59  20   15
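Because the interval is a parameter, the same function can be called with a different spacing. The sketch below repeats the function so it runs on its own and uses an interval of 10, a value chosen only for illustration; `res10` is likewise an illustrative name.

```r
# Same function as above, repeated so this snippet runs on its own
subtract_column <- function(m, col_id, interv) {
  select <- c(1, seq(interv, nrow(m), interv))
  cbind(m[select[-1], ], diff = diff(m[select, col_id]))
}

# The question's data as a matrix
matr <- cbind(price = seq(2, 59, by = 3), day = 1:20)

# A 10-row interval: day 10 minus day 1, then day 20 minus day 10
res10 <- subtract_column(matr, "price", 10)
print(res10)
```

This yields two rows, for days 10 and 20, with differences of 27 and 30.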


Source: https://stackoverflow.com/questions/15286287/how-to-extract-certain-rows
