Question
I have searched the archives to no avail for a solution to a problem involving the subsetting of 2 related data frames. One data frame is a key and the other is an annual list; I'd like to use the key to create a subset and an index. I have tried the subset functions, but my code does not meet my criteria. Here is the data:
players <- c('Albert Belle','Reggie Jackson', 'Reggie Jackson')
contract_start_season <- c(1999,1977,1982)
contract_end_season <- c(2003, 1981, 1985)
key <- data.frame (player = players, contract_start_season, contract_end_season)
player_data <- data.frame( season = c(seq(1975,1985),seq(1997,2003)), player = c(rep('Reggie Jackson',times=11),rep('Albert Belle', times=7)))
I want to use the key to subset the player data to the contract years: for Reggie Jackson, 1977 to 1981 and then 1982 to 1985, and for Albert Belle, 1999 to 2003. I'd also like to create an index within each contract, so that, for example, Reggie Jackson's 1977 season would be year 1, 1978 would be year 2, etc.
The code I have tried without merging looks like this and it isn't working:
player_data[player_data$season >= key$contract_start_season&player_data$season <= key$contract_end_season,]
I am also running into problems when merging, because Reggie Jackson has 2 different contracts and the merge tries to match both of them to each of his seasons.
Any help or advice on this would be super appreciated.
Answer 1:
Are you trying to do something along the following lines?
library(data.table)
key <- data.table(key)
player_data <- data.table(player_data)
# Add a season column (set to the contract start) so that both tables share a join column
key[,season := contract_start_season]
# Key both tables on the columns used for the join
setkeyv(key, c("player","season"))
setkeyv(player_data, c("player","season"))
# roll = Inf makes this a rolling join rather than an exact one: each season is matched to the most recent contract start at or before it
key[player_data, roll = Inf]
Output:
> key[player_data, roll = Inf]
            player season contract_start_season contract_end_season
 1:   Albert Belle   1997                    NA                  NA
 2:   Albert Belle   1998                    NA                  NA
 3:   Albert Belle   1999                  1999                2003
 4:   Albert Belle   2000                  1999                2003
 5:   Albert Belle   2001                  1999                2003
 6:   Albert Belle   2002                  1999                2003
 7:   Albert Belle   2003                  1999                2003
 8: Reggie Jackson   1975                    NA                  NA
 9: Reggie Jackson   1976                    NA                  NA
10: Reggie Jackson   1977                  1977                1981
11: Reggie Jackson   1978                  1977                1981
12: Reggie Jackson   1979                  1977                1981
13: Reggie Jackson   1980                  1977                1981
14: Reggie Jackson   1981                  1977                1981
15: Reggie Jackson   1982                  1982                1985
16: Reggie Jackson   1983                  1982                1985
17: Reggie Jackson   1984                  1982                1985
18: Reggie Jackson   1985                  1982                1985
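The joined table above still contains the seasons outside any contract, and it has no year index yet. As a rough sketch of the remaining steps, here is a base R alternative (not the data.table approach used above) that joins each season to every contract of the same player, keeps only the seasons inside the contract window, and numbers them from 1. The names merged, in_contract and contract_year are made up for illustration. Note that the comparison attempted in the question lines up 18 seasons against 3 contract rows, so R recycles the shorter vector instead of matching by player, which is why it does not give the intended rows.

```r
# Recreate the data from the question
key <- data.frame(
  player = c("Albert Belle", "Reggie Jackson", "Reggie Jackson"),
  contract_start_season = c(1999, 1977, 1982),
  contract_end_season = c(2003, 1981, 1985)
)
player_data <- data.frame(
  season = c(seq(1975, 1985), seq(1997, 2003)),
  player = c(rep("Reggie Jackson", times = 11), rep("Albert Belle", times = 7))
)

# Pair every season with every contract of the same player.
# Reggie Jackson's seasons appear twice here, once per contract.
merged <- merge(player_data, key, by = "player")

# Keep only the pairs where the season falls inside the contract window
inside <- merged$season >= merged$contract_start_season &
  merged$season <= merged$contract_end_season
in_contract <- merged[inside, ]
in_contract <- in_contract[order(in_contract$player, in_contract$season), ]
rownames(in_contract) <- NULL

# Hypothetical index column: 1 for the first season of a contract, 2 for the next, ...
in_contract$contract_year <-
  in_contract$season - in_contract$contract_start_season + 1

print(in_contract)
```

Because the index is computed from contract_start_season, it restarts at 1 for Reggie Jackson's second contract in 1982. If the rolling join is used instead, the same filter on contract_end_season would still be needed in general, since roll = Inf keeps carrying the last contract forward to any later season.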
Source: https://stackoverflow.com/questions/19468378/subsetting-and-merging-from-2-related-data-frames-in-r