Question
I'm trying to achieve a simple string comparison across two columns. Sample of (mocked up) data:
EMPLID,From_DeptCode,FromDept,To_DeptCode,To_Dept,TransactionTypeCode,TransactionType,EffectiveDate,ChangeType
0239583290,21,Sales,43,CustomerService,10,Promotion,12/12/2012
1230495829,21,Sales,21,Sales,10,Promotion,9/1/2013
4059503918,93,Operations,93,Operations,10,Demotion,11/18/2014
3040593021,19,Headquarters,23,International,11,Reorg,12/13/2011
7029406920,15,Marketing,84,Development,19,Reassignment,01/05/2010
2039052819,19,Headquarters,19,Headquarters,10,Promotion,4/15/2015
The logic I want to use is:
If From_DeptCode = To_DeptCode
then ChangeType="No Change"
ElseIf From_DeptCode != To_DeptCode AND TransactionType = "Reorg"
then ChangeType="Reorg"
Else ChangeType="Transfer"
So my output would look like:
EMPLID,From_DeptCode,FromDept,To_DeptCode,To_Dept,TransactionTypeCode,TransactionType,EffectiveDate,ChangeType
0239583290,21,Sales,43,CustomerService,10,Promotion,12/12/2012,Transfer
1230495829,21,Sales,21,Sales,10,Promotion,9/1/2013,No Change
4059503918,93,Operations,93,Operations,10,Demotion,11/18/2014,No Change
3040593021,19,Headquarters,23,International,11,Reorg,12/13/2011,Reorg
7029406920,15,Marketing,84,Development,19,Reassignment,01/05/2010,Transfer
2039052819,19,Headquarters,19,Headquarters,10,Promotion,4/15/2015,No Change
Here's what I know so far:
transfers <- read.csv(file="Transfers.csv", head=TRUE,
sep=",",colClasses=c(NA,NA,NA,NA,NA,NA,NA,"Date",NA))
At this point, I assume I would implement my logic:
If From_DeptCode = To_DeptCode
then ChangeType="No Change"
ElseIf From_DeptCode != To_DeptCode AND TransactionType = "Reorg"
then ChangeType="Reorg"
Else ChangeType="Transfer"
I assume that here I'd write out my new csv:
write.csv(transfers, file = "transfersprocessed.csv", row.names = FALSE)
Any advice on getting the rest of the way there?
Update:
Per answer from @josilber, I ran the following code:
transfers <- read.csv(file="Transfers.csv", head=TRUE, sep=",", colClasses=c(NA,NA,NA,NA,NA,NA,NA,"Date",NA))
dat$ChangeType <- ifelse(dat$From_DeptCode == dat$To_DeptCode, "No Change",ifelse(dat$TransactionType == "Reorg", "Reorg", "Transfer"))
View(transfers)
On the following data:
EMPLID,From_DeptCode,FromDept,To_DeptCode,To_Dept,TransactionTypeCode,TransactionType,EffectiveDate,ChangeType
0239583290,21,Sales,43,CustomerService,10,Promotion,12/12/2012
1230495829,21,Sales,21,Sales,10,Promotion,9/1/2013
4059503918,93,Operations,93,Operations,10,Demotion,11/18/2014
3040593021,19,Headquarters,23,International,11,Reorg,12/13/2011
7029406920,15,Marketing,84,Development,19,Reassignment,01/05/2010
2039052819,19,Headquarters,19,Headquarters,10,Promotion,4/15/2015
And the ChangeType variable remained "NA".
Is the nested ifelse statement syntax correct? Any idea why the ChangeType isn't working?
Answer 1:
You can do this with a nested ifelse statement:
dat$ChangeType <- ifelse(dat$From_DeptCode == dat$To_DeptCode, "No Change",
                         ifelse(dat$TransactionType == "Reorg", "Reorg", "Transfer"))
dat
#       EMPLID From_DeptCode     FromDept To_DeptCode         To_Dept TransactionTypeCode
# 1  239583290            21        Sales          43 CustomerService                  10
# 2 1230495829            21        Sales          21           Sales                  10
# 3 4059503918            93   Operations          93      Operations                  10
# 4 3040593021            19 Headquarters          23   International                  11
# 5 7029406920            15    Marketing          84     Development                  19
# 6 2039052819            19 Headquarters          19    Headquarters                  10
#   TransactionType EffectiveDate ChangeType
# 1       Promotion    12/12/2012   Transfer
# 2       Promotion      9/1/2013  No Change
# 3        Demotion    11/18/2014  No Change
# 4           Reorg    12/13/2011      Reorg
# 5    Reassignment    01/05/2010   Transfer
# 6       Promotion     4/15/2015  No Change
The ifelse function is passed a vector of TRUE/FALSE values as its first argument, using the second argument for the TRUE cases and the third argument for the FALSE cases. For your FALSE cases you actually want to run another ifelse, which is why the logic is nested here.
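As a small illustration of that element-by-element behavior, here is a minimal sketch on three hand-made vectors. The names from_code, to_code, and txn_type are hypothetical and chosen only so that each branch is hit once; they are not columns from the question's file.

```r
# Hypothetical toy vectors, one element per branch of the logic
from_code <- c(21, 19, 15)
to_code   <- c(21, 23, 84)
txn_type  <- c("Promotion", "Reorg", "Reassignment")

# The outer test is evaluated for every element; positions where it is
# FALSE take their value from the inner ifelse
change_type <- ifelse(from_code == to_code, "No Change",
                      ifelse(txn_type == "Reorg", "Reorg", "Transfer"))

print(change_type)
# [1] "No Change" "Reorg"     "Transfer"
```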
Note that for large data frames this will be a good deal quicker than looping through your data and doing the nested if statement one row at a time.
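Regarding the update in the question: the code there reads the file into transfers but assigns ChangeType on dat, so one likely reason the column stayed NA is that transfers was never modified. The sketch below uses a single object name throughout. It is only an illustration under some assumptions: the sample rows are inlined through the text argument so it runs without Transfers.csv, EMPLID is read as character to keep its leading zero, EffectiveDate is left as text, and the output goes to a temporary directory rather than the working directory.

```r
# Sample rows from the question, inlined so the sketch is self-contained
csv_text <- "EMPLID,From_DeptCode,FromDept,To_DeptCode,To_Dept,TransactionTypeCode,TransactionType,EffectiveDate,ChangeType
0239583290,21,Sales,43,CustomerService,10,Promotion,12/12/2012
1230495829,21,Sales,21,Sales,10,Promotion,9/1/2013
4059503918,93,Operations,93,Operations,10,Demotion,11/18/2014
3040593021,19,Headquarters,23,International,11,Reorg,12/13/2011
7029406920,15,Marketing,84,Development,19,Reassignment,01/05/2010
2039052819,19,Headquarters,19,Headquarters,10,Promotion,4/15/2015"

# Read into `transfers`; the rows lack a ninth field, so ChangeType starts as NA
transfers <- read.csv(text = csv_text, header = TRUE,
                      colClasses = c(EMPLID = "character"),
                      stringsAsFactors = FALSE)

# Apply the nested ifelse to the same object that was read in
transfers$ChangeType <- ifelse(
  transfers$From_DeptCode == transfers$To_DeptCode, "No Change",
  ifelse(transfers$TransactionType == "Reorg", "Reorg", "Transfer"))

# Write the processed data (assumption: a temp directory instead of the
# working directory used in the question)
out_path <- file.path(tempdir(), "transfersprocessed.csv")
write.csv(transfers, file = out_path, row.names = FALSE)

print(transfers[, c("EMPLID", "TransactionType", "ChangeType")])
```

With a real file, replacing text = csv_text with file = "Transfers.csv" should behave the same way, provided the column names match those in the sample.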
Source: https://stackoverflow.com/questions/30243992/using-r-to-process-csv-to-evaluate-if-cola-colb-with-consideration-for-col