问题
I have a dataframe in R that contains the following columns structure (in a bigger scale):
Material_code actual_202009 actual_202010 actual_202011 pred_202009 pred_202010 pred_202011
111 30 44 24 25 52 27
112 19 70 93 23 68 100
I would like to add new columns to the dataframe containing the respective error measure:
|actual - pred|/ actual * 100%
Obtaining this:
Material_code actual_202009 actual_202010 actual_202011 pred_202009 pred_202010 pred_202011 MAPE_202009 MAPE_202010 MAPE_202011
111 30 44 24 25 52 27 16.67% 18.18% 12.5%
112 19 70 93 23 68 100 21.05% 2.86% 7.52%
I tried to create the new columns using ends_with()
to select the previuous, but I am not getting it right. Can you please help?
*** EDIT to include easier way to generate the dataframe
df <- data.frame(Material_code = c(111,112),
actual_202009 = c(30,19),
actual_202010 = c(44,70),
actual_202011 = c(24,93),
pred_202009 = c(25,23),
pred_202010 = c(52,68),
pred_202011 = c(27,100))
回答1:
Get the column names of all 'actual'
and 'pred'
columns and you can perform all the mathematical calculations on them directly.
actual_cols <- sort(grep('actual', names(df), value = TRUE))
pred_cols <- sort(grep('pred', names(df), value = TRUE))
new_cols <- sub('pred', 'MAPE', pred_cols)
df[new_cols] <- abs(df[actual_cols] - df[pred_cols])/df[actual_cols] * 100
df
# Material_code actual_202009 actual_202010 actual_202011 pred_202009
#1 111 30 44 24 25
#2 112 19 70 93 23
# pred_202010 pred_202011 MAPE_202009 MAPE_202010 MAPE_202011
#1 52 27 16.7 18.18 12.50
#2 68 100 21.1 2.86 7.53
data
df <- structure(list(Material_code = 111:112, actual_202009 = c(30L,
19L), actual_202010 = c(44L, 70L), actual_202011 = c(24L, 93L
), pred_202009 = c(25L, 23L), pred_202010 = c(52L, 68L), pred_202011 = c(27L,
100L)), class = "data.frame", row.names = c(NA, -2L))
回答2:
A bit more verbose from the tidyverse:
library(tidyverse)
df %>%
pivot_longer(cols = -Material_code) %>%
separate(name, into = c("type", "time"), sep = "_") %>%
pivot_wider(names_from = type) %>%
mutate(MAPE = abs(actual - pred)/actual*100) %>%
pivot_wider(values_from = c(actual, pred, MAPE),
names_from = time)
gives:
# A tibble: 2 x 10
Material_code actual_202009 actual_202010 actual_202011 pred_202009 pred_202010 pred_202011 MAPE_202009 MAPE_202010 MAPE_202011
<int> <int> <int> <int> <int> <int> <int> <dbl> <dbl> <dbl>
1 111 30 44 24 25 52 27 16.7 18.2 12.5
2 112 19 70 93 23 68 100 21.1 2.86 7.53
回答3:
You will help yourself a lot if you try to keep your data in long format: each column has the same kind of data. Your table is in wide format, very useful for excel and human visualization, but very cumbersome to deal with in code.
So the first thing you need to do (that's what @deschen did in their answer) is converting your data to long, and then operate on it. A long version of your data will be of the form
Material_code Type Date Value
111 actual 202011 30
I will provide a data.table solution, that is basically the same as @deschen's. You may like this one for its speed on large data.
library(data.table)
setDT(df1)
df1[, melt(.SD, 1)][,
c("type", "date") := tstrsplit(variable, "_", fixed = TRUE)][,
dcast(.SD, Material_code + date ~ type)][,
mape := 100 * abs(actual - pred) / actual][]
melt(.SD, 1)
converts your table from wide to long, keeping only the first column as reference for each record.c("type", "date") := tstrsplit(variable, "_", fixed = TRUE)
creates columns type and date with the corresponding values taken fromvariable
(aftermelt
ing,variable
has the former column names).dcast(.SD, Material_code + date ~ type)
converts the long table into wide again. This time,Material_code
anddate
will be kept in columns, andtype
will be casted into new columnsactual
andpred
.- The
:=
is an assignment operator. It creates variablemape
and assigns the resulting value. - The last bit,
[]
isn't actually needed. Is there so the result is printed to screen. If you don't need to print the new table to screen, omit it.
来源:https://stackoverflow.com/questions/65935389/how-to-calculate-a-formula-that-takes-different-columns-of-a-dataframe-with-the