Question
I have a data frame like this:
id <- c("1", "2", "3")
col <- c("CHB_len_SCM_max", "CHB_brf_SCM_min", "CHB_PROC_S_SV_mean")
df <- data.frame(id, col)
I want to split "col" into two columns, Measurement and stat, where stat is the text after the last underscore (max, min, mean, etc.).
My desired output is:
id Measurement stat
1 CHB_len_SCM max
2 CHB_brf_SCM min
3 CHB_PROC_S_SV mean
I tried it this way, but the stat column is empty. I am not sure whether my regex is pointing to the last underscore.
library(tidyverse)
df1 <- df %>%
# Separate the sensors and the summary statistic
separate(col, into = c("Measurement", "stat"),sep = '\\_[^\\_]*$')
What am I missing here? Can someone point me in the right direction?
Answer 1:
We could use extract, capturing two groups and making sure that the second group has one or more characters that are not a _ until the end ($) of the string.
library(tidyverse)
df %>%
extract(col, into = c("Measurement", "stat"), "(.*)_([^_]+)$")
# id Measurement stat
#1 1 CHB_len_SCM max
#2 2 CHB_brf_SCM min
#3 3 CHB_PROC_S_SV mean
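For anyone who wants to check the pattern without loading tidyverse, here is a minimal base R sketch of the same two-group regex using sub(). The vector col is copied from the question; the names pattern, measurement and stat are only illustrative and not part of the original answer.

```r
# Same sample values as in the question
col <- c("CHB_len_SCM_max", "CHB_brf_SCM_min", "CHB_PROC_S_SV_mean")

# Greedy (.*) takes everything up to the last underscore;
# ([^_]+)$ takes the underscore-free tail at the end of the string
pattern <- "(.*)_([^_]+)$"
measurement <- sub(pattern, "\\1", col)  # keep group 1
stat <- sub(pattern, "\\2", col)         # keep group 2

print(data.frame(Measurement = measurement, stat = stat))
```

Because (.*) is greedy, the literal _ in the pattern can only land on the last underscore, which is what makes the split happen there.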
Or use separate with a regex lookahead:
df %>%
separate(col, into = c("Measurement", "stat"), sep="_(?=[^_]+$)")
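As for why the attempt in the question leaves stat empty: its sep pattern matches the last underscore together with everything after it, so the statistic itself is treated as part of the separator and thrown away. The lookahead version matches only the underscore. A rough base R check of what each pattern actually matches is sketched below; perl = TRUE is needed here because base R's default regex engine does not support lookahead, and the names consumed and kept are only illustrative.

```r
col <- c("CHB_len_SCM_max", "CHB_brf_SCM_min", "CHB_PROC_S_SV_mean")

# Pattern from the question (escaping _ is not required, so it is dropped here):
# the match swallows the statistic along with the underscore
consumed <- regmatches(col, regexpr("_[^_]*$", col))
print(consumed)
# "_max"  "_min"  "_mean"

# Lookahead pattern: only the underscore is matched,
# so the text after it survives as the second piece
kept <- regmatches(col, regexpr("_(?=[^_]+$)", col, perl = TRUE))
print(kept)
# "_" "_" "_"
```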
Source: https://stackoverflow.com/questions/50518137/separate-a-column-into-2-columns-at-the-last-underscore-in-r