Question
This is a continuation of my earlier question: Dplyr select_ and starts_with on multiple values in a variable list
I am collecting data from different sensors in various locations. The data output looks something like this:
df <- data.frame(
  date = c(2011, 2012, 2013, 2014, 2015),
  "Sensor1 Temp" = c(15, 18, 15, 14, 19),
  "Sensor1 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor1a Temp" = c(15, 18, 15, 14, 19),
  "Sensor1a Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 Temp" = c(15, 18, 15, 14, 19),
  "Sensor2 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 DewPoint" = c(10, 11, 10, 9, 12),
  "Sensor2 Humidity" = c(90, 100, 90, 100, 80)
)
The problem is (I think) similar to: Using select_ and starts_with R or select columns based on multiple strings with dplyr
I want to search for sensors, for example by location, so I have a list of names to look for in the data frame, and I also want to include the timestamp. But the search falls apart when I look for more than one sensor (or type of sensor, etc.). Is there a way of using dplyr (NSE or SE) to achieve this?
FindLocation = c("date", "Sensor1", "Sensor2")
df %>% select(matches(paste(FindLocation, collapse="|"))) # works but picks up "Sensor1a" and "DewPoint" and "Humidity" data from Sensor2
Also I want to add mixed searches such as:
FindLocation = c("Sensor1", "Sensor2") # without selecting "Sensor1a"
FindSensor = c("Temp", "Pressure") # without selecting "DewPoint" or "Humidity"
I am hoping the select can combine FindSensor with FindLocation and pick the Temp and Pressure data for Sensor1 and Sensor2 (without selecting Sensor1a), returning the data frame with the data and these column headings:
date, Sensor1 Temp, Sensor1 Pressure, Sensor2 Temp, Sensor2 Pressure
Many thanks again!
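A short sketch of why the column names matter here: with its default check.names = TRUE, data.frame() passes names through make.names(), so the space in "Sensor1 Temp" becomes a dot. The two-row data frame below is a cut-down, illustrative version of the one in the question, not the original data.

```r
# Cut-down version of the question's data frame (illustrative values only)
df_small <- data.frame(
  date = c(2011, 2012),
  "Sensor1 Temp" = c(15, 18),
  "Sensor1a Temp" = c(15, 18)
)
names(df_small)
# "date" "Sensor1.Temp" "Sensor1a.Temp"

# With check.names = FALSE the original names, spaces included, are kept
df_raw <- data.frame(
  date = c(2011, 2012),
  "Sensor1 Temp" = c(15, 18),
  "Sensor1a Temp" = c(15, 18),
  check.names = FALSE
)
names(df_raw)
# "date" "Sensor1 Temp" "Sensor1a Temp"
```

So any name built for selection has to use "." as the separator unless the data frame is created with check.names = FALSE.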
Answer 1:
Some functions from purrr are going to be useful. First, use cross2 to compute the Cartesian product of FindLocation and FindSensor, which gives a list of pairs. Then use map_chr to apply paste to each pair, joining the location and sensor strings with a dot (.). Finally, use the one_of helper to select the columns.
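library(dplyr)  # for select() and one_of()
library(purrr)
FindLocation = c("Sensor1", "Sensor2")
FindSensor = c("Temp", "Pressure")
# Build every "location.sensor" name, e.g. "Sensor1.Temp"
columns = cross2(FindLocation, FindSensor) %>%
  map_chr(paste, collapse = ".")
df %>% select(one_of(columns))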
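For readers on newer package versions, here is a possible variant of the same idea. cross2() was deprecated in purrr 1.0.0 in favour of tidyr::expand_grid(), and one_of() has been superseded by any_of() and all_of(). This sketch assumes dplyr and tidyr are installed, rebuilds the question's data frame so it runs on its own, and adds "date" to the selection because the question asks for it. The names combos and result are illustrative.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  date = c(2011, 2012, 2013, 2014, 2015),
  "Sensor1 Temp" = c(15, 18, 15, 14, 19),
  "Sensor1 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor1a Temp" = c(15, 18, 15, 14, 19),
  "Sensor1a Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 Temp" = c(15, 18, 15, 14, 19),
  "Sensor2 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 DewPoint" = c(10, 11, 10, 9, 12),
  "Sensor2 Humidity" = c(90, 100, 90, 100, 80)
)

FindLocation <- c("Sensor1", "Sensor2")
FindSensor <- c("Temp", "Pressure")

# One row per location/sensor combination
combos <- expand_grid(location = FindLocation, sensor = FindSensor)

# Exact column names, joined with "." because data.frame() replaced the spaces
columns <- paste(combos$location, combos$sensor, sep = ".")

# all_of() matches names exactly and errors if one is missing
result <- df %>% select(all_of(c("date", columns)))
names(result)
# "date" "Sensor1.Temp" "Sensor1.Pressure" "Sensor2.Temp" "Sensor2.Pressure"
```

Because the names are matched exactly rather than as patterns, Sensor1a and the DewPoint and Humidity columns are never picked up. Swap all_of() for any_of() if some combinations may not exist in the data.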
Answer 2:
We can use outer() to build every location/sensor combination and pass the combined pattern to matches():
df %>%
  select(matches(paste(c("date", outer(FindLocation, FindSensor, paste, sep = ".")),
                       collapse = "|")))
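One thing to keep in mind with the matches() approach: the pattern is a regular expression, so an unescaped "." matches any character and an unanchored pattern can match in the middle of a name. It happens to work for this data, but a stricter variant might escape the dot and anchor the whole alternation. The sketch below assumes dplyr is installed and rebuilds the question's data frame; pattern and result are illustrative names.

```r
library(dplyr)

df <- data.frame(
  date = c(2011, 2012, 2013, 2014, 2015),
  "Sensor1 Temp" = c(15, 18, 15, 14, 19),
  "Sensor1 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor1a Temp" = c(15, 18, 15, 14, 19),
  "Sensor1a Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 Temp" = c(15, 18, 15, 14, 19),
  "Sensor2 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 DewPoint" = c(10, 11, 10, 9, 12),
  "Sensor2 Humidity" = c(90, 100, 90, 100, 80)
)

FindLocation <- c("Sensor1", "Sensor2")
FindSensor <- c("Temp", "Pressure")

# Join each pair with an escaped dot so "." is matched literally
combos <- outer(FindLocation, FindSensor, paste, sep = "\\.")

# Anchor with ^ and $ so only whole column names match
pattern <- paste0("^(", paste(c("date", combos), collapse = "|"), ")$")

result <- df %>% select(matches(pattern))
names(result)
# "date" "Sensor1.Temp" "Sensor1.Pressure" "Sensor2.Temp" "Sensor2.Pressure"
```

The selected columns come back in the order they appear in df, not in the order of the pattern.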
Answer 3:
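What about something like this?
library(tidyverse)
# Split each column name at the dot, then keep names whose first part is a
# wanted location and whose second part is a wanted sensor
which_col <- df %>%
  names %>%
  strsplit("[.]") %>%
  map_lgl(function(x) x[1] %in% FindLocation & x[2] %in% FindSensor)
df[which_col]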
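The split-and-test idea can also be written in base R, which may help show what each step does. Note that the "date" column has no second part after the split, so it fails the test and has to be added back explicitly. This is a sketch that rebuilds the question's data frame; parts, keep and result are illustrative names.

```r
df <- data.frame(
  date = c(2011, 2012, 2013, 2014, 2015),
  "Sensor1 Temp" = c(15, 18, 15, 14, 19),
  "Sensor1 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor1a Temp" = c(15, 18, 15, 14, 19),
  "Sensor1a Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 Temp" = c(15, 18, 15, 14, 19),
  "Sensor2 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 DewPoint" = c(10, 11, 10, 9, 12),
  "Sensor2 Humidity" = c(90, 100, 90, 100, 80)
)

FindLocation <- c("Sensor1", "Sensor2")
FindSensor <- c("Temp", "Pressure")

# "Sensor1.Temp" -> c("Sensor1", "Temp"); "date" -> "date"
parts <- strsplit(names(df), ".", fixed = TRUE)

# TRUE when the first part is a wanted location and the second a wanted sensor
keep <- vapply(
  parts,
  function(x) x[1] %in% FindLocation && x[2] %in% FindSensor,
  logical(1)
)

# The timestamp column never passes the test above, so keep it explicitly
keep <- keep | names(df) == "date"

result <- df[keep]
names(result)
# "date" "Sensor1.Temp" "Sensor1.Pressure" "Sensor2.Temp" "Sensor2.Pressure"
```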
Source: https://stackoverflow.com/questions/45375409/dplyr-select-and-starts-with-on-multiple-values-in-a-variable-list-part-2