Question
I need some help. I'm currently trying to fit a linear model to hourly electricity prices, so I was thinking of creating a dummy variable that takes the value 1 if the hour of the day is between 06:00 and 20:00. Unfortunately, I have struggled so far.
time.cet <- as.POSIXct(time.numeric, origin = "1970-01-01", tz=local.time.zone)
hours.S <- strftime(time.cet, format = "%H:%M:%S", tz=local.time.zone)
head(time.cet)
[1] "2007-01-01 00:00:00 CET" "2007-01-01 01:00:00 CET" "2007-01-01 02:00:00 CET"
[4] "2007-01-01 03:00:00 CET" "2007-01-01 04:00:00 CET" "2007-01-01 05:00:00 CET"
I hope someone can help.
Answer 1:
When I work with time cutoffs, I like to store the cutoffs as objects. That way, if you need to change a cutoff, you only change the object's value instead of every conditional statement that uses it.
My code below uses lubridate, which is a great package for managing times and dates.
It should give you what you need to incorporate a dummy variable into your analysis.
###
### Load Package
###
library(lubridate)
###
### Designate Time Cut-Offs
###
Beginning <- hms("06:00:00")
End <- hms("20:00:00")
###
### Designate Test Cut-Offs
###
Test.1 <- hms("5:00:00")
Test.2 <- hms("11:00:00")
###
### Test Conditional Logic
###
### Value will be 1 if time is between, value will be 0 if it is not.
###
ifelse( ((Test.1 >= Beginning) & (Test.1 <= End)) , 1, 0)
########## This should (and does) return a 0
ifelse( ((Test.2 >= Beginning) & (Test.2 <= End)) , 1, 0)
####### This should (and does) return a 1
###
### Create New Variable On Previous Data Frame (Your.DF) named Time.Dummy
###
### Value for new variable will be 1 if time is between, value will be 0 if it is not.
###
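### time.cet holds full date-times, so compare the time of day instead.
### hours.S (the "%H:%M:%S" strings from the question) is parsed with hms().
Time.Of.Day <- hms(hours.S)
Your.DF$Time.Dummy <- ifelse( ((Time.Of.Day >= Beginning) & (Time.Of.Day <= End)) , 1, 0)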
Answer 2:
ifelse() statements are a convenient way to create a dummy variable. I don't know much about working with time personally, but creating a dummy variable would take a form similar to:
# assumes `time` holds the hour of the day as a number (0-23)
dummy <- with(data, ifelse(time > 6 & time < 20, 1, 0))
Here data is whatever your data frame is called, and time is the column in which your times are stored. You may need to adjust the conditions if the times don't behave like ordinary numeric vectors (which I assume, for this purpose, they will).
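If the times are stored as POSIXct date-times, as in the question, they will not compare directly against 6 and 20. A minimal base R sketch of one way to get a numeric hour of day first is shown here. The example timestamps and the "Europe/Berlin" zone are assumptions made for illustration (the question's local.time.zone is not shown), and the bounds are inclusive, following the question's wording.

```r
# Hypothetical example data: 24 hourly timestamps (time zone is an assumption)
time.cet <- seq(as.POSIXct("2007-01-01 00:00:00", tz = "Europe/Berlin"),
                by = "hour", length.out = 24)

# Break the date-times into components and build a numeric hour of day
lt <- as.POSIXlt(time.cet)
hour.of.day <- lt$hour + lt$min / 60   # e.g. 6.5 for 06:30

# 1 between 06:00 and 20:00 (inclusive), 0 otherwise
dummy <- ifelse(hour.of.day >= 6 & hour.of.day <= 20, 1, 0)
table(dummy)
```

Using a decimal hour keeps the comparison purely numeric, so the same condition still works if the data are ever sampled more often than once per hour.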
Answer 3:
library(lubridate)
# Create fake data
set.seed(2)
dat = data.frame(time = seq(ymd_hms("2016-01-01 00:00:00"), ymd_hms("2016-01-31 00:00:00"), by="hour"))
dat$price = 1 + cumsum(rnorm(nrow(dat), 0, 0.01))
# Create time dummy
dat$dummy = ifelse(hour(dat$time) >=6 & hour(dat$time) <= 20, 1, 0)
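Since the goal in the question is a linear model, here is a rough sketch of how such a dummy could enter lm(). It rebuilds similar fake data in base R so that it runs without lubridate; the 30-day range, the UTC zone, and the model formula price ~ dummy are assumptions for illustration, not a recommendation for how to model electricity prices.

```r
# Fake hourly data, similar in spirit to the example above (30 full days)
set.seed(2)
dat <- data.frame(
  time = seq(as.POSIXct("2016-01-01 00:00:00", tz = "UTC"),
             by = "hour", length.out = 24 * 30)
)
dat$price <- 1 + cumsum(rnorm(nrow(dat), 0, 0.01))

# Hour of day via base R, then the same inclusive 6-20 dummy
hr <- as.POSIXlt(dat$time)$hour
dat$dummy <- ifelse(hr >= 6 & hr <= 20, 1, 0)

# The dummy's coefficient is the average price difference between
# hours inside the window and hours outside it
fit <- lm(price ~ dummy, data = dat)
coef(fit)
```

Note that with whole hours, the condition hr <= 20 also flags the 20:00 observation itself; use hr < 20 if the window should stop before it.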
Answer 4:
Try to include reproducible code next time. It looks like you're missing time.numeric, for instance.
Okay, I had to make up some random times.
time.cet <- c( ymd_hms( "2007-01-01 00:00:00" ),
ymd_hms( "2007-01-01 06:00:00" ),
ymd_hms( "2007-01-01 12:00:00" ) )
time.cet
[1] "2006-12-31 18:00:00 CST" "2007-01-01 00:00:00 CST" "2007-01-01 06:00:00 CST"
Note the time zone shift in the printed values (my session displays them in CST); it does not matter for the solution.
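The shift does not change the technique, but on real data it is worth checking which zone the hour is extracted in, because the same instant falls into a different hour of the day in each zone. A small base R sketch follows; the zone names are assumptions chosen to mirror the printed output.

```r
# One instant, viewed in two time zones
x <- as.POSIXct("2007-01-01 00:00:00", tz = "UTC")

h.utc <- as.POSIXlt(x, tz = "UTC")$hour              # hour of day in UTC
h.cst <- as.POSIXlt(x, tz = "America/Chicago")$hour  # hour of day in US Central

c(UTC = h.utc, CST = h.cst)
```

Midnight UTC is 18:00 on the previous day in US Central time, so a 6-to-20 dummy would be 0 in one zone and 1 in the other for the same observation.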
You can use dplyr::between and lubridate::hour to get a vector of TRUE/FALSE values (or 1/0) indicating whether a time X is between A and B.
library(dplyr)
library(lubridate)
A <- 6
B <- 20
between( hour(time.cet), A, B )
[1] TRUE FALSE TRUE
Note that between is inclusive on both ends (>= and <=).
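To make the boundary behaviour concrete, here is a small base R sketch on a few hand-picked hours (the hours vector is made up for illustration). It writes the inclusive comparison out with >= and <=, converts the logical result to 0/1, and shows a half-open variant for the case where the upper bound should be excluded.

```r
hours <- c(5, 6, 12, 20, 21)

# Inclusive on both ends, the same condition between(hours, 6, 20) tests
in.window <- hours >= 6 & hours <= 20
as.integer(in.window)
# → 0 1 1 1 0

# Half-open alternative: includes 6, excludes 20
half.open <- as.integer(hours >= 6 & hours < 20)
half.open
# → 0 1 1 0 0
```

as.integer() turns TRUE/FALSE into 1/0, which is all a dummy variable needs; lm() also accepts the logical vector directly.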
Source: https://stackoverflow.com/questions/44466780/creating-a-dummy-variable-for-certain-hours-of-the-day