The data I'm trying to convert is supposed to be a date; however, it is formatted as mmddyyyy with no separation by dashes or slashes. In order to work with dates in R, I wo
Updated: Improved with @Richard Scriven's colClasses and simpler as.Date() suggestions
Here are two similar methods that worked for me, going from a csv containing dates in mmddyyyy format to having R recognize them as date objects.
Starting first with a simple file tv.csv:
Series,FirstAir
Quantico,09272015
Muppets,09222015
Once within R,
> t = read.csv('tv.csv', colClasses = 'character')
This reads tv.csv as a data frame named t.
The colClasses = 'character' option causes all the data to be considered the character data type (instead of Factor or int types).
Examine its initial structure:
> str(t)
'data.frame': 2 obs. of 2 variables:
$ Series : chr "Quantico" "Muppets"
$ FirstAir: chr "09272015" "09222015"
Both columns are now chr. The chr values, i.e. character strings, are then easily converted into a date:
> t$FirstAir = as.Date(t$FirstAir, "%m%d%Y")
as.Date() performs the string-to-date conversion.
%m%d%Y specifies how to interpret the input in t$FirstAir. These format codes are documented in R under ?strptime; on Linux, they can also be found by running $ man date, which brings up the manual for the date program and its list of formatting codes. For example, it says %m month (01..12).
If for some reason you don't want a blanket conversion of every column to character on import, for example because the file has many variables and you wish to leave R's automatic type recognition in use and merely "fix" the one date variable, follow this second method.
Once within R,
> t = read.csv('tv.csv')
This reads tv.csv as a data frame named t.
Examine its initial structure:
> str(t)
'data.frame': 2 obs. of 2 variables:
$ Series : Factor w/ 2 levels "Muppets","Quantico": 2 1
$ FirstAir: int 9272015 9222015
>
For the FirstAir variable, R has imported 09272015 as int (integer) and dropped the leading zero. The 0 in 09 matters later for the date conversion, so we need to restore it.
This can be done in a single command, but for clarity I have broken it into two steps. First,
> t$FirstAir = sprintf("%08d", t$FirstAir)
sprintf is a formatting function.
0 means pad with zeroes.
8 means ensure 8 characters, because mmddyyyy is 8 characters in total.
d is used when the input is an integer, which it currently is; recall that the str() output reported t$FirstAir as int, meaning integer.
t$FirstAir is the variable we are both setting and using as input.
Check the result:
> str(t$FirstAir)
chr [1:2] "09272015" "09222015"
This converted the int to a chr type; for example, 9272015 became "09272015".
Now that it is a string (chr) type, we can convert it, same as method 1.
> t$FirstAir = as.Date(strptime(t$FirstAir, "%m%d%Y"))
We do a final check:
> str(t$FirstAir)
Date[1:2], format: "2015-09-27" "2015-09-22"
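The two steps above (zero-padding, then parsing) can also be collapsed into one expression. The following is only a minimal sketch of that single-command form; the vector name first_air is made up for illustration and stands in for the integer column as read.csv imported it.

```r
# Hypothetical stand-in for t$FirstAir after a plain read.csv():
# integers whose leading zero has already been dropped
first_air <- c(9272015L, 9222015L)

# Pad back to 8 characters with sprintf(), then parse as month-day-year
dates <- as.Date(sprintf("%08d", first_air), format = "%m%d%Y")

print(dates)
print(class(dates))
```

Keeping the two steps separate, as in the walkthrough, makes it easier to inspect the padded strings before parsing them.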
In both cases, what were originally plain values in a text file have now been successfully converted into R date objects.
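A possible middle ground between the two methods is to pass colClasses a named vector, so that only the date column is forced to character while the other columns keep R's automatic type detection. This is a sketch rather than part of the original walkthrough; the csv is inlined through the text argument so the example is self-contained, and the names csv_text, tv and raw_first_air are made up for illustration.

```r
# Inline copy of the sample file so the sketch runs without tv.csv on disk
csv_text <- "Series,FirstAir
Quantico,09272015
Muppets,09222015"

# Named colClasses: only FirstAir is read as character,
# every other column is still auto-detected
tv <- read.csv(text = csv_text, colClasses = c(FirstAir = "character"))

# The leading zeros survive because the column was never an integer
raw_first_air <- tv$FirstAir

# Parse the mmddyyyy strings into Date objects
tv$FirstAir <- as.Date(tv$FirstAir, format = "%m%d%Y")

str(tv)
```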
Have a look at lubridate's mdy function:
require(lubridate)
a <- "10281994"
mdy(a)
gives you
[1] "1994-10-28 UTC"
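of class "POSIXct" "POSIXt", so a datetime in R (thanks Joshua Ulrich for the correction). Note that more recent lubridate releases return a Date object from mdy() unless a tz argument is supplied.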
You could use as.Date(mdy(a)), which gives 1994-10-28, to get an object of class Date.
There are variants like ymd and dmy within lubridate as well.
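For comparison, the same three field orderings can be expressed in base R with as.Date() format strings. This is a sketch that does not use lubridate; the sample strings and variable names are made up, and all three encode the same day.

```r
# The same date written in three different field orders (made-up samples)
a_mdy <- "10281994"  # month, day, year
a_ymd <- "19941028"  # year, month, day
a_dmy <- "28101994"  # day, month, year

# The format string states the order explicitly instead of the function name
d_mdy <- as.Date(a_mdy, format = "%m%d%Y")
d_ymd <- as.Date(a_ymd, format = "%Y%m%d")
d_dmy <- as.Date(a_dmy, format = "%d%m%Y")

print(c(d_mdy, d_ymd, d_dmy))
```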