After importing a file, I always try to remove spaces from the column names to make them easier to refer to.
Is there a better way to do this other than
It's often convenient to change the names of your columns within one chunk of dplyr code rather than renaming the columns after you've created the data frame. Piping in rename_all() is very useful in these situations:
ctm2 %>% rename_all(function(x) gsub(" ", "_", x))
The code above will replace all spaces in every column name with an underscore.
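For what it's worth, rename_all() is marked as superseded in dplyr 1.0.0 and later, with rename_with() as the suggested replacement. A minimal sketch of the same idea, assuming a reasonably recent dplyr; the data frame below is made up purely for illustration:

```r
library(dplyr)

# Hypothetical example data with spaces in the column names
ctm2 <- data.frame(
  `Country Code` = c("US", "DE"),
  `Total Sales 2020` = c(10, 20),
  check.names = FALSE  # keep the spaces so there is something to clean
)

# rename_with() applies the function to every column name by default
ctm2_clean <- ctm2 %>% rename_with(~ gsub(" ", "_", .x))

names(ctm2_clean)
# "Country_Code" "Total_Sales_2020"
```

The rename_all() call above still works; this is just the spelling newer dplyr versions point you to.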
There is an easy way to remove spaces in column names with data.table. You will have to convert your data frame to a data.table first.
setnames(x=DT, old=names(DT), new=gsub(" ","",names(DT)))
For example, "Country Code" will be converted to "CountryCode".
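A self-contained sketch of the data.table route, including the conversion step. The data frame and its column names are invented for the example; as.data.table() makes a copy, and setnames() then renames that copy by reference, so no reassignment is needed:

```r
library(data.table)

# Hypothetical data frame with spaces in the column names
df <- data.frame(
  `Country Code` = c("US", "DE"),
  `GDP per capita` = c(1, 2),
  check.names = FALSE
)

# Convert to a data.table (a copy; df itself is left untouched)
DT <- as.data.table(df)

# setnames() modifies DT in place
setnames(DT, old = names(DT), new = gsub(" ", "", names(DT)))

names(DT)
# "CountryCode" "GDPpercapita"
```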
To replace only the first space in each column name, you could also do:
names(ctm2) <- sub(" ", ".", names(ctm2))
or to replace all spaces (which seems like it would be a little more useful):
names(ctm2) <- gsub(" ", "_", names(ctm2))
or, as mentioned in the first answer (though not in a way that would fix all spaces):
spaceless <- function(x) {colnames(x) <- gsub(" ", "_", colnames(x));x}
newDF <- spaceless(ctm2)
where x is the name of your data.frame. I prefer to use "_" to avoid issues with "." as part of an ID.
The point is that gsub doesn't stop at the first instance of a pattern match.
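The difference between sub and gsub is easiest to see on a name with more than one space. A small base R sketch, with made-up column names:

```r
# Hypothetical column names, one of them with two spaces
nm <- c("Country Code", "Total Sales 2020")

# sub() replaces only the first match in each element
first_only <- sub(" ", "_", nm)
first_only
# "Country_Code" "Total_Sales 2020"

# gsub() replaces every match in each element
all_spaces <- gsub(" ", "_", nm)
all_spaces
# "Country_Code" "Total_Sales_2020"
```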
Assign the names like this. It works best because it replaces every whitespace character in the names (tabs and newlines as well as spaces) with an underscore.
names(ctm2)<-gsub("\\s","_",names(ctm2))
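To show what "\\s" buys you over a literal space, here is a small base R comparison on invented names. The "\\s+" variant is an extra suggestion, not part of the answer above: it collapses a run of whitespace into a single underscore.

```r
# Hypothetical names: a double space in one, a tab in the other
nm <- c("Country  Code", "Total\tSales")

# A literal space misses the tab
literal <- gsub(" ", "_", nm)
# "Country__Code" "Total\tSales"

# "\\s" matches any single whitespace character
any_ws <- gsub("\\s", "_", nm)
# "Country__Code" "Total_Sales"

# "\\s+" treats each run of whitespace as one match
collapsed <- gsub("\\s+", "_", nm)
# "Country_Code" "Total_Sales"
```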
There is a more elegant and general solution for this:
tidy.name.vector <- make.names(name.vector, unique=TRUE)
make.names() makes syntactically valid names out of character vectors. A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number.
Additionally, the flag unique=TRUE lets you avoid possible duplicates in the new column names.
In code:
library(readr)  # provides read_delim()
d <- read_delim(urltxt, delim = "\t")
names(d) <- make.names(names(d), unique = TRUE)
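To see what make.names() actually does to problem names, here is a small base R sketch on an invented vector. Note that spaces become dots rather than underscores, and a name starting with a digit gets an "X" prefix:

```r
# Hypothetical names: a space, a leading digit, and a duplicate
name.vector <- c("Country Code", "2020 total", "Country Code")

tidy.name.vector <- make.names(name.vector, unique = TRUE)
tidy.name.vector
# "Country.Code" "X2020.total" "Country.Code.1"
```

If you prefer underscores, you can follow up with gsub(".", "_", tidy.name.vector, fixed = TRUE).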
There is a very useful package for this called janitor, which makes cleaning up column names very simple. By default, its clean_names() function converts names to lowercase, removes special characters, and replaces spaces with _.
library(janitor)
library(dplyr)  # provides the %>% pipe
# can be done by simply
ctm2 <- clean_names(ctm2)
# or by piping through dplyr
ctm2 <- ctm2 %>%
  clean_names()
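A runnable sketch of what that looks like on a toy data frame, assuming janitor is installed and using its default snake case setting. The data and column names are made up for illustration:

```r
library(janitor)

# Hypothetical data frame with spaces and mixed case in the column names
ctm2 <- data.frame(
  `Country Code` = c("US", "DE"),
  `Total Sales` = c(10, 20),
  check.names = FALSE
)

cleaned <- clean_names(ctm2)

names(cleaned)
# "country_code" "total_sales"
```

Keep in mind that this also lowercases the names, which the gsub approaches above do not.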