I have a table called LOAN containing a column named RATE, in which the observations are given as percentages, for example 14.49%. How can I format the table so that all values in
This can be achieved using the mutate verb from dplyr (loaded as part of the tidyverse), which in my opinion is more readable.
To illustrate, I create a dataset called LOAN with a RATE column that mimics the problem above.
library(tidyverse)
LOAN <- data.frame("SN" = 1:4, "Age" = c(21,47,68,33),
"Name" = c("John", "Dora", "Ali", "Marvin"),
"RATE" = c('16%', "24.5%", "27.81%", "22.11%"),
stringsAsFactors = FALSE)
head(LOAN)
SN Age Name RATE
1 1 21 John 16%
2 2 47 Dora 24.5%
3 3 68 Ali 27.81%
4 4 33 Marvin 22.11%
In what follows, mutate alters the column's content: gsub performs the substitution (replacing % with ""), and as.numeric converts the result to a numeric value, which keeps the data-cleaning flow easy to follow.
LOAN <- LOAN %>% mutate(RATE = as.numeric(gsub("%", "", RATE)))
head(LOAN)
SN Age Name RATE
1 1 21 John 16.00
2 2 47 Dora 24.50
3 3 68 Ali 27.81
4 4 33 Marvin 22.11
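To confirm that the column is really numeric afterwards, one can check its type and run some arithmetic on it. The following is a minimal sketch in base R (no tidyverse needed), rebuilding a reduced version of the LOAN data so that it runs on its own; the reduced column set is an assumption made for brevity.

```r
# Reduced copy of the example data (only the columns needed here).
LOAN <- data.frame(SN = 1:4,
                   RATE = c("16%", "24.5%", "27.81%", "22.11%"),
                   stringsAsFactors = FALSE)

# Same idea as the mutate() call, written in base R.
LOAN$RATE <- as.numeric(gsub("%", "", LOAN$RATE))

# The column is now numeric, so arithmetic works on it.
is.numeric(LOAN$RATE)
mean(LOAN$RATE)
```

If the values are needed as proportions rather than percentages, dividing the converted column by 100 is enough.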
LOAN$RATE <- sapply(LOAN$RATE, function(x) gsub("%", "", x))
Items that appear to be character when printed, but that R treats otherwise, are generally objects of class factor. I'm also guessing that you are not going to be happy with the list output that strsplit will return. Try:
gsub("%", "", as.character(LOAN$RATE))
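Note that this call returns character strings, so one more step is needed to get numbers. A minimal sketch of the whole conversion, assuming RATE was read in as a factor (the small LOAN data frame below is made up for illustration):

```r
# Assumed setup: RATE stored as a factor, as data.frame() did by default before R 4.0.
LOAN <- data.frame(SN = 1:3,
                   RATE = c("16%", "24.5%", "27.81%"),
                   stringsAsFactors = TRUE)
class(LOAN$RATE)  # "factor"

# Convert to character, strip the percent sign, then convert to numeric.
LOAN$RATE <- as.numeric(gsub("%", "", as.character(LOAN$RATE)))
class(LOAN$RATE)  # "numeric"
```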
Factors that appear numeric can be a source of confusion as well:
> factor("14.9%")
[1] 14.9%
Levels: 14.9%
> as.character(factor("14.9%"))
[1] "14.9%"
> gsub("%", "", as.character(factor("14.9%")) )
[1] "14.9"
This is especially confusing since print.data.frame removes the quotes:
> data.frame(z=factor("14.9%"), zz=factor(14.9))
z zz
1 14.9% 14.9
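A related pitfall is that calling as.numeric directly on a factor returns its internal integer codes rather than the printed values. A short sketch with a made-up factor (the values are chosen only for illustration):

```r
# Hypothetical factor of numeric-looking labels (not from the original data).
f <- factor(c("22.11", "16", "24.5"))

# as.numeric() on a factor returns the internal level codes, not the labels.
codes <- as.numeric(f)

# Going through as.character() first recovers the printed values.
values <- as.numeric(as.character(f))

print(codes)
print(values)
```

This is why the as.character step comes before any numeric conversion when the column is a factor.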