Removing a particular character in a column in R

不思量自难忘° 2020-12-06 07:01

I have a table called LOAN containing a column named RATE, in which the observations are given as percentages, for example 14.49%. How can I format the table so that all value in

3 answers
  • 2020-12-06 07:26

    This can be achieved with the mutate verb from dplyr, which is loaded as part of the tidyverse and which, in my opinion, is more readable. To illustrate, I create a dataset called LOAN with a RATE column that mimics the problem above.

    library(tidyverse)
    LOAN <- data.frame("SN" = 1:4, "Age" = c(21,47,68,33), 
                       "Name" = c("John", "Dora", "Ali", "Marvin"),
                       "RATE" = c('16%', "24.5%", "27.81%", "22.11%"), 
                       stringsAsFactors = FALSE)
    head(LOAN)
      SN Age   Name   RATE
    1  1  21   John    16%
    2  2  47   Dora  24.5%
    3  3  68    Ali 27.81%
    4  4  33 Marvin 22.11%
    

    In what follows, mutate alters the column content, gsub performs the substitution (replacing % with ""), and as.numeric converts the RATE column to numeric values, which keeps the data-cleaning flow easy to follow.

    LOAN <- LOAN %>% mutate(RATE = as.numeric(gsub("%", "", RATE)))
    head(LOAN)
      SN Age   Name  RATE
    1  1  21   John 16.00
    2  2  47   Dora 24.50
    3  3  68    Ali 27.81
    4  4  33 Marvin 22.11
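
    For readers who prefer not to load the tidyverse, the same cleanup can be sketched in base R. The sketch below rebuilds only the RATE column as a hypothetical stand-in for the LOAN table, and uses sub with fixed = TRUE on the assumption that each value contains a single literal % sign.

    ```r
    # Hypothetical stand-in for the LOAN table from the example above
    LOAN <- data.frame(RATE = c("16%", "24.5%", "27.81%", "22.11%"),
                       stringsAsFactors = FALSE)

    # fixed = TRUE treats "%" as a literal string rather than a regular expression
    LOAN$RATE <- as.numeric(sub("%", "", LOAN$RATE, fixed = TRUE))

    # as.numeric() returns NA for anything it cannot parse, so check for NAs
    stopifnot(is.numeric(LOAN$RATE), !anyNA(LOAN$RATE))
    ```

    If some values contain other stray characters, as.numeric will yield NA with a warning, so the final check is a cheap way to notice incomplete cleaning.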
    
  • 2020-12-06 07:33

    # Strip "%" from each value; USE.NAMES = FALSE keeps the result unnamed
    LOAN$RATE <- sapply(LOAN$RATE, function(x) gsub("%", "", x), USE.NAMES = FALSE)

  • 2020-12-06 07:49

    Items that look like character values when printed, but that R treats otherwise, are generally objects of class factor. I'm also guessing that you will not be happy with the list output that strsplit returns. Try:

    gsub("%", "", as.character(LOAN$RATE))
    

    Factors that appear numeric can be a source of confusion as well:

    > factor("14.9%")
    [1] 14.9%
    Levels: 14.9%
    > as.character(factor("14.9%"))
    [1] "14.9%"
    > gsub("%", "", as.character(factor("14.9%")) )
    [1] "14.9"
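
    A related pitfall, sketched here with a hypothetical rate vector: calling as.numeric directly on a factor returns its internal level codes rather than the printed values, so the factor should go through as.character first.

    ```r
    # Hypothetical factor column holding percentages as text
    rate <- factor(c("14.9%", "7.25%", "14.9%"))

    # Convert to character, strip "%", then convert to numeric
    cleaned <- gsub("%", "", as.character(rate))
    values <- as.numeric(cleaned)        # 14.90 7.25 14.90

    # Pitfall: as.numeric() on a factor yields level codes, not the values
    codes <- as.numeric(factor(cleaned)) # 1 2 1
    ```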
    

    This is especially confusing since print.data.frame removes the quotes:

    > data.frame(z=factor("14.9%"), zz=factor(14.9))
          z   zz
    1 14.9% 14.9
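
    Because the printed data frame hides the distinction, it can help to inspect column classes directly. A minimal sketch follows; the extra zzz column is an assumption added here for contrast.

    ```r
    # Same columns as above, plus a genuinely numeric column for comparison
    df <- data.frame(z = factor("14.9%"), zz = factor(14.9), zzz = 14.9)

    # class() per column exposes what print.data.frame does not show
    classes <- sapply(df, class)
    print(classes)  # z and zz are "factor", zzz is "numeric"

    # str() gives a similar overview, including the factor levels
    str(df)
    ```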
    