Data cleaning of dollar values and percentages in R

情话喂你 2020-12-22 06:13

I've been searching for a number of packages in R to help me convert dollar values to clean numeric values. I don't seem to be able to find one (in plyr package for

2 answers
  •  粉色の甜心
    2020-12-22 07:09

    One thing that makes R different from other languages you might be used to is that it's better to do things in a "vectorized" way: operate on a whole vector at once rather than looping through each individual value. So your dollarToNumber function can be rewritten without the for loop:

    dollarToNumber_vectorised <- function(vector) {
      # Want the vector as character rather than factor while
      # we're doing text processing operations
      vector <- as.character(vector)
      vector <- gsub("(\\$|,)","", vector)
      # Convert to numeric. Values that still carry a " K" or " M" suffix
      # cannot be converted directly, so they become NA here (with a
      # coercion warning) and are filled in below
      result <- as.numeric(vector)
      # Find all the "$N K" values, and modify the result at those positions
      k_positions <- grep(" K", vector)
      result[k_positions] <- as.numeric(gsub(" K","", vector[k_positions])) * 1000
      # Same for the "$N M" values
      m_positions <- grep(" M", vector)
      result[m_positions] <- as.numeric(gsub(" M","", vector[m_positions])) * 1000000
      return(result)
    }
    

    It still gives the same output as your original function:

    > dollarToNumber_vectorised(allProjects$LiveDollars)
     [1] 3100000 3970000 3020000 1760000 4510000  762650  510860  823370  218590  865940
    [11]  587670  221110   71934
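    # This warning is expected: the " K" and " M" values become NA in the
    # first as.numeric() call and are replaced with real numbers afterwards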
    Warning message:
    In dollarToNumber_vectorised(allProjects$LiveDollars) :
      NAs introduced by coercion
    > dollarToNumber(allProjects$LiveDollars)
     [1] 3100000 3970000 3020000 1760000 4510000  762650  510860  823370  218590  865940
    [11]  587670  221110   71934
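
If the coercion warning is a nuisance, one possible variation is to strip the suffix before converting and apply a per-element multiplier instead. The following is only a sketch: the function names `dollar_to_number_sketch` and `percent_to_number_sketch` are hypothetical, the sample values are made up, and it assumes inputs shaped like "$3.10 M", "$762.65 K", "$71,934" and "12.5%".

```r
# Hypothetical helper (not from the original answer): converts strings such as
# "$3.10 M", "$762.65 K" or "$71,934" to numbers without coercion warnings.
dollar_to_number_sketch <- function(x) {
  # Work on character data, then drop "$", "," and surrounding whitespace
  cleaned <- trimws(gsub("[$,]", "", as.character(x)))
  # Start with a multiplier of 1 and raise it where a K or M suffix is found
  multiplier <- rep(1, length(cleaned))
  multiplier[grepl("K$", cleaned)] <- 1e3
  multiplier[grepl("M$", cleaned)] <- 1e6
  # Remove the suffix (and any space before it) before converting, so that
  # as.numeric() sees a plain number for inputs in the assumed format
  as.numeric(sub(" *[KM]$", "", cleaned)) * multiplier
}

# Hypothetical helper for the percentage case: strips a trailing "%" and
# divides by 100, so the result is a proportion.
percent_to_number_sketch <- function(x) {
  as.numeric(sub(" *%$", "", trimws(as.character(x)))) / 100
}

# Made-up sample values in the assumed format
dollar_to_number_sketch(c("$3.10 M", "$762.65 K", "$71,934"))
percent_to_number_sketch(c("12.5%", "3%"))
```

Both helpers stay vectorized, so they can be applied to a whole data frame column in one call, in the same way as dollarToNumber_vectorised above. Values that do not match the assumed format would still come back as NA with a warning, which is arguably useful as a signal of unexpected input.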
    
