How would you write a wrapper function or class to format numbers as percent, currency, etc. in R?

Submitted by 纵然是瞬间 on 2019-12-03 12:06:32

Question


In a previous question, I asked whether a convenient wrapper exists inside base R to format numbers as percentages.

This elicited three responses:

  1. Probably not.
  2. Such a wrapper would be too narrow to be useful. It is better that useRs learn how to use existing tools, such as sprintf, which can format numbers in a highly flexible way.
  3. Such a wrapper is problematic, anyway, since you lose the ability to perform calculations on the object.

Still, in my view the sprintf function is just a little bit too obfuscated for the R beginner to learn (except if they come from a C background). Perhaps a better solution is to modify format or prettyNum to have options for adding prefixes and suffixes, so you could easily create percents, currencies, degrees, etc.
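For readers who find sprintf opaque, it may help to see a few base R calls side by side. This is only an illustrative sketch; the sample values are made up and the variable names are not from the original post.

```r
# "%.1f" means fixed notation with one decimal place; "%%" is a literal percent sign.
pct <- sprintf("%.1f%%", 12.345)

# "%.2f" gives two decimal places; the "$" is ordinary text in the format string.
usd <- sprintf("$%.2f", 1234.5)

# format() can add a thousands separator, and paste0() attaches the prefix.
usd_grouped <- paste0("$", format(1234.5, big.mark = ",", nsmall = 2))

print(c(pct, usd, usd_grouped))
```

The format string packs the prefix, precision, and suffix into one argument, which is compact but is also what makes it hard to read for beginners.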


Question:

How would you design a function, class, or set of functions to elegantly deal with formatting numbers as percentages, currencies, degrees, etc.?


Answer 1:


I would probably keep things very simple. format() is generally useful for most basic formatting needs. I would extend that with a simple wrapper that allowed arbitrary prefix and suffix strings. Here is a simple version:

formatVal <- function(x, prefix = "", suffix = "", sep = "", collapse = NULL,
                      ...) {
    x <- format(x, ...)
    x <- paste(prefix, x, suffix, sep = sep, collapse = collapse)
    x
}

If I were doing this for real, I would probably not have the collapse argument in the definition of formatVal(), but instead process it out of ..., but for illustration I kept the above function simple.

Using:

set.seed(1)
m <- runif(5)

Some simple examples of usage:

> formatVal(m*100, suffix = "%")
[1] "26.55087%" "37.21239%" "57.28534%" "90.82078%" "20.16819%"
> formatVal(m*100, suffix = "%", digits = 2)
[1] "27%" "37%" "57%" "91%" "20%"
> formatVal(m*100, suffix = "%", digits = 2, nsmall = 2)
[1] "26.55%" "37.21%" "57.29%" "90.82%" "20.17%"
> formatVal(m, prefix = "£")
[1] "£0.2655087" "£0.3721239" "£0.5728534" "£0.9082078" "£0.2016819"
> formatVal(m, prefix = "£", digits = 1)
[1] "£0.3" "£0.4" "£0.6" "£0.9" "£0.2"
> formatVal(m, prefix = "£", digits = 1, nsmall = 2)
[1] "£0.27" "£0.37" "£0.57" "£0.91" "£0.20"
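The remark above about processing collapse out of ... rather than naming it in the signature could look roughly like this. The name formatVal2 is hypothetical, and treating sep the same way as collapse is an assumption of this sketch, not something the answer specifies.

```r
# Hypothetical variant of formatVal(): 'collapse' and 'sep' are pulled out of
# '...' and given to paste(); everything else in '...' is passed to format().
formatVal2 <- function(x, prefix = "", suffix = "", ...) {
    dots <- list(...)
    collapse <- dots[["collapse"]]                      # NULL if not supplied
    sep <- if (is.null(dots[["sep"]])) "" else dots[["sep"]]
    dots[["collapse"]] <- NULL
    dots[["sep"]] <- NULL
    x <- do.call(format, c(list(x), dots))
    paste(prefix, x, suffix, sep = sep, collapse = collapse)
}

formatVal2(c(1.5, 2.25), suffix = "%", collapse = ", ")
formatVal2(c(1, 2), prefix = "$", nsmall = 2)
```

This keeps the visible signature small, at the cost of hiding two arguments that users then have to discover from the documentation.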



Answer 2:


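# Print method: scale the value, round it to the stored precision,
# then wrap it in the stored prefix and suffix.
print.formatted <- function(x, ...)
{
   fmt <- paste("%.", attr(x, "precision"), "f", sep = "")
   scaled <- as.numeric(x) * attr(x, "scaleFactor")
   print(paste(attr(x, "prefix"), sprintf(fmt, scaled), attr(x, "suffix"), sep = ""))
   invisible(x)
}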

as.percent <- function(x,precision=3)
{
  class(x) <- c(class(x),"formatted")
  attr(x,"scaleFactor")<-100
  attr(x,"prefix")<-""
  attr(x,"suffix")<-"%"
  attr(x,"precision")<-precision
  return(x)
}

as.currency <- function(x,prefix="£")
{
  class(x) <- c(class(x),"formatted")
  attr(x,"scaleFactor")<-1
  attr(x,"prefix")<-prefix
  attr(x,"suffix")<-""
  attr(x,"precision")<-2
  return(x)
}

as.percent(runif(3))
[1] "21.585%" "12.396%" "37.744%"

x <- as.currency(rnorm(3,500,100))
x
[1] "£381.93" "£339.49" "£521.74"
2*x
[1] "£763.86"  "£678.98"  "£1043.48"
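The class-based approach keeps arithmetic working because the numbers are stored unformatted and the attributes travel with them. Two gaps remain in the code above: there is no format() method, and subsetting with [ drops the attributes. The following is a self-contained sketch of how both might be filled; as_formatted is a hypothetical constructor condensed from as.percent() and as.currency(), not part of the original answer.

```r
# Hypothetical constructor: one function instead of as.percent()/as.currency().
as_formatted <- function(x, prefix = "", suffix = "", scale = 1, precision = 2) {
    structure(x, prefix = prefix, suffix = suffix, scaleFactor = scale,
              precision = precision, class = c("formatted", class(x)))
}

# format() method: returns the character representation without printing it.
format.formatted <- function(x, ...) {
    fmt <- paste0("%.", attr(x, "precision"), "f")
    paste0(attr(x, "prefix"),
           sprintf(fmt, as.numeric(x) * attr(x, "scaleFactor")),
           attr(x, "suffix"))
}

print.formatted <- function(x, ...) {
    print(format(x), ...)
    invisible(x)
}

# Subsetting method: rebuild the object so the formatting attributes survive.
`[.formatted` <- function(x, i) {
    as_formatted(as.numeric(x)[i],
                 prefix = attr(x, "prefix"), suffix = attr(x, "suffix"),
                 scale = attr(x, "scaleFactor"), precision = attr(x, "precision"))
}

p <- as_formatted(c(0.21585, 0.12396, 0.37744), suffix = "%", scale = 100,
                  precision = 1)
p[2:3]   # still prints as percentages
2 * p    # arithmetic acts on the underlying numbers
```

Functions such as sum() still return a plain number in this sketch, so a fuller design would also need methods for the Summary and Math group generics.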



Answer 3:


I think this could be done through attributes, e.g. let v <- 3.4. If it is pounds Sterling, we could use something like:

attributes(v)<-list(style = "descriptor", type = "currency", category = "pound")

If it is a percentage:

attributes(v)<-list(style = "descriptor", type = "proportion", category = "percentage")

Then, a special print method would be necessary. One could also incorporate a translation method, e.g. to convert from GBP to USD (pounds to dollars), centimeters to inches, etc.
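A rough sketch of what such a print method and translation method might look like follows. It assumes an S3 class is added alongside the three attributes, since print() needs a class to dispatch on; the names descriptor and convert_currency, and the exchange rate used, are hypothetical.

```r
# Hypothetical constructor: stores the three attributes and adds an S3 class
# (an assumption of this sketch) so that format() and print() can dispatch.
descriptor <- function(x, type, category) {
    structure(x, style = "descriptor", type = type, category = category,
              class = "descriptor")
}

format.descriptor <- function(x, ...) {
    value <- as.numeric(x)
    switch(attr(x, "category"),
           pound      = paste0("\u00a3", format(value, ...)),  # \u00a3 is the pound sign
           dollar     = paste0("$", format(value, ...)),
           percentage = paste0(format(value * 100, ...), "%"),
           format(value, ...))                                 # unknown category
}

print.descriptor <- function(x, ...) {
    print(format(x, ...), quote = FALSE)
    invisible(x)
}

# Hypothetical translation method; the caller supplies the exchange rate.
convert_currency <- function(x, to, rate) {
    stopifnot(identical(attr(x, "type"), "currency"))
    descriptor(as.numeric(x) * rate, type = "currency", category = to)
}

v <- descriptor(3.4, type = "currency", category = "pound")
p <- descriptor(0.125, type = "proportion", category = "percentage")
print(v)
print(p)
print(convert_currency(v, to = "dollar", rate = 1.25))  # made-up rate
```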

The descriptor is essentially my view on a reserved kind of flag for indicating special handling for the given number. This could later extend to text strings, such as addresses and names. For other numbers, such as phone numbers, there may be special decompositions into country code, intra-country area/regional codes, all the way down to extensions.

Such a package may be akin to ggplot for data types - special methods for storing, transforming, and printing things within types?

Such a system might ensure that dimensions are correct when multiplying values. That has real utility in a lot of applications.
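As a toy illustration of dimension checking, an Ops group method can refuse to add values with different units and can combine units on multiplication. This is a minimal sketch under several assumptions: the class name with_unit is invented, units are plain strings, and only binary +, -, and * are handled.

```r
# Hypothetical class: a number tagged with a unit string.
with_unit <- function(x, unit) structure(x, unit = unit, class = "with_unit")

# Group method for arithmetic; binary operators only in this sketch.
Ops.with_unit <- function(e1, e2) {
    u1 <- attr(e1, "unit")   # NULL when the operand is a plain number
    u2 <- attr(e2, "unit")
    both <- !is.null(u1) && !is.null(u2)
    if (.Generic %in% c("+", "-")) {
        if (!both || !identical(u1, u2))
            stop("'", .Generic, "' needs two values with the same unit")
        with_unit(get(.Generic)(as.numeric(e1), as.numeric(e2)), u1)
    } else if (.Generic == "*") {
        # A plain number leaves the unit unchanged; two units are joined.
        unit <- if (both) paste(u1, u2, sep = "*") else c(u1, u2)
        with_unit(as.numeric(e1) * as.numeric(e2), unit)
    } else {
        stop("operator '", .Generic, "' is not supported in this sketch")
    }
}

a <- with_unit(2, "m")
b <- with_unit(3, "m")
s <- with_unit(4, "s")
attr(a * b, "unit")   # units are combined
attr(a + b, "unit")   # same unit is kept
# a + s would stop with an error because the units differ
```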

To my knowledge, the only widespread handling of units in R is for bytes (bytes, KB, MB, etc.) and time (hours, seconds, etc.). Even so, the handling, while simple, isn't obvious - I still have to tell print which units to use. For instance, if I want to print an object's size in KB, I can't simply calculate object.size(v)/1024 - the output is still labelled in bytes, showing a fractional number of bytes rather than KB; I have to use print(object.size(v), units = "Kb").
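The object.size behaviour described above can be checked with a few lines. This sketch uses an arbitrary vector, and the exact byte counts depend on the platform, so no output is shown.

```r
x <- numeric(1000)
size <- object.size(x)

# Dividing keeps the "object_size" class, so print() would still label the
# result as bytes even though the number is now in kilobytes.
scaled <- size / 1024
class(scaled)

# Asking the method for the unit gives a correctly labelled string instead.
label <- format(size, units = "Kb")
label
```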




Answer 4:


ggplot2 has a bunch of functions for formatting common specific cases. These would be ideal, but for two things: they aren't really general enough, and you shouldn't really have to load ggplot2 (with all its dependencies) to get at such functions. You could try contacting Hadley to get the signatures changed to pass more things to format, and have them moved to a lower level package (plyr maybe, or their own package, ggtools?).
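To illustrate what "general enough" formatters that pass more things to format might look like without any package dependency, here is a small function factory in base R. The name make_formatter and its arguments are hypothetical; it is a sketch of the idea, not the ggplot2 API.

```r
# Hypothetical factory: returns a formatting function with the prefix, suffix,
# scale, and any extra format() arguments fixed in advance.
make_formatter <- function(prefix = "", suffix = "", scale = 1, ...) {
    format_args <- list(...)
    function(x) {
        body <- do.call(format, c(list(x * scale), format_args))
        # format() pads to a common width; trim so the prefix sits next to the digits.
        paste0(prefix, trimws(body), suffix)
    }
}

percent <- make_formatter(suffix = "%", scale = 100, nsmall = 1)
dollars <- make_formatter(prefix = "$", big.mark = ",", nsmall = 2)

percent(c(0.25, 0.125))
dollars(c(5, 1234.5))
```

Because each formatter is an ordinary function of one argument, it can be passed anywhere a labelling function is expected.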



Source: https://stackoverflow.com/questions/7147706/how-would-you-write-a-wrapper-function-or-class-to-format-numbers-as-percent-cu
