What does a formula with no left argument mean in R, e.g. ~x?

Question

I understand that in a formula like y ~ x I am looking at "y" as a function of "x". In maths this would be something like y = f(x).

In R, functions like xtabs can take formula objects without a left side, e.g. xtabs( ~ x). From my understanding of formulas, I am now looking at nothing as a function of "x" (in maths, something like " = f(x)"), but that is obviously not how R understands the formula: it returns a contingency table of a factor, for example.

So how can I understand the meaning of an empty left hand argument?

I'm sure this has been explained somewhere, but I have a hard time googling for "R ~".

Answer 1:

Formulas only have meaning in the context of the particular functions that work with them. The same formula may mean something completely different to one function vs. another function.
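To make this concrete, here is a minimal sketch showing that a formula is only an unevaluated object until some function interprets it. The variable names one_sided and two_sided are made up for illustration, and x and y do not need to exist for the formulas to be created.

```r
# A formula is an unevaluated language object; building one computes nothing.
one_sided <- ~ x
two_sided <- y ~ x

class(one_sided)    # "formula"
length(one_sided)   # 2: the `~` operator and the right-hand side
length(two_sided)   # 3: `~`, the left-hand side, and the right-hand side

one_sided[[2]]      # the symbol x (right-hand side)
two_sided[[2]]      # the symbol y (left-hand side)
two_sided[[3]]      # the symbol x (right-hand side)
```

What happens with those pieces is decided entirely by the function that receives the formula.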

In the case of xtabs, it sums the left-hand side over the levels of the right-hand side, and if there is no left-hand side it gives the counts. That is, the default left-hand side can be regarded as a vector of ones. For example, the following two calls give the same counts:

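x <- c(1, 1, 2, 2, 2)

# 1: no left-hand side, so xtabs counts the occurrences of each value of x
xtabs(~ x)

# 2: explicit vector of ones on the left-hand side, summed within each value of x
ones <- rep(1, length.out = length(x))
xtabs(ones ~ x)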

This also gives a similar result, but here the result is a one-dimensional array rather than a table:

# 3
tapply(ones, x, sum)
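The "vector of ones" reading becomes easier to see when the left-hand side is not constant. The following sketch uses made-up data (g and w are hypothetical names, not from the question) to contrast counting with summing:

```r
# Hypothetical data: a grouping variable and a non-constant weight
g <- c("a", "a", "b", "b", "b")
w <- c(10, 20, 1, 2, 3)

counts <- xtabs(~ g)     # no left side: number of observations per level
sums   <- xtabs(w ~ g)   # left side present: sum of w per level

counts   # a: 2, b: 3
sums     # a: 30, b: 6

# The one-sided call agrees with table() on the counts
all(as.vector(counts) == as.vector(table(g)))
```

With a left-hand side, each observation contributes its value of w; without one, each observation contributes 1.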

Answer 2:

The use of formulas is not hard-wired in R. There are tools that make parsing a formula easier, for example to create contrasts, but it is up to the package author to do something useful with what comes out of the parsing.
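As a rough illustration of such tools, base R's terms() and all.vars() can tell a function whether a left side was supplied and which variables are involved. The helper has_response below is a hypothetical name introduced only for this sketch; it is not part of R.

```r
# has_response() is a made-up helper: terms() records in its "response"
# attribute whether the formula has a left-hand side (1) or not (0).
has_response <- function(f) attr(terms(f), "response") == 1

has_response(~ x)      # FALSE
has_response(y ~ x)    # TRUE

# all.vars() lists the variable names that appear in the formula
all.vars(~ x + z)      # "x" "z"
all.vars(y ~ x + z)    # "y" "x" "z"
```

A function author could branch on a check like this, which is one way a function such as xtabs might decide between counting and summing, although the exact internals are not shown here.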

You will often find ~x without a left side in the context of counts, e.g. in lattice bar charts or histograms. Often, you can think of the empty left side as "the count of".

Answer 3:

In the meantime I have learned the following and would like to add it to the answers already given:

A two-sided formula, such as in plot(y ~ x) or lm(y ~ x), is a symbolic representation of an asymmetrical question regarding the dependence between (groups of) dependent and independent variables. Dependent variables stand on the left side of the formula and you can read the formula as "(left side) as a function of (right side)".

A one-sided formula, such as in xtabs(~ x + y) or cor.test(~ x + y), is a symbolic representation of a symmetrical question regarding the correlation (in the broad everyday sense) between two "equal" variables (e.g. both dependent, both independent, or of unknown dependence).
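The symmetry can be checked with a small sketch. The data frame d below is invented for illustration; the point is only that the one-sided formula interface of cor.test gives the same estimate as passing the two vectors directly, and that the order of the variables does not matter.

```r
# Hypothetical data for illustration
d <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                y = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.3))

by_formula <- cor.test(~ x + y, data = d)   # one-sided formula interface
by_vectors <- cor.test(d$x, d$y)            # plain two-vector interface
swapped    <- cor.test(~ y + x, data = d)   # same variables, other order

# Both comparisons should report TRUE: neither variable is "the response"
all.equal(unname(by_formula$estimate), unname(by_vectors$estimate))
all.equal(unname(by_formula$estimate), unname(swapped$estimate))
```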

Feel free to correct my bad English.

Source: https://stackoverflow.com/questions/16477612/what-does-a-formula-with-no-left-argument-mean-in-r-e-g-x

Tags

formula