R Conditional Regression with Multiple Conditions

Posted by 一笑奈何 on 2019-12-13 07:21:41

Question


I am trying to run a regression in R on a subset of my data defined by two conditions. My data has binary variables for both the year and another classification (MSA/UA). The regression runs properly when I use only one condition:

# now time for the millions of OLS
# format: OLSABCD where ABCD are binary for the values of MSA/UA and years
# A = 1 if MSA, 0 if UA
# B = 1 if 2010
# C = 1 if 2000
# D = 1 if 1990

OLS1000<-summary(lm(lnrank ~ lnpop, data = subset(df, msa==1)))
OLS1000

However, I cannot figure out how to combine the MSA/UA classification with the year variables. I have tried:

OLS1100<-summary(lm(lnrank ~ lnpop, data = subset(df, msa==1, df$2010==1)))
OLS1100

But it returns the error:

Error: unexpected numeric constant in "OLS1100<-summary(lm(lnrank ~ lnpop,   
data = subset(df, msa==1, df$2010"

How can I get the program to run utilizing both conditions?

Thank you again!


Answer 1:


The problem is:

df$2010

If your data really has a column named 2010, then you need backticks around it:

df$`2010`

And in your subset, don't specify df twice, and join the two conditions with & rather than a comma:

subset(df, msa == 1 & `2010` == 1)

In general, it's better if column names don't start with digits. It's also best not to name data frames df, since df() is already a function in R (the density of the F distribution).
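As a small sketch of that advice, the snippet below renames a digit-leading column so that backticks are no longer needed. The data frame `raw`, its values, and the new name `y2010` are made up for illustration and are not from the question.

```r
# Hypothetical toy data; check.names = FALSE keeps the column name "2010" unchanged
raw <- data.frame(msa = c(1, 0), `2010` = c(1, 1), check.names = FALSE)

# Rename the column to a syntactically valid name (y2010 is an arbitrary choice)
names(raw)[names(raw) == "2010"] <- "y2010"

# The column can now be used without backticks
filtered <- subset(raw, msa == 1 & y2010 == 1)

# make.names() shows the name base R itself would generate for "2010"
valid_name <- make.names("2010")
```

With the default `check.names = TRUE`, `data.frame()` and `read.csv()` apply `make.names()` automatically, so a column read in as 2010 may already be called X2010.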




Answer 2:


@neilfws pointed out the issue with numeric column names, but there is actually another issue in your code.

The third argument of subset() is select, which lets you choose which columns to include (or exclude); it is not a second row condition. So the correct syntax should be:

subset(df, msa == 1 & `2010` == 1)

instead of

subset(df, msa == 1, `2010` == 1)

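The second form does not make subset() raise an error, but it does not filter on the year either: the expression after the second comma is interpreted as the select argument (a column selection), not as an additional row condition.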
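To see the difference, here is a minimal sketch on made-up data. The data frame name `cities` and all values are assumptions for illustration; only the column names msa, `2010`, lnpop and lnrank follow the question.

```r
# Toy data with made-up values, only to illustrate the filtering
set.seed(1)
cities <- data.frame(
  msa    = rep(c(1, 0), each = 10),
  `2010` = rep(c(1, 0), times = 10),
  lnpop  = rnorm(20, mean = 12),
  check.names = FALSE  # keep the column name "2010" unchanged
)
cities$lnrank <- 15 - cities$lnpop + rnorm(20, sd = 0.1)

# Correct: both row conditions go in the second argument, joined with &
both <- subset(cities, msa == 1 & `2010` == 1)

# Pitfall: the third argument is select, so this keeps every msa == 1 row
# and treats `2010` == 1 as a column selection instead of a row filter
wrong <- subset(cities, msa == 1, `2010` == 1)

# Regression on the correctly filtered rows
OLS1100 <- summary(lm(lnrank ~ lnpop, data = both))
OLS1100
```

Comparing `dim(both)` with `dim(wrong)` makes the problem visible: the second call still contains the non-2010 rows, and its columns are whatever the select expression happened to pick.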



Source: https://stackoverflow.com/questions/42499760/r-conditional-regression-with-multiple-conditions
