问题
I am using the survey
package in R to work with the U.S. Census' PUMS dataset for population. I've created a Boolean for each broad industry and a character variable MigrationStatus
with three values (Stayed
,Left
,Entered
). I'd like to examine the ratios of workers in each industry by migration status.
This works fine:
AGR_ratio=svyby(~JobAGR, by=~MigrationStatus, denominator=~EmployedAtWork, design=subset(pums_design,EmployedAtWork==1), svyratio, vartype='ci')
But this produces an error:
varlist=names(pums_design$variables)[32:50]
job_ratios = lapply(varlist, function(x) {
svyby(substitute(~i, list(i = as.name(x))), by=~MigrationStatus, denominator=~EmployedAtWork, design=subset(pums_design,EmployedAtWork==1), svyratio, vartype='ci')
})
#Error in svyby.default(substitute(~i, list(i = as.name(x))), by = ~MigrationStatus, :
#object 'byfactor' not found
varlist
#[1] "JobADM" "JobAGR" "JobCON" "JobEDU" "JobENT" "JobEXT" "JobFIN" "JobINF" "JobMED" "JobMFG" "JobMIL" "JobPRF" "JobRET" "JobSCA" "JobSRV"
#[16] "JobTRN" "JobUNE" "JobUTL" "JobWHL"
回答1:
how about this?
# setup
library(survey)
data(api)
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
# single example
svyby(~api99, by = ~stype, denominator = ~api00 , dclus1, svyratio)
# multiple example
variables <- c( "api99" , "pcttest" )
# breaks
lapply(variables, function(x) svyby(substitute(~i, list(i = as.name(x))), by=~stype, denominator=~api00, design=dclus1, svyratio, vartype='ci'))
# works
lapply( variables , function( z ) svyby( as.formula( paste0( "~" , z ) ) , by = ~stype, denominator = ~api00 , dclus1, svyratio , vartype = 'ci' ) )
btw you might be interested in this uspums data automation syntax
来源:https://stackoverflow.com/questions/22796372/combining-lapply-svyby-svyratio-to-calculate-many-ratios-with-confidence-inter