R: How to get a dataset with blanks in its name

Question

How can one get an R dataset with blanks in its name, such as 'BJsales.lead (BJsales)' in package "datasets"?

pkg = "datasets"
cat( "Summary of all the datasets in package", pkg, "--\n"  )
d = data( package=pkg ) $results  # "Package" "LibPath" "Item" "Title"
names = d[ , "Item" ]
titles = d[ , "Title" ]
    # sum( duplicated( names )) ??

for( j in 1:len(names) ){
    name = names[[j]]
    cat( name, ":\n" )

    data( list=name )
    x = get( name )  # <-- Error if blank in name

    m = paste( dim( as.matrix( x )), collapse=" " )  # grr
    cat( class(x), m, " freq", frequency(x), "\n" )
}

# -> Error in get(name) : object 'BJsales.lead (BJsales)' not found

OK, get() can only look up valid names; that's reasonable.
But what to do -- how can one get the data for 'BJsales.lead (BJsales)' ?

R version 3.1.3 (2015-03-09)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
Running under: OS X 10.8.3 (Mountain Lion)

Answer 1:

In fact, get() can look up "invalid" names:

`x y` <- 3;
get('x y');
## [1] 3

The issue here is that the Item column of the results matrix returned by data() does not always contain the exact name of the data set; in some cases, it has a parenthetical suffix, although I've no idea why.

You can strip it off with gsub(), and then loading via get() should work.
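
As a small illustration of the stripping step, here is a sketch that applies the same kind of pattern to a few Item strings. The `items` vector is a hand-typed sample in the format shown by data(), not the full results matrix, and `object_names` is just an illustrative variable name. It assumes the datasets package is attached, which is the default in a normal R session.

```r
library(datasets)  # attached by default; made explicit so the sketch is self-contained

# Hand-typed sample of Item values (assumption: same format as data()$results[, "Item"])
items <- c("BJsales", "BJsales.lead (BJsales)", "beaver1 (beavers)", "state.x77 (state)")

# Drop everything from the first whitespace character onward
object_names <- sub("\\s.*$", "", items)
print(object_names)

# A cleaned name can now be looked up with get()
x <- get(object_names[2])
print(class(x))
```

sub() is enough here because the pattern is anchored to the end of the string and can match at most once; gsub() with the same pattern gives the same result.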

Also, you shouldn't need the data(list=name) call.

Also, there's no len() (unfortunately); I think you mean length().

Hence:

pkg <- 'datasets';
cat('Summary of all the datasets in package',pkg,'--\n');

d <- data(package=pkg)$results; # 'Package' 'LibPath' 'Item' 'Title'
names <- d[,'Item'];
titles <- d[,'Title'];

for (j in 1:length(names)) {
    name <- names[j];
    cat(name,':\n');
    x <- get(gsub('\\s.*','',name));
    m <- paste(dim(as.matrix(x)),collapse=' ');
    cat(class(x),m,' freq',frequency(x),'\n');
};
## Summary of all the datasets in package datasets --
## AirPassengers :
## ts 144 1  freq 12
## BJsales :
## ts 150 1  freq 1
## BJsales.lead (BJsales) :
## ts 150 1  freq 1
## BOD :
## data.frame 6 2  freq 1
## CO2 :
## nfnGroupedData nfGroupedData groupedData data.frame 84 5  freq 1
## ChickWeight :
## nfnGroupedData nfGroupedData groupedData data.frame 578 4  freq 1
## DNase :
## nfnGroupedData nfGroupedData groupedData data.frame 176 3  freq 1
## EuStockMarkets :
## mts ts matrix 1860 4  freq 260
## Formaldehyde :
## data.frame 6 2  freq 1
## HairEyeColor :
## table 32 1  freq 1
## Harman23.cor :
## list 3 1  freq 1
## Harman74.cor :
## list 3 1  freq 1
## Indometh :
## nfnGroupedData nfGroupedData groupedData data.frame 66 3  freq 1
## InsectSprays :
## data.frame 72 2  freq 1
## JohnsonJohnson :
## ts 84 1  freq 4
## LakeHuron :
## ts 98 1  freq 1
## LifeCycleSavings :
## data.frame 50 5  freq 1
## Loblolly :
## nfnGroupedData nfGroupedData groupedData data.frame 84 3  freq 1
## Nile :
## ts 100 1  freq 1
## Orange :
## nfnGroupedData nfGroupedData groupedData data.frame 35 3  freq 1
## OrchardSprays :
## data.frame 64 4  freq 1
## PlantGrowth :
## data.frame 30 2  freq 1
## Puromycin :
## data.frame 23 3  freq 1
## Seatbelts :
## mts ts 192 8  freq 12
## Theoph :
## nfnGroupedData nfGroupedData groupedData data.frame 132 5  freq 1
## Titanic :
## table 32 1  freq 1
## ToothGrowth :
## data.frame 60 3  freq 1
## UCBAdmissions :
## table 24 1  freq 1
## UKDriverDeaths :
## ts 192 1  freq 12
## UKgas :
## ts 108 1  freq 4
## USAccDeaths :
## ts 72 1  freq 12
## USArrests :
## data.frame 50 4  freq 1
## USJudgeRatings :
## data.frame 43 12  freq 1
## USPersonalExpenditure :
## matrix 5 5  freq 1
## VADeaths :
## matrix 5 4  freq 1
## WWWusage :
## ts 100 1  freq 1
## WorldPhones :
## matrix 7 7  freq 1
## ability.cov :
## list 3 1  freq 1
## airmiles :
## ts 24 1  freq 1
## airquality :
## data.frame 153 6  freq 1
## anscombe :
## data.frame 11 8  freq 1
## attenu :
## data.frame 182 5  freq 1
## attitude :
## data.frame 30 7  freq 1
## austres :
## ts 89 1  freq 4
## beaver1 (beavers) :
## data.frame 114 4  freq 1
## beaver2 (beavers) :
## data.frame 100 4  freq 1
## cars :
## data.frame 50 2  freq 1
## chickwts :
## data.frame 71 2  freq 1
## co2 :
## ts 468 1  freq 12
## crimtab :
## table 42 22  freq 1
## discoveries :
## ts 100 1  freq 1
## esoph :
## data.frame 88 5  freq 1
## euro :
## numeric 11 1  freq 1
## euro.cross (euro) :
## matrix 11 11  freq 1
## eurodist :
## dist 21 21  freq 1
## faithful :
## data.frame 272 2  freq 1
## fdeaths (UKLungDeaths) :
## ts 72 1  freq 12
## freeny :
## data.frame 39 5  freq 1
## freeny.x (freeny) :
## matrix 39 4  freq 1
## freeny.y (freeny) :
## ts 39 1  freq 4
## infert :
## data.frame 248 8  freq 1
## iris :
## data.frame 150 5  freq 1
## iris3 :
## array 600 1  freq 1
## islands :
## numeric 48 1  freq 1
## ldeaths (UKLungDeaths) :
## ts 72 1  freq 12
## lh :
## ts 48 1  freq 1
## longley :
## data.frame 16 7  freq 1
## lynx :
## ts 114 1  freq 1
## mdeaths (UKLungDeaths) :
## ts 72 1  freq 12
## morley :
## data.frame 100 3  freq 1
## mtcars :
## data.frame 32 11  freq 1
## nhtemp :
## ts 60 1  freq 1
## nottem :
## ts 240 1  freq 12
## npk :
## data.frame 24 5  freq 1
## occupationalStatus :
## table 8 8  freq 1
## precip :
## numeric 70 1  freq 1
## presidents :
## ts 120 1  freq 4
## pressure :
## data.frame 19 2  freq 1
## quakes :
## data.frame 1000 5  freq 1
## randu :
## data.frame 400 3  freq 1
## rivers :
## numeric 141 1  freq 1
## rock :
## data.frame 48 4  freq 1
## sleep :
## data.frame 20 3  freq 1
## stack.loss (stackloss) :
## numeric 21 1  freq 1
## stack.x (stackloss) :
## matrix 21 3  freq 1
## stackloss :
## data.frame 21 4  freq 1
## state.abb (state) :
## character 50 1  freq 1
## state.area (state) :
## numeric 50 1  freq 1
## state.center (state) :
## list 2 1  freq 1
## state.division (state) :
## factor 50 1  freq 1
## state.name (state) :
## character 50 1  freq 1
## state.region (state) :
## factor 50 1  freq 1
## state.x77 (state) :
## matrix 50 8  freq 1
## sunspot.month :
## ts 3177 1  freq 12
## sunspot.year :
## ts 289 1  freq 1
## sunspots :
## ts 2820 1  freq 12
## swiss :
## data.frame 47 6  freq 1
## treering :
## ts 7980 1  freq 1
## trees :
## data.frame 31 3  freq 1
## uspop :
## ts 19 1  freq 0.1
## volcano :
## matrix 87 61  freq 1
## warpbreaks :
## data.frame 54 3  freq 1
## women :
## data.frame 15 2  freq 1
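
A side note on the loop index: 1:length(names) works here because the vector is non-empty, but it misbehaves on an empty vector. A minimal sketch of the difference, using a hypothetical empty `items` vector and a `visited` counter introduced only for illustration:

```r
items <- character(0)  # hypothetical empty result, e.g. a package with no datasets

# 1:length(items) is 1:0, which counts down and yields two indices
print(1:length(items))

# seq_along(items) is an empty integer vector
print(seq_along(items))

# So a loop over seq_along() is skipped entirely for empty input
visited <- 0
for (j in seq_along(items)) {
    visited <- visited + 1
}
print(visited)
```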

Source: https://stackoverflow.com/questions/30684322/r-how-to-get-a-dataset-with-blanks-in-its-name

Tags

dataset

names