How to subset a list based on the length of its elements in R

Question

In R I have a function (ip2coordinates from the package RDSTK) which looks up 11 fields of data for each IP address you supply.

I have a list of IPs called ip.addresses:

> head(ip.addresses)
[1] "128.177.90.11"  "71.179.12.143"  "66.31.55.111"   "98.204.243.187" "67.231.207.9"   "67.61.248.12"

Note: These or any other IPs can be used to reproduce this problem.

So I apply the function to that object with sapply:

ips.info     <- sapply(ip.addresses, ip2coordinates)

and get a list called ips.info as my result. This is all well and good, but I can't do much more with a list, so I need to convert it to a data frame. The problem is that not all IP addresses are in the database, so some list elements have only 1 field, and I get this error:

> ips.df       <- as.data.frame(ips.info)
Error in data.frame(`128.177.90.10` = list(ip.address = "128.177.90.10",  :
  arguments imply differing number of rows: 1, 0

My question is: "How do I remove the elements with missing/incomplete data, or otherwise convert this list into a data frame with 11 columns and 1 row per IP address?"

I have tried several things.

First, I tried to write a loop that removes elements with a length of less than 11:

for (i in 1:length(ips.info)) {
  if (length(ips.info[i]) < 11) {
    ips.info[i] <- NULL
  }
}

This leaves some records with no data and makes others say "NULL", but even those that say "NULL" are not detected by is.null.
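A likely reason the single-bracket loop misbehaves: on a list, `x[i]` returns a one-element list, so its length is always 1, whereas `x[[i]]` returns the element itself. A minimal sketch on a made-up list (the name `toy` and its contents are illustrative, not the real lookup results):

```r
# Toy stand-in for ips.info: two complete records and one incomplete one.
toy <- list(a = 1:11, b = "not found", c = 1:11)

single <- length(toy[1])    # length of a one-element sub-list
double <- length(toy[[1]])  # length of the element itself

print(c(single = single, double = double))
```

With single brackets the condition `length(x[i]) < 11` is therefore true for every element, complete or not.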

Next, I tried the same thing with double square brackets and got:
```
Error in ips.info[[i]] : subscript out of bounds
```
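This error is consistent with the list shrinking during the loop: `1:length(ips.info)` is evaluated once, so after some elements have been deleted the later indices no longer exist. One possible workaround, sketched here on a made-up list rather than the real data, is to walk the indices in reverse so that a deletion never shifts an element that is still to be visited:

```r
# Toy stand-in for ips.info; "b" and "d" play the role of failed lookups.
toy <- list(a = 1:11, b = "not found", c = 1:11, d = "not found")

for (i in rev(seq_along(toy))) {
  if (length(toy[[i]]) < 11) {
    toy[[i]] <- NULL  # assigning NULL with [[ ]] deletes the list element
  }
}

print(names(toy))
```

A vectorised filter, as in the answers, avoids the index bookkeeping altogether.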

I also tried complete.cases() to see if it could be useful:

Error in complete.cases(ips.info) : not all arguments have the same length

Finally, I tried a variation of my for loop that was conditioned on length(ips.info[[i]] == 11 and wrote complete records to another object, but somehow it resulted in an exact copy of ips.info.
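This is only a guess at what that loop looked like, but if the closing parenthesis really sat after `== 11`, the condition was `length(x == 11)` rather than `length(x) == 11`. That measures the length of a logical vector, which is non-zero for any non-empty element and so always counts as TRUE. A small sketch on a made-up list shows the difference:

```r
# Toy stand-in for ips.info: one complete record and one incomplete one.
toy <- list(a = 1:11, b = "not found")

# Misplaced parenthesis: compare first, then take the length of the result.
misplaced <- sapply(toy, function(x) as.logical(length(x == 11)))

# Intended test: take the length first, then compare it to 11.
intended <- sapply(toy, function(x) length(x) == 11)

print(rbind(misplaced, intended))
```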

Answer 1:

Here's one way you can accomplish this using the built-in Filter function:

#input data
library(RDSTK)
ip.addresses<-c("128.177.90.10","71.179.13.143","66.31.55.111","98.204.243.188",
    "67.231.207.8","67.61.248.15")
ips.info  <- sapply(ip.addresses, ip2coordinates)

#data.frame creation
lengthIs <- function(n) function(x) length(x)==n
do.call(rbind, Filter(lengthIs(11), ips.info))

or, if you prefer not to use a helper function:

do.call(rbind, Filter(function(x) length(x)==11, ips.info))
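To try the `Filter` approach without calling the lookup service, stand-in data can be used. The sketch below assumes each successful lookup is a one-row data frame with 11 columns and each failed lookup has a single column; the helper `fake_lookup`, its placeholder columns, and the IP addresses are invented for illustration:

```r
# Hypothetical stand-in for ip2coordinates(); not part of RDSTK.
fake_lookup <- function(ip, found = TRUE) {
  if (!found) {
    return(data.frame(ip.address = ip, stringsAsFactors = FALSE))
  }
  # ip.address plus ten placeholder columns (X1..X10) gives 11 fields.
  data.frame(ip.address = ip, matrix(0, nrow = 1, ncol = 10),
             stringsAsFactors = FALSE)
}

toy <- list(
  "10.0.0.1" = fake_lookup("10.0.0.1"),
  "10.0.0.2" = fake_lookup("10.0.0.2", found = FALSE),
  "10.0.0.3" = fake_lookup("10.0.0.3")
)

# Keep only the elements with 11 fields, then stack them into one data frame.
complete <- Filter(function(x) length(x) == 11, toy)
toy.df <- do.call(rbind, complete)
print(dim(toy.df))
```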

Answer 2:

An alternative solution that uses only the base package:

  # find non-complete elements
  ids.to.remove <- sapply(ips.info, function(i) length(i) < 11)
  # remove found elements
  ips.info <- ips.info[!ids.to.remove]
  # create data.frame
  df <- do.call(rbind, ips.info)
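The same logical-index idea can also be written without `sapply` by using `lengths()`, which returns the length of every list element (available in base R from version 3.2.0). A minimal sketch on a made-up list of plain vectors, so the result here is a matrix rather than a data frame:

```r
# Toy stand-in for ips.info; "b" plays the role of a failed lookup.
toy <- list(a = 1:11, b = "not found", c = 1:11)

keep <- lengths(toy) == 11          # TRUE for complete elements
toy.complete <- toy[keep]           # subset the list with a logical index
toy.mat <- do.call(rbind, toy.complete)
print(dim(toy.mat))
```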

Source: https://stackoverflow.com/questions/25022511/how-to-subset-a-list-based-on-the-length-of-its-elements-in-r

Tags

list

subset