readOGR (rgdal) fails to fetch polygon names from XML

限于喜欢 submitted on 2019-12-06 13:21:04

OK, if anyone encounters the same problem, here is the solution I found.

The website provides the maps in two formats: KML and SHP. I chose KML, because this was used in a worked example that I was following. But there appears to be a problem with this particular KML file or how it was generated. I tried the procedure with a Shapefile (SHP) instead, and it worked like a charm.

Shapefiles can be read into R with the same function, and there is no need to specify the layer:

ccg_boundaries <- readOGR("Clinical_Commissioning_Groups_April_2016_Ultra_Generalised_Clipped_Boundaries_in_England.SHP")

The CCG names are now available in the ccg16nm column:

> head(ccg_boundaries@data)
  objectid   ccg16cd                                 ccg16nm st_areasha st_lengths
0        1 E38000001 NHS Airedale, Wharfedale and Craven CCG 1224636590  193149.74
1        2 E38000002                         NHS Ashford CCG  582174805  122841.19
2        3 E38000003                  NHS Aylesbury Vale CCG  984352696  229544.11
3        4 E38000004            NHS Barking and Dagenham CCG   36315011   31196.87
4        5 E38000005                          NHS Barnet CCG   86654018   41833.69
5        6 E38000006                        NHS Barnsley CCG  327520495  106476.52

Your issue is that Windows does not have the necessary library to extract the ExtendedData from a KML file.
I provided a working solution here: https://stackoverflow.com/a/51657844/2763996
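One way to see which KML drivers an installation offers is to list the OGR drivers that rgdal was built with. The sketch below assumes the library in question is GDAL's LIBKML driver; the answer does not name it, so treat that as my assumption. The variable names are purely illustrative.

```r
# Assumption: the "necessary library" is GDAL's LIBKML driver; the basic
# "KML" driver is the one that remains when LIBKML is absent.
kml_driver_names <- c("KML", "LIBKML")

if (requireNamespace("rgdal", quietly = TRUE)) {
  # ogrDrivers() returns a data frame with one row per available OGR driver
  drivers <- rgdal::ogrDrivers()
  available_kml_drivers <- intersect(kml_driver_names, as.character(drivers$name))
} else {
  # rgdal is not installed, so nothing can be reported
  available_kml_drivers <- character(0)
}

print(available_kml_drivers)
```

If only "KML" is listed, parsing the attributes out of the file directly is probably the more practical route.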

The solution to your problem is the following function that will work on your example KML:

library(tidyverse)
library(xml2)
library(rgdal)

readKML <- function(file,keep_name_description=FALSE,layer,...) {
  # Set keep_name_description = TRUE to keep "Name" and "Description" columns
  #   in the resulting SpatialPolygonsDataFrame. Only works when there is
  #   ExtendedData in the kml file.

  sp_obj<-readOGR(file,layer,...)
  xml1<-read_xml(file)
  if (!missing(layer)) {
    different_layers <- xml_find_all(xml1, ".//d1:Folder") 
    layer_names <- different_layers %>% 
      xml_find_first(".//d1:name") %>% 
      xml_contents() %>% 
      xml_text()

    selected_layer <- layer_names==layer
    if (!any(selected_layer)) stop("Layer does not exist.")
    xml2 <- different_layers[selected_layer]
  } else {
    xml2 <- xml1
  }

  # extract name and type of variables

  variable_names1 <- 
    xml_find_first(xml2, ".//d1:ExtendedData") %>% 
    xml_children() 

  # descend until the nodes carry a "name" attribute or have no children left
  while (any(is.na(xml_attr(variable_names1, "name"))) &&
         length(xml_children(variable_names1)) > 0) {
    variable_names1 <- xml_children(variable_names1)
  }

  variable_names <- variable_names1 %>%
    xml_attr("name") %>% 
    unique()

  # return sp_obj if no ExtendedData is present
  if (is.null(variable_names)) return(sp_obj)

  data1 <- xml_find_all(xml2, ".//d1:ExtendedData") %>% 
    xml_children()

  # descend to the leaf nodes, which hold the attribute values
  while (length(xml_children(data1)) > 0) {
    data1 <- xml_children(data1)
  }

  data <- data1 %>% 
    xml_text() %>% 
    matrix(.,ncol=length(variable_names),byrow = TRUE) %>% 
    as.data.frame()

  colnames(data) <- variable_names

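  if (keep_name_description) {
    # keep the "Name" and "Description" columns and append the ExtendedData
    try(sp_obj@data <- cbind(sp_obj@data,data),silent=TRUE)
  } else {
    # replace the attribute table with the ExtendedData columns only
    sp_obj@data <- data
  }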
  sp_obj
}
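To make the core step of the function easier to follow, here is a minimal sketch that pulls ExtendedData out of a tiny inline KML string with xml2 alone. The SchemaData/SimpleData layout is an assumption about how the file is structured (real files may use Data/value pairs instead), and the two placemarks simply reuse values from the attribute table shown earlier. The names kml_text and attributes_df are made up for illustration.

```r
library(xml2)

# Hypothetical two-placemark KML; geometry is omitted because only the
# attribute extraction is being illustrated.
kml_text <- '<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <ExtendedData>
        <SchemaData schemaUrl="#demo">
          <SimpleData name="ccg16cd">E38000001</SimpleData>
          <SimpleData name="ccg16nm">NHS Airedale, Wharfedale and Craven CCG</SimpleData>
        </SchemaData>
      </ExtendedData>
    </Placemark>
    <Placemark>
      <ExtendedData>
        <SchemaData schemaUrl="#demo">
          <SimpleData name="ccg16cd">E38000002</SimpleData>
          <SimpleData name="ccg16nm">NHS Ashford CCG</SimpleData>
        </SchemaData>
      </ExtendedData>
    </Placemark>
  </Document>
</kml>'

doc <- read_xml(kml_text)

# xml2 exposes the default KML namespace under the prefix "d1"
fields <- xml_find_all(doc, ".//d1:ExtendedData//d1:SimpleData")

# One column per distinct field name, one row per placemark
field_names <- unique(xml_attr(fields, "name"))
attributes_df <- as.data.frame(
  matrix(xml_text(fields), ncol = length(field_names), byrow = TRUE),
  stringsAsFactors = FALSE
)
colnames(attributes_df) <- field_names

print(attributes_df)
```

Like the function above, this relies on every placemark carrying the same fields in the same order; a file with missing fields would need a per-placemark loop instead of the matrix reshape.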