Question
I have a data frame with information about crimes (variable x), plus the latitude and longitude of where each crime happened. I also have a shapefile with the districts of São Paulo city. This is df:
latitude longitude n_homdol
1 -23.6 -46.6 1
2 -23.6 -46.6 1
3 -23.6 -46.6 1
4 -23.6 -46.6 1
5 -23.6 -46.6 1
6 -23.6 -46.6 1
And a shapefile for the districts of São Paulo, sp.dist.sf:
geometry NOME_DIST
1 POLYGON ((352436.9 7394174,... JOSE BONIFACIO
2 POLYGON ((320696.6 7383620,... JD SAO LUIS
3 POLYGON ((349461.3 7397765,... ARTUR ALVIM
4 POLYGON ((320731.1 7400615,... JAGUARA
5 POLYGON ((338651 7392203, 3... VILA PRUDENTE
6 POLYGON ((320606.2 7394439,... JAGUARE
With the help of @Humpelstielzchen, I joined the two with:
library(sf) # st_as_sf(), st_join()

sf_df <- st_as_sf(df, coords = c("longitude", "latitude"), crs = 4326)
shape_df <- st_join(sp.dist.sf, sf_df, join = st_contains)
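A quick sanity check on that join (an added note, not from the original post): st_join() keeps one row of sp.dist.sf for every point a district contains, so shape_df ends up with roughly one row per crime rather than one per district.

nrow(sp.dist.sf)  # 93 district polygons
nrow(shape_df)    # roughly one row per crime point, i.e. ~15,000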
My final goal is to compute a local Moran's I statistic, and I'm trying to do this with:
library(spdep)        # poly2nb(), nb2listw(), localmoran()
library(GISTools)     # auto.shading(), choropleth(), choro.legend()
library(RColorBrewer) # brewer.pal()

# build the neighbour list from the district polygons
sp_viz <- poly2nb(shape_df, row.names = shape_df$NOME_DIST)
xy <- st_coordinates(shape_df)
# row-standardised spatial weights from the neighbour list
ww <- nb2listw(sp_viz, style = 'W', zero.policy = TRUE)
# replace missing homicide counts with 0, then compute local Moran's I
shape_df[is.na(shape_df)] <- 0
locMoran <- localmoran(shape_df$n_homdol, ww)
# shade and map the local Moran's I values
sids.shade <- auto.shading(c(locMoran[,1], -locMoran[,1]),
                           cols = brewer.pal(5, "PRGn"))
choropleth(shape_df, locMoran[,1], shading = sids.shade)
choro.legend(-46.5, -20, sids.shade, fmt = "%6.2f")
title("Criminalidade (Local Moran's I)", cex.main = 2)
But when I run the code, this one line takes hours to compute:
sp_viz <- poly2nb(shape_df, row.names = shape_df$NOME_DIST)
I have 15,000 observations across 93 districts. I tried running the above code with only 100 observations, and it was fast and everything worked. But with all 15,000 observations I never saw a result, because the computation goes on forever. What may be happening? Am I doing something wrong? Is there a better way to do this local Moran's I test?
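For scale: poly2nb() builds the neighbour list by comparing polygon boundaries pairwise, so a join result holding ~15,000 (heavily duplicated) district polygons costs vastly more than the 93 originals. Below is a minimal sketch of counting crimes per district first, so the neighbour search only ever sees 93 polygons; this is an assumption about the intended workflow rather than code from the thread, and dplyr is an added dependency.

library(sf)
library(spdep)
library(dplyr)

# put the points in the polygons' CRS before joining (df is in lon/lat,
# while sp.dist.sf appears to use projected coordinates)
sf_df <- st_as_sf(df, coords = c("longitude", "latitude"), crs = 4326)
sf_df <- st_transform(sf_df, st_crs(sp.dist.sf))

# tag each crime with the district it falls in, then count per district
crimes <- st_join(sf_df, sp.dist.sf, join = st_within)
counts <- crimes %>%
  st_drop_geometry() %>%
  filter(!is.na(NOME_DIST)) %>%   # drop points outside every district
  group_by(NOME_DIST) %>%
  summarise(n_homdol = sum(n_homdol))

# attach the counts to the 93 district polygons; districts with no
# recorded crimes get 0
dist_sf <- merge(sp.dist.sf, counts, by = "NOME_DIST", all.x = TRUE)
dist_sf$n_homdol[is.na(dist_sf$n_homdol)] <- 0

# now poly2nb() only sees 93 polygons and should run quickly
sp_viz   <- poly2nb(dist_sf, row.names = dist_sf$NOME_DIST)
ww       <- nb2listw(sp_viz, style = 'W', zero.policy = TRUE)
locMoran <- localmoran(dist_sf$n_homdol, ww, zero.policy = TRUE)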
Answer 1:
As I can't just comment, here are some questions one might ask:
- How long do you mean by fast? Some of my scripts run in seconds and I would still call them slow.
- Are all your observations identically structured? Maybe the poly2nb() function is looping indefinitely on an item with an uncommon structure. You can use the unique() function to check this point.
- Did you try cutting your dataset into pieces and running each piece separately? That would help you see (1) whether one of the pieces contains something that needs correcting and (2) whether R is loading all the data at once, overloading your computer's memory. Beware, this happens really often with huge datasets in R (and by huge, I mean data tables weighing more than 50 MB). Both checks are sketched just after this list.
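A minimal sketch of both checks, assuming shape_df as built in the question (unique() and system.time() are base R; poly2nb() comes from spdep):

library(sf)
library(spdep)

# how many distinct districts versus how many rows poly2nb() must handle?
# the st_join() in the question keeps one polygon row per matched point,
# so each district can appear thousands of times
length(unique(shape_df$NOME_DIST))  # distinct districts (should be 93)
nrow(shape_df)                      # rows fed to poly2nb() (~15,000)

# time the neighbour search on a small slice before attempting the full
# set; poly2nb() compares polygon boundaries pairwise, so the cost grows
# quickly with the number of polygons
system.time(poly2nb(shape_df[1:100, ]))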
Glad to have tried to help you; do not hesitate to question my answer!
Source: https://stackoverflow.com/questions/56077248/poly2nb-function-takes-too-much-time-to-be-computed