Parsing osm.pbf data using GDAL/OGR python module

Submitted by 断了今生、忘了曾经 on 2019-12-07 12:59:32

Question


I'm trying to extract data from an .osm.pbf file using the Python GDAL/OGR module.

Currently my code looks like this:

from osgeo import gdal, ogr

osm = ogr.Open('file.osm.pbf')

# Select the multipolygons layer (index 3)
layer = osm.GetLayer(3)
# Create list to store pubs
pubs = []
for feat in layer:
    if feat.GetField('amenity') == 'pub':
        pubs.append(feat)

This little bit of code works fine with small .pbf files (15 MB). However, when parsing files larger than 50 MB I get the following error:

 ERROR 1: Too many features have accumulated in points layer. Use OGR_INTERLEAVED_READING=YES MODE

When I turn this mode on with:

gdal.SetConfigOption('OGR_INTERLEAVED_READING', 'YES')

ogr does not return any features at all anymore, even when parsing small files.

Does anyone know what is going on here?


Answer 1:


Thanks to scai's answer I was able to figure it out.

The special reading pattern required for interleaved reading, described at gdal.org/1.11/ogr/drv_osm.html, is translated into a working Python example below.

The example extracts all features in an .osm.pbf file that have the 'amenity=pub' tag:

from osgeo import gdal, ogr

gdal.SetConfigOption('OGR_INTERLEAVED_READING', 'YES')
osm = ogr.Open('file.osm.pbf')

# Number of layers exposed by the driver
nLayerCount = osm.GetLayerCount()

thereIsDataInLayer = True

pubs = []

while thereIsDataInLayer:

    thereIsDataInLayer = False

    # Cycle through the available layers
    for iLayer in range(nLayerCount):

        lyr = osm.GetLayer(iLayer)

        # Not every layer necessarily exposes an 'amenity' field
        hasAmenity = lyr.GetLayerDefn().GetFieldIndex('amenity') >= 0

        # Get the first feature currently available in this layer
        feat = lyr.GetNextFeature()

        while feat is not None:

            thereIsDataInLayer = True

            # Do something with the feature; in this case, store a copy in a list.
            # Clone() keeps the stored copy valid after the original is destroyed.
            if hasAmenity and feat.GetField('amenity') == 'pub':
                pubs.append(feat.Clone())

            # Release the feature before fetching the next one
            feat.Destroy()

            feat = lyr.GetNextFeature()

As far as I understand it, a while-loop is needed instead of a for-loop because, with interleaved reading, the feature count of a layer cannot be obtained up front.
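
As a possible alternative to looping over the layers by hand, the OSM driver documentation for GDAL 2.2 and later recommends iterating at the dataset level with GetNextFeature(), which hands out features in the order the driver produces them. The following is only a sketch under assumptions: it assumes the Python binding returns a (feature, layer) pair and (None, None) at the end, and the helper names iter_features, collect_pubs and open_osm are made up for illustration.

```python
def iter_features(dataset):
    """Yield (layer_name, feature) pairs in the order the driver produces them.

    Assumes dataset.GetNextFeature() returns a (feature, layer) pair and
    (None, None) once the file is exhausted.
    """
    while True:
        feature, layer = dataset.GetNextFeature()
        if feature is None:
            break
        yield layer.GetName(), feature


def field_or_none(feature, name):
    """Return a field value, or None if the feature has no such field."""
    if feature.GetFieldIndex(name) < 0:
        return None
    return feature.GetField(name)


def collect_pubs(dataset):
    """Collect plain-dict records for features tagged amenity=pub.

    Plain values are stored instead of feature objects, so nothing in the
    result depends on the lifetime of an OGR feature.
    """
    pubs = []
    for layer_name, feature in iter_features(dataset):
        if field_or_none(feature, "amenity") == "pub":
            pubs.append({
                "layer": layer_name,
                "fid": feature.GetFID(),
                "name": field_or_none(feature, "name"),
            })
    return pubs


def open_osm(path):
    """Open an .osm.pbf file as a vector dataset (hypothetical helper)."""
    # Imported lazily so the helpers above can be tried without GDAL installed.
    from osgeo import gdal
    gdal.UseExceptions()
    return gdal.OpenEx(path, gdal.OF_VECTOR)
```

With GDAL installed, usage would look like pubs = collect_pubs(open_osm('file.osm.pbf')).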

More clarification on why this piece of code works like it does would be greatly appreciated.
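
One way to picture why the outer loop has to repeat until a full pass over all layers yields nothing is a small toy model. The classes below are purely hypothetical stand-ins, not GDAL's real internals: they assume the file is read once, block by block, that each block drops features into per-layer buffers, and that reading only advances when every buffer has been emptied.

```python
from collections import deque


class ToyDataset:
    """Toy model of a file that is parsed once, block by block (an assumption)."""

    def __init__(self, blocks, layer_count):
        # Each block is a list of (layer_index, feature) pairs.
        self._blocks = deque(blocks)
        self._buffers = [deque() for _ in range(layer_count)]

    def GetLayerCount(self):
        return len(self._buffers)

    def GetLayer(self, index):
        return ToyLayer(self, index)

    def next_feature(self, index):
        own = self._buffers[index]
        # Parse further only while no layer has features waiting to be consumed.
        while not own and self._blocks and not any(self._buffers):
            for layer_index, feature in self._blocks.popleft():
                self._buffers[layer_index].append(feature)
        return own.popleft() if own else None


class ToyLayer:
    """Stand-in for a layer; it only exposes GetNextFeature()."""

    def __init__(self, dataset, index):
        self._dataset = dataset
        self._index = index

    def GetNextFeature(self):
        return self._dataset.next_feature(self._index)


def read_interleaved(dataset):
    """Same loop structure as the example above, applied to the toy model."""
    features = []
    has_data = True
    while has_data:
        has_data = False
        for i in range(dataset.GetLayerCount()):
            layer = dataset.GetLayer(i)
            feature = layer.GetNextFeature()
            while feature is not None:
                has_data = True
                features.append((i, feature))
                feature = layer.GetNextFeature()
    return features


blocks = [
    [(0, "node-a"), (0, "node-b")],
    [(1, "way-a")],
    [(0, "node-c"), (1, "way-b")],
]
print(read_interleaved(ToyDataset(blocks, 2)))
# Reading a single layer in isolation stalls in this model:
print(ToyDataset(blocks, 2).GetLayer(1).GetNextFeature())
```

In this model a layer returning None only means "nothing buffered for this layer right now", so one pass over the layers is not enough, and reading a single layer on its own yields nothing because the other layer's buffer is never drained. Whether the real driver behaves exactly this way is not confirmed here; the sketch only illustrates the shape of the reading pattern.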



Source: https://stackoverflow.com/questions/35439205/parsing-osm-pbf-data-using-gdal-ogr-python-module
