How to index (ingest) geo data (Geometry, GeometryCollection) as GeoShape in ElasticSearch with C#, Nest, NetTopologySuite from GeoJson file / string?

Posted by 久未见 on 2021-01-29 15:41:36

Question


Summary

I want to properly index (ingest) geo data (Geometry, GeometryCollection) as GeoShape in ElasticSearch using C#, Nest and NetTopologySuite (NTS) from GeoJson files or string representations.

I'm using the following stack: ElasticSearch 7.10.1, NEST 7.10.1, NetTopologySuite 2.1.0, NetTopologySuite.IO.GeoJSON 2.0.4.

In my GitHub GIST you can find the two sample files (postal-area.geojson and the geojson file used as a sample for Scenario #7), along with the code presented below showing what I've tried so far.


My attempts

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;
using Elasticsearch.Net;
using Nest;
using Nest.JsonNetSerializer;
using NetTopologySuite.Features;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using NetTopologySuite.IO;
using NetTopologySuite.Geometries;
using NetTopologySuite.IO.Converters;
using Newtonsoft.Json.Converters;
using Coordinate = NetTopologySuite.Geometries.Coordinate;
using GeometryCollection = NetTopologySuite.Geometries.GeometryCollection;

private static void Main()
{
    try {
        var defaultIndex = "my_shapes";

        string cloudId = "cloudId";
        string username = "username";
        string password = "password";
        var credentials = new BasicAuthenticationCredentials(username, password);
        //var pool = new SingleNodeConnectionPool(new Uri($"http://localhost:9200"));
        var pool = new CloudConnectionPool(cloudId, credentials);
        var settings = new ConnectionSettings(pool, (c, s) =>
            new JsonNetSerializer(c, s, contractJsonConverters: new JsonConverter[] 
            {
                    new AttributesTableConverter(),
                    new CoordinateConverter(),
                    new EnvelopeConverter(),
                    new FeatureConverter(),
                    new FeatureCollectionConverter(),
                    new GeometryConverter(),
                    new GeometryArrayConverter(),
                    new StringEnumConverter()
            }))
            .DefaultIndex(defaultIndex)
            .DisableDirectStreaming()
            .PrettyJson()
            .OnRequestCompleted(callDetails => {
                if (callDetails.RequestBodyInBytes != null) {
                    var json = JObject.Parse(Encoding.UTF8.GetString(callDetails.RequestBodyInBytes));

                    Console.WriteLine(
                        $"{callDetails.HttpMethod} {callDetails.Uri} \n" +
                    $"{json.ToString(Newtonsoft.Json.Formatting.Indented)}");
                }
                else {
                    Console.WriteLine($"{callDetails.HttpMethod} {callDetails.Uri}");
                }

                Console.WriteLine();

                if (callDetails.ResponseBodyInBytes != null) {
                    Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
                        $"{Encoding.UTF8.GetString(callDetails.ResponseBodyInBytes)}\n" +
                    $"{new string('-', 30)}\n");
                }
                else {
                    Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
                        $"{new string('-', 30)}\n");
                }
            });

        var client = new ElasticClient(settings);

        var createIndexResponse = client.Indices.Create(defaultIndex, c => c
            .Map<MyDocument>(m => m
                .Properties(p => p
                    .GeoShape(g => g
                        .Name(n => n.Geometry)
                    )
                )
            )
        );

        if (!createIndexResponse.IsValid) {
            throw new Exception($"Error creating index: {createIndexResponse.DebugInformation}");
        }

        IndexResponse indexResponse;
        MyDocument document;
        Geometry geometryPolygon;
        FeatureCollection featureCollection;

        //Working Scenario #1: Geometry from mock Polygon ------------------- works
        var polygon = new Polygon(new LinearRing(new [] {
            new Coordinate(0, 0),
            new Coordinate(0, 4),
            new Coordinate(4, 4),
            new Coordinate(4, 0),
            new Coordinate(0, 0)
        }));
        document = new MyDocument(1, polygon);
        indexResponse = client.IndexDocument(document);
        //End of Scenario #1 -------------------

        //Working Scenario #2:  Geometry from FeatureCollection from real GeoJson file ------------------- works
        var geojsonFileName = @"..\..\..\_GeoDataFiles\GeoJSONs\PostalArea.geojson";
        var jsonData = File.ReadAllText(geojsonFileName);
        featureCollection = new GeoJsonReader().Read<FeatureCollection>(jsonData);
        if (featureCollection == null) return;
        var geometry = featureCollection[0].Geometry;
        document = new MyDocument(1, geometry);
        indexResponse = client.IndexDocument(document);
        //End of Scenario #2-------------------

        //NOT Working Scenario #3: Geometry deserialized (with GeoJsonSerializer) from mock GeoJson string -------------------
        //excluded coordinates arrays for clarity
        var geoJsonPolygonStr1 = "{\"type\":\"Polygon\",\"coordinates\":[ ... ]}";
        var serializer = new NetTopologySuite.IO.GeoJsonSerializer();
        using (var stringReader = new StringReader(geoJsonPolygonStr1))
        using (var jsonReader = new JsonTextReader(stringReader))
        {
            /*Error:
                {"Could not create an instance of type NetTopologySuite.Geometries.Geometry. 
                Type is an interface or abstract class and cannot be instantiated. 
                Path 'type', line 2, position 8."}*/
            geometryPolygon = serializer.Deserialize<Geometry>(jsonReader);
        }
        document = new MyDocument(1, geometryPolygon);
        indexResponse = client.IndexDocument(document);
        //End of Scenario #3 -------------------

        //NOT Working Scenario #4: Geometry deserialized (with JsonConvert) from mock GeoJson string -------------------
        //excluded coordinates arrays for clarity
        var geoJsonPolygonStr2 = "{\"type\":\"Polygon\",\"coordinates\":[ ... ]}";
        /*Error:
            {"Could not create an instance of type NetTopologySuite.Geometries.Geometry. 
            Type is an interface or abstract class and cannot be instantiated. 
            Path 'type', line 2, position 8."}*/
        geometryPolygon = JsonConvert.DeserializeObject<Geometry>(geoJsonPolygonStr2);
        document = new MyDocument(1, geometryPolygon);
        indexResponse = client.IndexDocument(document);
        //End of Scenario #4 -------------------

        //NOT Working Scenario #5: GeometryCollection deserialized (with JsonConvert) from mock GeoJson string -------------------
        var geoCollectionMock =
                    @"{""type"": ""geometrycollection"",
                        ""geometries"": ["
                            + geoJsonPolygonStr1 +
                            ","
                            + geoJsonPolygonStr2 +
                        @"]
                    }";
        /*Error:
            {"Could not create an instance of type NetTopologySuite.Geometries.Geometry. 
            Type is an interface or abstract class and cannot be instantiated. 
            Path 'type', line 2, position 8."}*/
        geometryPolygon = JsonConvert.DeserializeObject<Geometry>(geoCollectionMock);
        document = new MyDocument(1, geometryPolygon);
        indexResponse = client.IndexDocument(document);
        //End of Scenario #5 -------------------
        
        //Weird Scenario #6: GeometryCollection built from multiple Geometry objects from FeatureCollection from real GeoJson file -------------------
        //Data is ingested into the ElasticSearch index, BUT the polygons from the GeometryCollection can't be seen on Kibana Maps the way other simple Polygons can
        var geoCollectionObj = new NetTopologySuite.Geometries.GeometryCollection(new[]
                {
                    featureCollection[0].Geometry,
                    featureCollection[1].Geometry,
                    featureCollection[2].Geometry 
                });
        document = new MyDocument(1, geoCollectionObj);
        indexResponse = client.IndexDocument(document);
        //End of Scenario #6 -------------------

        //Not working Scenario #7: Geometry from FeatureCollection from real GeoJson file - invalid Geometry -------------------
        var isValid = featureCollection[0].Geometry.IsValid;//= false
        /*Error:
            "type" : "mapper_parsing_exception",
            "reason" : "failed to parse field [geometry] of type [geo_shape]",
            "caused_by" : {
                "type" : "invalid_shape_exception",
                "reason" : "Self-intersection at or near point [-3.173,57.545]"
            }*/
        document = new MyDocument(99, featureCollection[99].Geometry);
        indexResponse = client.IndexDocument(document);
        //End of Scenario #7 -------------------


        if (!indexResponse.IsValid) {
            throw new Exception($"Error indexing document: {indexResponse.DebugInformation}");
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine($"General error: {ex}");
    }
}

public class MyDocument {
    public MyDocument(int id, Geometry geometry) {
        Id = id;
        Geometry = geometry;
    }

    public int Id { get; set; }
    
    public Geometry Geometry { get; set; }
}
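
For Scenarios #3 to #5, one direction that might be worth trying is the same GeoJsonReader that already works for the FeatureCollection in Scenario #2, but with Geometry as the target type. The following is only a minimal sketch under assumptions: the class name GeoJsonParsingSketch, the helper ParseGeometry and the sample coordinates (the square from Scenario #1) are hypothetical and not part of my project, and the elided "[ ... ]" coordinate arrays above would have to be real arrays for any parser to accept them.

```csharp
using System;
using NetTopologySuite.Geometries;
using NetTopologySuite.IO;

public static class GeoJsonParsingSketch
{
    // Hypothetical helper: parses a bare GeoJSON geometry string into an NTS Geometry.
    public static Geometry ParseGeometry(string geoJson)
    {
        // GeoJsonReader builds a serializer with the NTS GeoJSON converters registered,
        // so the abstract Geometry type can be resolved to a concrete subtype.
        return new GeoJsonReader().Read<Geometry>(geoJson);
    }

    public static void Main()
    {
        // Illustrative data only: the square from Scenario #1.
        const string polygonJson =
            "{\"type\":\"Polygon\",\"coordinates\":[[[0,0],[0,4],[4,4],[4,0],[0,0]]]}";

        // RFC 7946 spells the type name "GeometryCollection" (case-sensitive).
        const string collectionJson =
            "{\"type\":\"GeometryCollection\",\"geometries\":[" + polygonJson + "]}";

        Geometry polygon = ParseGeometry(polygonJson);
        Geometry collection = ParseGeometry(collectionJson);

        Console.WriteLine(polygon.GeometryType);
        Console.WriteLine(collection.GeometryType);
        Console.WriteLine(collection.NumGeometries);
    }
}
```

The parsed Geometry could then be passed to MyDocument exactly as in Scenario #1. Whether plain JsonConvert can be made to work as well probably depends on registering the same converters in its settings, which this sketch does not attempt.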

This is the GeoJson file from my GitHub GIST used as a sample for Scenario #7. It seems to be valid and it's displayed on other platforms (GitHub mapbox map preview, QGIS, geojson.io)
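
To narrow down Scenario #7 on the NTS side before the document reaches ElasticSearch, something like the sketch below could be used. It relies on IsValidOp from NetTopologySuite.Operation.Valid to report why a geometry is invalid, and on the common Buffer(0) trick as a possible repair. The class ValiditySketch, the helper RepairIfInvalid and the bow-tie test polygon are hypothetical, and it is an assumption (not something I have verified) that a geometry NTS considers valid will also pass the ElasticSearch geo_shape checks.

```csharp
using System;
using NetTopologySuite.Geometries;
using NetTopologySuite.Operation.Valid;

public static class ValiditySketch
{
    // Hypothetical helper: returns the input if NTS considers it valid,
    // otherwise logs the reason and returns the result of Buffer(0).
    public static Geometry RepairIfInvalid(Geometry geometry)
    {
        var validator = new IsValidOp(geometry);
        if (validator.IsValid)
        {
            return geometry;
        }

        // ValidationError describes the first problem found (e.g. a self-intersection).
        Console.WriteLine($"Invalid geometry: {validator.ValidationError.Message}");

        // Buffer(0) rebuilds the polygon topology. It is a heuristic and can
        // drop part of the area for shapes such as bow-ties.
        return geometry.Buffer(0);
    }

    public static void Main()
    {
        // Illustrative self-intersecting "bow-tie" ring, not taken from the sample file.
        var bowTie = new Polygon(new LinearRing(new[]
        {
            new Coordinate(0, 0),
            new Coordinate(4, 4),
            new Coordinate(4, 0),
            new Coordinate(0, 4),
            new Coordinate(0, 0)
        }));

        Geometry repaired = RepairIfInvalid(bowTie);

        Console.WriteLine(bowTie.IsValid);
        Console.WriteLine(repaired.IsValid);
    }
}
```

Because Buffer(0) may change the shape, comparing the area before and after the repair would be a sensible sanity check on real data.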


Questions

  1. Regarding the non-working scenarios (#3, #4, #5), how can I deserialize a GeoJson string into a Geometry object?
  2. Regarding Scenario #6, why isn't GeometryCollection data visible on Kibana Maps the way simple Geometry objects (Polygons) are?
    • 2.1. I don't know if it's related, but when dragging and zooming on Kibana Maps I get this error in the browser JS console:
      message: "Input data given to 'cfe5e9a5-de63-4beb-85b2-4b67ad455ae9' is not a valid GeoJSON object."
      __proto__: Object
      overrideMethod @ react_devtools_backend.js:2430
      Ut.fire @ maps.chunk.1.js:31

  3. What are the differences and pros / cons of ingesting ElasticSearch GeoShape data as multiple individual Geometry objects (polygons in individual documents) rather than as one single GeometryCollection (polygons in a single document)?
    • 3.1. What about indexing (execution time and performance, even when using Bulk Index)?
    • 3.2. What about querying, searching, filtering (usability, performance, etc.)?
  4. Regarding Scenario #7, why do some Geometries seem invalid?
    • 4.1. Does NTS have a MakeValid method? Or how can I fix this in C#?
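
To make the "multiple individual documents" option from question 3 concrete, here is a rough sketch of what I have in mind: one document per feature, sent in a single bulk request through NEST's IndexMany extension. ShapeDocument (a stand-in mirroring MyDocument), BulkIndexSketch, ToDocuments and IndexAll are hypothetical names, and the sequential ids are an assumption made only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using Nest;
using NetTopologySuite.Features;
using NetTopologySuite.Geometries;

// Hypothetical document type mirroring MyDocument from the code above.
public class ShapeDocument
{
    public ShapeDocument(int id, Geometry geometry)
    {
        Id = id;
        Geometry = geometry;
    }

    public int Id { get; set; }

    public Geometry Geometry { get; set; }
}

public static class BulkIndexSketch
{
    // Maps every feature to its own document, with ids 1..N (illustrative choice).
    public static List<ShapeDocument> ToDocuments(FeatureCollection features)
    {
        return features
            .Select((feature, i) => new ShapeDocument(i + 1, feature.Geometry))
            .ToList();
    }

    // Sends all documents in one bulk request to the client's default index.
    public static BulkResponse IndexAll(IElasticClient client, FeatureCollection features)
    {
        return client.IndexMany(ToDocuments(features));
    }
}
```

With this layout, a single rejected polygon shows up as one failed bulk item (BulkResponse.ItemsWithErrors) while the other documents are still indexed, whereas with a GeometryCollection I would expect the whole document to be rejected.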

Disclaimer / other notes

The code is copied and adapted from RussCam's GIST.
An initial discussion on this topic can be found in this StackOverflow Chat room.
The current code might be useful for others who are trying this and started from the old example and article from RussCam.
The code is extracted from my source files, so please bear with me if you find some typos.


Source: https://stackoverflow.com/questions/65689842/how-to-index-ingest-geo-data-geometry-geometrycollection-as-geoshape-in-ela