Fetching esJsonRDD from elasticsearch with complex filtering in Spark

Submitted by 大城市里の小女人 on 2019-12-12 10:23:19

Question


I am currently fetching an Elasticsearch RDD in our Spark job, filtering with a one-line URI query such as:

val elasticRdds = sparkContext.esJsonRDD(esIndex, s"?default_operator=AND&q=director.name:DAVID + \n movie.name:SEVEN")

Now suppose the search query becomes more complex, like this:

{
    "query": {
        "filtered": {
            "query": {
                "query_string": {
                    "default_operator": "AND",
                    "query": "director.name:DAVID + \n movie.name:SEVEN"
                }
            },
            "filter": {
                "nested": {
                    "path": "movieStatus.boxoffice.status",
                    "query": {
                        "bool": {
                            "must": [
                                {
                                    "match": {
                                        "movieStatus.boxoffice.status.rating": "A"
                                    }
                                },
                                {
                                    "match": {
                                        "movieStatus.boxoffice.status.oscar": "false"
                                    }
                                }
                            ]
                        }
                    }
                }
            }
        }
    }
}

Can I still convert that query into a one-line URI query to use with esJsonRDD? Or is there any way to pass the query above to esJsonRDD as is? If not, what is a better way to fetch such RDDs in Spark?

I ask because esJsonRDD seems to accept only inline (one-line) Elasticsearch queries.


Answer 1:


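The query argument of esJsonRDD is not limited to URI queries; it also accepts a full query DSL body. Write the JSON as a multi-line string using Scala's triple quotes: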

import org.elasticsearch.spark._  // adds esJsonRDD to SparkContext

// Triple quotes keep the JSON verbatim, so the inner double quotes need no escaping.
val query = """{
    "query": {
        "filtered": {
            "query": {
                "query_string": {
                    "default_operator": "AND",
                    "query": "director.name:DAVID + \n movie.name:SEVEN"
                }
            },
            "filter": {
                "nested": {
                    "path": "movieStatus.boxoffice.status",
                    "query": {
                        "bool": {
                            "must": [
                                {
                                    "match": {
                                        "movieStatus.boxoffice.status.rating": "A"
                                    }
                                },
                                {
                                    "match": {
                                        "movieStatus.boxoffice.status.oscar": "false"
                                    }
                                }
                            ]
                        }
                    }
                }
            }
        }
    }
}"""

val elasticRdds = sparkContext.esJsonRDD(esIndex, query)
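
If the director, rating, or Oscar values vary between jobs, the same triple-quoted string can be combined with the `s` interpolator and `stripMargin`. The following is only a sketch: `EsQueryBuilder`, `escape`, and `nestedStatusQuery` are hypothetical names that are not part of elasticsearch-hadoop, and the escaping assumes the values contain no control characters. It keeps the `filtered` query from the question, which newer Elasticsearch versions have replaced with `bool`.

```scala
object EsQueryBuilder {

  // Minimal JSON string escaping: backslashes first, then double quotes.
  // Assumption: values contain no control characters (tabs, newlines, ...).
  private def escape(value: String): String =
    value.replace("\\", "\\\\").replace("\"", "\\\"")

  // Hypothetical helper: fills the query from the question with
  // caller-supplied values and returns the JSON body as a String.
  def nestedStatusQuery(queryString: String, rating: String, oscar: String): String =
    s"""{
       |  "query": {
       |    "filtered": {
       |      "query": {
       |        "query_string": {
       |          "default_operator": "AND",
       |          "query": "${escape(queryString)}"
       |        }
       |      },
       |      "filter": {
       |        "nested": {
       |          "path": "movieStatus.boxoffice.status",
       |          "query": {
       |            "bool": {
       |              "must": [
       |                { "match": { "movieStatus.boxoffice.status.rating": "${escape(rating)}" } },
       |                { "match": { "movieStatus.boxoffice.status.oscar": "${escape(oscar)}" } }
       |              ]
       |            }
       |          }
       |        }
       |      }
       |    }
       |  }
       |}""".stripMargin
}
```

The result of `EsQueryBuilder.nestedStatusQuery("director.name:DAVID AND movie.name:SEVEN", "A", "false")` can then be passed as the second argument of `sparkContext.esJsonRDD(esIndex, query)`, exactly as in the answer. One thing to watch: unlike a plain triple-quoted string, an `s"""..."""` string still processes escape sequences, so a literal `\n` written inside it would become a real line break within the JSON string value.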


Source: https://stackoverflow.com/questions/44527299/fetching-esjsonrdd-from-elasticsearch-with-complex-filtering-in-spark
