elasticsearch-py

Elasticsearch analyze() not compatible with Spark in Python?

久未见 submitted on 2019-12-25 06:59:32
Question: I'm using the elasticsearch-py client within PySpark on Python 3, and I'm running into a problem using the analyze() function with ES on an RDD. Each record in my RDD is a string of text, and I'm trying to analyze it to extract the token information, but I get an error when I call it inside a map function in Spark. For example, this works perfectly fine: from elasticsearch import Elasticsearch es = Elasticsearch() t = 'the quick brown fox' es
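The error message is cut off in the excerpt above, so the cause is not certain. One common failure with this pattern is that a client created on the driver gets captured by the function passed to map, and Spark cannot serialize it for the workers. A minimal sketch under that assumption builds the client inside each partition instead. `analyze_partition` and `make_client` are hypothetical names, and the `body=` form of `indices.analyze` assumes a 6.x/7.x elasticsearch-py client.

```python
from typing import Any, Callable, Iterable, Iterator, List


def analyze_partition(
    records: Iterable[str],
    make_client: Callable[[], Any],
    analyzer: str = "standard",
) -> Iterator[List[str]]:
    """Yield the list of token strings for each text record in one partition."""
    # Build the client here, on the worker, so it never has to be pickled.
    client = make_client()
    for text in records:
        response = client.indices.analyze(body={"analyzer": analyzer, "text": text})
        yield [entry["token"] for entry in response["tokens"]]


# Hypothetical usage on an RDD of strings (needs pyspark and elasticsearch):
#   from elasticsearch import Elasticsearch
#   tokens = rdd.mapPartitions(lambda part: analyze_partition(part, Elasticsearch))
```

Creating one client per partition rather than one per record keeps the connection overhead low while still avoiding any serialization of the client.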

Elasticsearch is not sorting the results

风流意气都作罢 submitted on 2019-12-13 12:24:04
Question: I'm having a problem with an Elasticsearch query. I want to be able to sort the results, but Elasticsearch is ignoring the sort tag. Here is my query: { "sort": [{ "title": {"order": "desc"} }], "query":{ "term": { "title": "pagos" } } } However, when I remove the query part and send only the sort tag, it works. Can anyone point out the correct way? I also tried the following query, which is the complete query that I have: { "sort": [{ "title": {"order": "asc"} }], "query":{ "bool":{

Elasticsearch delete_by_query wrong usage

蹲街弑〆低调 submitted on 2019-12-12 01:24:24
Question: I am using 2 similar ES methods to load and delete documents: result = es.search(index='users_favourite_documents', doc_type='favourite_document', body={"query": {"match": {'user': user}}}) And: result = es.delete_by_query(index='users_favourite_documents', doc_type='favourite_document', body={"query": {"match": {'user': user}}}) The first one works and returns the expected records. The second one throws an exception: "TransportError(404,'{ \"found\":false, \"_index\":\"users_favourite_documents\", \"
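The error body is truncated in the excerpt, so the following is a guess at the cause: Elasticsearch 2.x removed delete-by-query from the core API (it returned in 5.0 as `_delete_by_query`), and on such a cluster the request can come back as a 404. A possible workaround is to search for the matching documents and delete them in bulk. The sketch below only builds the bulk actions; `delete_actions` is a hypothetical helper, and the commented usage assumes `elasticsearch.helpers.scan` and `elasticsearch.helpers.bulk`.

```python
from typing import Any, Dict, Iterable, Iterator


def delete_actions(hits: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Turn search hits into bulk 'delete' actions, one per hit."""
    for hit in hits:
        yield {
            "_op_type": "delete",
            "_index": hit["_index"],
            "_type": hit["_type"],
            "_id": hit["_id"],
        }


# Hypothetical usage with the query from the question (needs elasticsearch):
#   from elasticsearch.helpers import bulk, scan
#   hits = scan(es, index='users_favourite_documents',
#               doc_type='favourite_document',
#               query={"query": {"match": {"user": user}}})
#   bulk(es, delete_actions(hits))
```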

Version conflict when using the delete method of elasticsearch-dsl

做~自己de王妃 submitted on 2019-12-11 06:47:32
Question: We're using Elasticsearch in our Django project through the elasticsearch-dsl Python library. We got the following error in production: ConflictError(409, '{"took":7,"timed_out":false,"total":1,"deleted":0,"batches":1,"version_conflicts":1,"noops":0,"retries":{"bulk":0,"search":0},"throttled_millis":0,"requests_per_second":-1.0,"throttled_until_millis":0,"failures":[{"index":"events","type":"_doc","id":"KJ7SpWsBZnen1jNBRWWM","cause":{"type":"version_conflict_engine_exception",

elasticsearch python client - work with many nodes - how to work with sniffer

大城市里の小女人 submitted on 2019-12-10 23:21:03
Question: I have one cluster with 2 nodes. I am trying to understand the best practice for connecting to the nodes and for checking failover when one node is down. From the documentation: es = Elasticsearch( ['esnode1', 'esnode2'], # sniff before doing anything sniff_on_start=True, # refresh nodes after a node fails to respond sniff_on_connection_fail=True, # and also every 60 seconds sniffer_timeout=60 ) So I tried to connect to my nodes like this: client = Elasticsearch([ip1, ip2],sniff_on_start=True,

Elasticsearch is not sorting the results

安稳与你 submitted on 2019-12-04 22:53:23
I'm having a problem with an Elasticsearch query. I want to be able to sort the results, but Elasticsearch is ignoring the sort tag. Here is my query: { "sort": [{ "title": {"order": "desc"} }], "query":{ "term": { "title": "pagos" } } } However, when I remove the query part and send only the sort tag, it works. Can anyone point out the correct way? I also tried the following query, which is the complete query that I have: { "sort": [{ "title": {"order": "asc"} }], "query":{ "bool":{ "should":[ { "match":{ "title":{ "query":"Pagos", "boost":9 } } }, { "match":{ "description":{ "query":
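The excerpt does not include the index mapping, so this is an assumption: if `title` is an analyzed text field, sorting on it orders by individual tokens rather than by the whole title, which can look as if the sort were being ignored. A common remedy is to sort on an unanalyzed sub-field. The sketch below assumes a hypothetical `title.raw` sub-field that would have to be defined in the mapping; `build_sorted_query` is an illustrative name.

```python
from typing import Any, Dict


def build_sorted_query(term: str, sort_field: str = "title.raw",
                       order: str = "desc") -> Dict[str, Any]:
    """Build a term query on `title`, sorted on an unanalyzed sub-field."""
    return {
        # Sort on the raw (unanalyzed) value, not on the tokenized field.
        "sort": [{sort_field: {"order": order}}],
        # The query itself still targets the analyzed field.
        "query": {"term": {"title": term}},
    }


body = build_sorted_query("pagos")
```

The resulting dictionary can be passed as the `body` argument of the client's `search` call.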

python elasticsearch client set mappings during create index

随声附和 submitted on 2019-11-29 22:56:39
I can set the mappings of an index being created with a curl command like this: { "mappings":{ "logs_june":{ "_timestamp":{ "enabled":"true" }, "properties":{ "logdate":{ "type":"date", "format":"dd/MM/yyy HH:mm:ss" } } } } } But I need to create that index with the Elasticsearch client in Python and set the mappings. How can I do that? I tried the following, but it does not work: self.elastic_con = Elasticsearch([host], verify_certs=True) self.elastic_con.indices.create(index="accesslog", ignore=400) params = "{\"mappings\":{\"logs_june\":{\"_timestamp\": {\"enabled\": \"true\"},\"properties\":{\"logdate\":{\"type\":
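In the attempt quoted above, the index is created before any mapping is supplied. One way to mirror the curl request is to pass the same JSON structure, as a Python dict, through the `body` argument of `indices.create`. This is a minimal sketch: `create_index_with_mappings` is a hypothetical helper, the mapping is copied from the question as written, and the `body=` and `ignore=` keywords assume a pre-8.x elasticsearch-py client.

```python
from typing import Any, Dict

# Mapping copied from the curl example in the question.
LOG_MAPPINGS: Dict[str, Any] = {
    "logs_june": {
        "_timestamp": {"enabled": "true"},
        "properties": {
            "logdate": {"type": "date", "format": "dd/MM/yyy HH:mm:ss"},
        },
    },
}


def create_index_with_mappings(client: Any, index: str,
                               mappings: Dict[str, Any]) -> Any:
    """Create `index` and set its mappings in the same request."""
    body = {"mappings": mappings}
    # ignore=400 (taken from the question) suppresses the error raised
    # when the index already exists.
    return client.indices.create(index=index, body=body, ignore=400)


# Hypothetical usage (needs elasticsearch and a running cluster):
#   from elasticsearch import Elasticsearch
#   create_index_with_mappings(Elasticsearch([host]), "accesslog", LOG_MAPPINGS)
```

Passing a dict avoids the hand-escaped JSON string from the original attempt; the client serializes it.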

python elasticsearch client set mappings during create index

痞子三分冷 submitted on 2019-11-28 19:48:43
Question: I can set the mappings of an index being created with a curl command like this: { "mappings":{ "logs_june":{ "_timestamp":{ "enabled":"true" }, "properties":{ "logdate":{ "type":"date", "format":"dd/MM/yyy HH:mm:ss" } } } } } But I need to create that index with the Elasticsearch client in Python and set the mappings. How can I do that? I tried the following, but it does not work: self.elastic_con = Elasticsearch([host], verify_certs=True) self.elastic_con.indices.create(index="accesslog", ignore=400) params = "{\

How to update a document using elasticsearch-py?

北战南征 submitted on 2019-11-28 05:44:39
Does anyone have an example of how to use update? It's documented here, but the documentation is unclear and doesn't include a working example. I've tried the following: coll = Elasticsearch() coll.update(index='stories-test',doc_type='news',id=hit.meta.id, body={"stanford": 1, "parsed_sents": parsed }) and I get elasticsearch.exceptions.RequestError: TransportError(400, u'ActionRequestValidationException[Validation Failed: 1: script or doc is missing;]') I would like to update using a partial doc, but the update method doesn't take any argument named 'doc' or 'document'. You're almost
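The validation error quoted above ("script or doc is missing") points at the request body: the Update API expects the changed fields nested under a top-level `doc` key rather than passed as a separate argument. A minimal sketch, assuming the same `doc_type`-era client as in the question; `partial_update` is a hypothetical wrapper, not part of elasticsearch-py.

```python
from typing import Any, Dict


def partial_update(client: Any, index: str, doc_type: str, doc_id: str,
                   fields: Dict[str, Any]) -> Any:
    """Apply a partial-document update by wrapping `fields` under 'doc'."""
    return client.update(
        index=index,
        doc_type=doc_type,
        id=doc_id,
        body={"doc": fields},  # the 'doc' wrapper is what the error asks for
    )


# Hypothetical usage with the values from the question:
#   partial_update(coll, 'stories-test', 'news', hit.meta.id,
#                  {"stanford": 1, "parsed_sents": parsed})
```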

How to update a document using elasticsearch-py?

▼魔方 西西 submitted on 2019-11-27 01:14:42
Question: Does anyone have an example of how to use update? It's documented here, but the documentation is unclear and doesn't include a working example. I've tried the following: coll = Elasticsearch() coll.update(index='stories-test',doc_type='news',id=hit.meta.id, body={"stanford": 1, "parsed_sents": parsed }) and I get elasticsearch.exceptions.RequestError: TransportError(400, u'ActionRequestValidationException[Validation Failed: 1: script or doc is missing;]') I would like to update using a