Index JSON files in elasticsearch using Python?

Submitted by 大兔子大兔子 on 2020-07-06 17:42:27

Question


I have a bunch of JSON files (100), named merged_file 1.json, merged_file 2.json, and so on.

How do I index all these files into Elasticsearch using Python (elasticsearch_dsl)?

I am using this code, but it doesn't seem to work:

from elasticsearch_dsl import Elasticsearch
import json
import os
import sys

es = Elasticsearch()

json_docs =[]

directory = sys.argv[1]

for filename in os.listdir(directory):
    if filename.endswith('.json'):
        with open(filename,'r') as open_file:
            json_docs.append(json.load(open_file))

es.bulk("index_name", "type_name", json_docs)

The JSON looks like this:

{"one":["some data"],"two":["some other data"],"three":["other data"]}

What can I do to make this work?


Answer 1:


For this task you should be using elasticsearch-py (pip install elasticsearch):

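from elasticsearch import Elasticsearch, helpers
import json
import os
import sys

es = Elasticsearch()

def load_json(directory):
    """Yield one parsed document per file, so nothing is held in memory at once."""
    for filename in os.listdir(directory):
        if filename.endswith('.json'):
            # os.listdir returns bare file names, so join them with the directory
            with open(os.path.join(directory, filename), 'r') as open_file:
                yield json.load(open_file)

# doc_type applies to older Elasticsearch versions; mapping types were removed in 8.0
helpers.bulk(es, load_json(sys.argv[1]), index='my-index', doc_type='my-type')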
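
If you want control over the document IDs, the same generator idea can yield explicit bulk actions instead of bare documents. The following is only a sketch under some assumptions: the names generate_actions and main are made up for illustration, the cluster address http://localhost:9200 is a guess at a local setup, and using the file name (without extension) as the document ID is one possible choice, not something the question asks for.

```python
import json
import os
import sys


def generate_actions(directory, index_name):
    """Yield one bulk action per .json file in `directory` (hypothetical helper)."""
    for filename in sorted(os.listdir(directory)):
        if not filename.endswith(".json"):
            continue
        # os.listdir returns bare names, so build the full path before opening
        path = os.path.join(directory, filename)
        with open(path, "r", encoding="utf-8") as open_file:
            document = json.load(open_file)
        yield {
            "_index": index_name,
            # Assumption: the file name without ".json" serves as the document ID
            "_id": os.path.splitext(filename)[0],
            "_source": document,
        }


def main(directory):
    # Imported here so generate_actions can be tried without the client installed
    from elasticsearch import Elasticsearch, helpers

    # Assumption: a cluster is reachable at this local address
    es = Elasticsearch("http://localhost:9200")
    success_count, errors = helpers.bulk(es, generate_actions(directory, "my-index"))
    print(success_count, errors)


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Setting the ID yourself means that re-running the script overwrites the existing documents rather than creating duplicates, which may or may not be what you want.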


Source: https://stackoverflow.com/questions/43981275/index-json-files-in-elasticsearch-using-python
