How to export pandas data to elasticsearch?

前端 未结 4 2081
独厮守ぢ
独厮守ぢ 2021-02-20 00:26

It is possible to export a pandas dataframe data to elasticsearch using elasticsearch-py. For example, here is some code:

https://www.analyticsvidhya.com/bl

相关标签:
4条回答
  • 2021-02-20 00:40

    The following script works for localhost:

    import numpy as np
    import pandas as pd
    
    df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
    
    INDEX="dataframe"
    TYPE= "record"
    
    def rec_to_actions(df):
        import json
        for record in df.to_dict(orient="records"):
            yield ('{ "index" : { "_index" : "%s", "_type" : "%s" }}'% (INDEX, TYPE))
            yield (json.dumps(record, default=int))
    
    from elasticsearch import Elasticsearch
    e = Elasticsearch() # no args, connect to localhost:9200
    if not e.indices.exists(INDEX):
        raise RuntimeError('index does not exists, use `curl -X PUT "localhost:9200/%s"` and try again'%INDEX)
    
    r = e.bulk(rec_to_actions(df)) # return a dict
    
    print(not r["errors"])
    

    Verify using curl -g 'http://localhost:9200/dataframe/_search?q=A:[29%20TO%2039]'

    There are many little things that can be added to suit different needs but main is there.

    0 讨论(0)
  • 2021-02-20 00:41

    may you can use

    pip install es_pandas
    pip install progressbar2
    

    This package should work on Python3(>=3.4) and ElasticSearch should be version 5.x, 6.x or 7.x.

    import time
    import pandas as pd
    from es_pandas import es_pandas
    
    
    # Information of es cluseter
    es_host = 'localhost:9200'
    index = 'demo'
    
    # crete es_pandas instance
    ep = es_pandas(es_host)
    
    # Example data frame
    df = pd.DataFrame({'Alpha': [chr(i) for i in range(97, 128)], 
                        'Num': [x for x in range(31)], 
                        'Date': pd.date_range(start='2019/01/01', end='2019/01/31')})
    
    # init template if you want
    doc_type = 'demo'
    ep.init_es_tmpl(df, doc_type)
    
    # Example of write data to es, use the template you create
    ep.to_es(df, index, doc_type=doc_type)
    # set use_index=True if you want to use DataFrame index as records' _id
    ep.to_es(df, index, doc_type=doc_type, use_index=True)
    

    here is the document https://pypi.org/project/es-pandas/
    if 'es_pandas' cann't solve you problem,you could see other solution : https://towardsdatascience.com/exporting-pandas-data-to-elasticsearch-724aa4dd8f62

    0 讨论(0)
  • 2021-02-20 00:43

    You could use elasticsearch-py or if you won't use elasticsearch-py you may find answer to your question here => index-a-pandas-dataframe-into-elasticsearch-without-elasticsearch-py

    0 讨论(0)
  • 2021-02-20 00:51

    I'm not aware of any to_elastic method integrated in pandas. You can always raise an issue on the pandas github repo or create a pull request.

    However, there is espandas which allows to import a pandas DataFrame to elasticsearch. The following example from the README has been tested with Elasticsearch 6.2.1.

    import pandas as pd
    import numpy as np
    from espandas import Espandas
    
    df = (100 * pd.DataFrame(np.round(np.random.rand(100, 5), 2))).astype(int)
    df.columns = ['A', 'B', 'C', 'D', 'E']
    df['indexId'] = (df.index + 100).astype(str)
    
    INDEX = 'foo_index'
    TYPE = 'bar_type'
    esp = Espandas()
    esp.es_write(df, INDEX, TYPE)
    

    Retrieving the mappings with GET foo_index/_mappings:

    {
      "foo_index": {
        "mappings": {
          "bar_type": {
            "properties": {
              "A": {
                "type": "long"
              },
              "B": {
                "type": "long"
              },
              "C": {
                "type": "long"
              },
              "D": {
                "type": "long"
              },
              "E": {
                "type": "long"
              },
              "indexId": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          }
        }
      }
    }
    
    0 讨论(0)
提交回复
热议问题