what is the better way to index data from Oracle/relational tables into elastic search?

别说谁变了你拦得住时间么 提交于 2019-12-31 17:46:06

问题


What are the options to index large data from Oracle DB to elastic search cluster? Requirement is to index 300Million records one time into multiple indexes and also incremental updates having around approximate 1 Million changes every day.

I have tried JDBC plugin for elasticsearch river/feeder, both seems to be running inside or require locally running elastic search instance. Please let me know if there is any better option for running elastic search indexer as a standalone job (probably java based). Any suggestions will be very helpful. Thanks.


回答1:


We use ES as a reporting db and when new records are written to SQL we take the following action to get them into ES:

  1. Write the primary key into a queue (we use rabbitMQ)
  2. Rabbit picks up the primary key (when it has time) and queries the relation DB to get the info it needs and then writes the data into ES

This process works great because it handles both new data and old data. For old data just write a quick script to write 300M primary keys into rabbit and you're done!




回答2:


there are many integration options - I've listed out a few to give you some ideas, the solution is really going to depend on your specific resources and requirements though.

  1. Oracle Golden Gate will look at the Oracle DB transaction logs and feed them in real-time to ES.
  2. ETL for example Oracle Data Integrator could run on a schedule and pull data from your DB, transform it and send to ES.
  3. Create triggers in the Oracle DB so that data updates can be written to ES using a stored procedure. Or use the trigger to write flags to a "changes" table that some external process (e.g. a Java application) monitors and uses to extract data from the Oracle DB.
  4. Get the application that writes to the Oracle DB to also feed ES. Ideally your application and Oracle DB should be loosely coupled - do you have an integration platform that can feed the messages to both ES and Oracle?


来源:https://stackoverflow.com/questions/27486119/what-is-the-better-way-to-index-data-from-oracle-relational-tables-into-elastic

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!