How to insert data into druid via tranquility

。_饼干妹妹 提交于 2019-12-04 10:01:24

This generally happens when the data you send is out of the window period. If you are inserting data manually, give the exact current timestamp (UTC) in milliseconds. Else it can be easily done if you are using any script to generate data. Make sure it is UTC current time.

It is extremely difficult to setup druid to work properly with real-time data insertion.

The best bet I found is, use https://github.com/implydata . Imply is a set of wrappers around druid, to make it easy to use.

However, the real-time insertion in imply is not perfect either. I had experiment OutOfMemoryException, after inserting 30 millions items via real-time. This will caused data loss on previous inserted 30 millions rows.

The detailed regarding data loss can be found here : https://groups.google.com/forum/#!topic/imply-user-group/95xpYojxiOg

An issue ticket has been filed : https://github.com/implydata/distribution/issues/8

Druid streaming windowPeriod is very short (10 minutes). Outside this period, your event will be ignored.

As you got {"result":{"received":1,"sent":0}}, your worker threads are working fine. Tranquility decides what data is sent to the druid based on the timestamp associated with the data.

This period is decided by the "windowPeriod" configuration. So if your type is realtime ("type":"realtime") and window period is PT10M ("windowPeriod" : "PT10M"), tranquility will send any data between t-10, t+10 and not send anything outside this period.

I disagree with the insertion efficiency problems, we have been sending 3million rows every 15 minutes since June 2016 and has been running beautifully. Of course, we have a stronger infrastructure deemed for the scale.

Another reason for not inserting, is out memory on the coordinador/overloard are running

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!