Can I extend a Flume sink to make it write different data to multiple channels?

Submitted by 余生颓废 on 2019-12-08 02:04:15

Question


This is a follow-up to my previous question about Flume data flows.

I want to process events and send the extracted data further downstream. I'd like to accept large events, such as zipped HTML larger than 5 KB, parse them, and put many small messages, such as the URLs found in each page, onto another channel, and some page metrics onto yet another one. Since parsing pages is resource-intensive, I'd rather not replicate the messages to separate processors for these two tasks, both of which require parsing the HTML and building a DOM in memory. If possible, I'd also like to avoid sending a serialized DOM from the parser to the metrics calculators. Can I extend a sink so that, for every incoming event, it spawns multiple events on multiple outgoing channels? Something like:

                 htmlChannel                urlChannel
HtmlPagesSource -------------> PageParser -------------> UrlConsumer
                    html            |          urls
                                    |
                                    | metricsChannel 
                                    -------------------> MetricsConsumer
                                         metrics
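
To make the intended flow concrete, here is a minimal sketch in plain Java of the "parse once, emit two kinds of events" idea from the diagram. It does not use the Flume API: `SketchEvent`, `PageFanOut`, the `type` header, and the regex-based link extraction are all hypothetical stand-ins chosen for illustration, and `route` only imitates header-based routing to the `urlChannel` and `metricsChannel` named above. Whether a real Flume component can be extended this way is exactly what the question asks.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for an event: a header map plus a text body.
final class SketchEvent {
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;

    SketchEvent(String type, String body) {
        this.headers.put(PageFanOut.TYPE_HEADER, type);
        this.body = body;
    }
}

// Hypothetical parser stage: one pass over the page yields both outputs.
final class PageFanOut {
    static final String TYPE_HEADER = "type"; // assumed header name
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

    // Scan the page once; emit one "url" event per link and one "metrics" event.
    static List<SketchEvent> process(String html) {
        List<SketchEvent> out = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        int links = 0;
        while (m.find()) {
            out.add(new SketchEvent("url", m.group(1)));
            links++;
        }
        out.add(new SketchEvent("metrics", "links=" + links + ",chars=" + html.length()));
        return out;
    }

    // Imitates header-based routing: each event body goes to the channel
    // that matches its "type" header.
    static Map<String, List<String>> route(List<SketchEvent> events) {
        Map<String, List<String>> channels = new LinkedHashMap<>();
        channels.put("urlChannel", new ArrayList<>());
        channels.put("metricsChannel", new ArrayList<>());
        for (SketchEvent e : events) {
            String target = "url".equals(e.headers.get(TYPE_HEADER))
                    ? "urlChannel" : "metricsChannel";
            channels.get(target).add(e.body);
        }
        return channels;
    }
}
```

Calling `PageFanOut.route(PageFanOut.process(html))` on a page with two links puts two bodies on `urlChannel` and one on `metricsChannel`, without the page being parsed twice or a DOM being passed between stages.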

Source: https://stackoverflow.com/questions/21524300/can-i-extend-flume-sink-to-make-it-write-different-data-to-multiple-channels
