apache-storm-topology

Apache Storm spout stops emitting messages from spout

Submitted by 馋奶兔 on 2020-01-03 01:22:31
Question: We have been struggling with this issue for a long time. In short, our Storm topology stops emitting messages from the spout after some time, in a random fashion. We have an automated script which re-deploys the topology at 06:00 UTC every day, after the master data refresh activity is complete. In the last 2 weeks, our topology stopped emitting messages 3 times in late UTC hours (between 22:00 and 02:00). It only comes back online when we restart it, which is around 06:00 UTC. I've searched…
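
A common reason a Storm spout goes quiet without crashing is that topology.max.spout.pending has been reached while in-flight tuples are never acked or failed, so nextTuple is no longer called. A minimal sketch, with hypothetical values and the topology wiring omitted, of the two settings usually checked first when diagnosing this:

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class SubmitWithPendingLimits {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        // ... register the real spout and bolts here (omitted) ...

        Config conf = new Config();
        // Cap on un-acked tuples per spout task; once it is hit and nothing is
        // acked or failed, Storm stops asking the spout for new tuples.
        conf.setMaxSpoutPending(1000);      // hypothetical value
        // Fail tuples that are not fully processed in time so they can be replayed
        // and the pending count can drain again.
        conf.setMessageTimeoutSecs(60);     // hypothetical value

        StormSubmitter.submitTopology("my-topology", conf, builder.createTopology());
    }
}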

Is there any Java API to know when topology is ready for reading first message from Spout

Submitted by 馋奶兔 on 2019-12-25 18:17:42
Question: Our Apache Storm topology listens to messages from Kafka using KafkaSpout and, after doing a lot of mapping/reducing/enrichment/aggregation etc., finally inserts the data into Cassandra. There is another Kafka input where we receive user queries for data; if the topology finds a response, it sends it onto a third Kafka topic. Now we want to write an E2E test using JUnit in which we can programmatically insert data into the topology and then, by inserting a user query message, assert on the third…
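
One way to drive such a test without a real Kafka broker is to swap the KafkaSpout for org.apache.storm.testing.FeederSpout on a LocalCluster and feed tuples into it straight from the JUnit test. A minimal sketch, assuming the rest of the topology wiring and the final assertion are filled in (the field name "message" and the sample payloads are placeholders):

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.testing.FeederSpout;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

public class TopologyE2ESketch {
    public static void main(String[] args) throws Exception {
        // Stand-in for the KafkaSpout so the test can push messages directly.
        FeederSpout feeder = new FeederSpout(new Fields("message"));

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("input", feeder);
        // builder.setBolt("enrich", new EnrichmentBolt()).shuffleGrouping("input");
        // ... the rest of the production wiring (omitted) ...

        LocalCluster cluster = new LocalCluster();
        try {
            cluster.submitTopology("e2e-test", new Config(), builder.createTopology());

            // Programmatically insert a data message and then a user query, give the
            // topology time to process, and assert on whatever test double captured
            // the output that would normally go to the third Kafka topic.
            feeder.feed(new Values("{\"pair\":\"GBPJPY\",\"bid\":134.4563,\"ask\":134.4354}"));
            feeder.feed(new Values("{\"query\":\"GBPJPY\"}"));
            Utils.sleep(5000);
        } finally {
            cluster.shutdown();
        }
    }
}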

kafkaSpout is not emitting messages

Submitted by *爱你&永不变心* on 2019-12-25 02:03:51
Question: I am able to run the Storm Kafka topology with a local cluster but not able to run it with StormSubmitter. Below is my topology code; can anyone please help me solve this issue? :) package com.org.kafka; import org.apache.storm.Config; import org.apache.storm.LocalCluster; import org.apache.storm.generated.AlreadyAliveException; import org.apache.storm.generated.AuthorizationException; import org.apache.storm.generated.InvalidTopologyException; import org.apache.storm.kafka.KafkaSpout; import org.apache.storm…
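
When a topology runs on LocalCluster but fails with StormSubmitter, the usual suspects are the cluster-side classpath (the submitted jar must bundle storm-kafka and the Kafka client, since only storm-core is provided on the cluster) and broker/ZooKeeper addresses that point at localhost. A minimal sketch of a main method that supports both modes; the spout/bolt wiring and worker count are placeholders:

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class KafkaTopologyMain {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        // builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig));
        // builder.setBolt("printer", new PrinterBolt()).shuffleGrouping("kafka-spout");

        Config conf = new Config();
        conf.setNumWorkers(2);  // hypothetical worker count

        if (args.length > 0) {
            // Cluster mode: runs inside worker JVMs on the supervisors, so every
            // non-provided dependency must be packaged into the topology jar.
            StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        } else {
            // Local mode: everything runs in this JVM, which hides classpath problems.
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("kafka-topology-local", conf, builder.createTopology());
        }
    }
}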

Resource clean up after killing storm topology

Submitted by 為{幸葍}努か on 2019-12-24 07:57:20
Question: We have a Storm topology which interacts with a MariaDB database. Our bolts implement the IRichBolt interface and override the lifecycle methods. We open a DB connection in our prepare method and close it in the cleanup method. The cleanup method documentation says: Called when an IBolt is going to be shutdown. There is no guarentee that cleanup will be called, because the supervisor kill -9's worker processes on the cluster. The one context where cleanup is guaranteed to be called is when a…
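
In practice this means cleanup can only be relied on in local mode; on a cluster the worker JVM may be killed with kill -9 before it runs. A minimal sketch of the pattern described above, with hypothetical connection details and that caveat noted in the cleanup method:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

public class MariaDbWriterBolt extends BaseRichBolt {
    private transient Connection connection;
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        try {
            // Hypothetical connection details; open resources once per bolt task.
            connection = DriverManager.getConnection(
                    "jdbc:mariadb://db-host:3306/mydb", "user", "secret");
        } catch (Exception e) {
            throw new RuntimeException("Could not open MariaDB connection", e);
        }
    }

    @Override
    public void execute(Tuple input) {
        // ... write the tuple to MariaDB (omitted) ...
        collector.ack(input);
    }

    @Override
    public void cleanup() {
        // Only guaranteed in local mode; on a cluster the worker may be kill -9'd
        // first, so never rely on this for correctness, only for tidiness.
        try {
            if (connection != null) {
                connection.close();
            }
        } catch (Exception ignored) {
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // terminal bolt: no output fields
    }
}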

Storm causes dependency conflicts on Ignite log4j

Submitted by 此生再无相见时 on 2019-12-24 07:50:07
Question: I am trying to run a Storm topology on a Storm cluster. The topology jar is built with storm-core in provided scope. Inside a topology bolt I want to read data from my MyIgniteCache module, but I get the following error. I think the dependencies (:/usr/hdp/2.6.0.3-8/storm/lib/log4j-slf4j-impl-2.8.jar:/usr/hdp/2.6.0.3-8/storm/lib/log4j-core-2.8.jar:log4j) of storm-core conflict with ignite-log4j. java.lang.IncompatibleClassChangeError: Implementing class at java.lang.ClassLoader.defineClass1(Native Method) ~[?:1.8.0…

How to E2E test functionality of Storm Topology by programmatically inserting messages

Submitted by 非 Y 不嫁゛ on 2019-12-11 17:23:10
Question: Our Apache Storm topology listens to messages from Kafka using KafkaSpout and, after doing a lot of mapping/reducing/enrichment/aggregation etc., finally inserts the data into Cassandra. There is another Kafka input where we receive user queries for data; if the topology finds a response, it sends it onto a third Kafka topic. Now we want to write an E2E test using JUnit in which we can programmatically insert data into the topology and then, by inserting a user query message, assert on the third…
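
Besides feeding a live LocalCluster, Storm ships test support for exactly this: Testing.completeTopology replaces the spouts with mocked, finite sources, runs the topology until the input is drained, and returns every emitted tuple for assertions. A minimal sketch, assuming the spout is registered under the id "kafka-spout" and the bolt to inspect under "cassandra-writer" (both ids are placeholders):

import java.util.List;
import java.util.Map;

import org.apache.storm.Config;
import org.apache.storm.ILocalCluster;
import org.apache.storm.Testing;
import org.apache.storm.generated.StormTopology;
import org.apache.storm.testing.CompleteTopologyParam;
import org.apache.storm.testing.MkClusterParam;
import org.apache.storm.testing.MockedSources;
import org.apache.storm.testing.TestJob;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Values;

public class CompleteTopologySketch {
    public static void main(String[] args) {
        MkClusterParam clusterParam = new MkClusterParam();
        clusterParam.setSupervisors(1);

        Testing.withSimulatedTimeLocalCluster(clusterParam, new TestJob() {
            @Override
            public void run(ILocalCluster cluster) throws Exception {
                TopologyBuilder builder = new TopologyBuilder();
                // builder.setSpout("kafka-spout", new KafkaSpout<>(spoutConfig));
                // builder.setBolt("cassandra-writer", new CassandraWriterBolt())
                //        .shuffleGrouping("kafka-spout");
                StormTopology topology = builder.createTopology();

                // Replace the spout's output with a fixed set of test messages.
                MockedSources mocked = new MockedSources();
                mocked.addMockData("kafka-spout",
                        new Values("{\"pair\":\"GBPJPY\",\"bid\":134.4563,\"ask\":134.4354}"));

                CompleteTopologyParam param = new CompleteTopologyParam();
                param.setMockedSources(mocked);
                param.setStormConf(new Config());

                // Runs until the mocked input is fully processed, then returns
                // every tuple emitted by every component, keyed by component id.
                Map<?, ?> result = Testing.completeTopology(cluster, topology, param);
                List<?> written = Testing.readTuples(result, "cassandra-writer");
                // JUnit assertions on "written" go here.
            }
        });
    }
}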

What can be used as a test stub for CassandraWriterBolt?

Submitted by 。_饼干妹妹 on 2019-12-11 15:09:59
Question: I read a JSON message from Kafka. FieldExtractionBolt reads that JSON, extracts the data into tuple values, and passes them to CassandraWriterBolt, which in turn writes a record to Cassandra, putting all those tuple values into separate columns. JSON message on Kafka: {"pair":"GBPJPY","bid":134.4563,"ask":134.4354} FieldExtractionBolt: String message = tuple.getStringByField("message"); Map values = new Gson().fromJson(message, Map.class); basicOutputCollector.emit(new Values(values.get("pair"),…
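
A simple stub is a bolt that records whatever reaches it instead of writing to Cassandra, so the JUnit test can assert on the captured values afterwards. A minimal sketch; the class and field names are made up for illustration:

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;

// Test double wired in place of CassandraWriterBolt: instead of writing to
// Cassandra it records incoming tuple values for the test to inspect.
public class RecordingWriterBolt extends BaseBasicBolt {
    // Static so the test can read it back; fine in local mode where bolt tasks
    // run inside the test JVM, but not usable on a real cluster.
    public static final List<List<Object>> WRITES = new CopyOnWriteArrayList<>();

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        WRITES.add(tuple.getValues());
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // terminal bolt: nothing emitted downstream
    }
}

In the test topology, set this bolt wherever CassandraWriterBolt is normally wired in and assert on RecordingWriterBolt.WRITES after feeding the input message.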

storm kafka first message is skipped during restart and first start as well

Submitted by 孤人 on 2019-12-11 14:21:45
Question: I have written a sample topology which consumes messages from Kafka and logs them; please find the code snippet below. The end-to-end topology is fine: when I post a message with the Kafka producer, it is consumed properly, and I simply get the message and log it in MessagePrinter. The issue is described below. Use case 1: I brought down the topology, sent messages 1-10, and when I brought the topology back up, messages 2-10 were logged properly by the topology but the first message alone was not logged. Use case 2: same…
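
The excerpt does not show which Kafka spout is in use, but with storm-kafka-client what gets replayed after a (re)start is decided by the consumer group's committed offsets together with the first-poll offset strategy, so that is the first thing to check when the earliest message is skipped. A hedged sketch, with broker, topic and group id as placeholders:

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.storm.kafka.spout.KafkaSpout;
import org.apache.storm.kafka.spout.KafkaSpoutConfig;

public class FirstMessageSpoutFactory {
    public static KafkaSpout<String, String> buildSpout() {
        KafkaSpoutConfig<String, String> spoutConf =
                KafkaSpoutConfig.builder("localhost:9092", "myTopic")          // hypothetical broker/topic
                        .setProp(ConsumerConfig.GROUP_ID_CONFIG, "printer-group")  // hypothetical group id
                        // Start from the earliest uncommitted offset rather than "latest",
                        // so messages published while the topology was down are not skipped.
                        .setFirstPollOffsetStrategy(
                                KafkaSpoutConfig.FirstPollOffsetStrategy.UNCOMMITTED_EARLIEST)
                        .build();
        return new KafkaSpout<>(spoutConf);
    }
}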

Apache Storm Kafka Spout Lag Issue

Submitted by 非 Y 不嫁゛ on 2019-12-11 06:37:17
Question: I am building a Java Spring application using Storm 1.1.2 and Kafka 0.11, to be launched in a Docker container. Everything in my topology works as planned, but under a high load from Kafka, the consumer lag increases more and more over time. My KafkaSpoutConfig: KafkaSpoutConfig<String,String> spoutConf = KafkaSpoutConfig.builder("kafkaContainerName:9092", "myTopic") .setProp(ConsumerConfig.GROUP_ID_CONFIG, "myGroup") .setProp(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, MyObjectDeserializer…
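
The settings usually tuned when lag keeps growing are how much the spout pulls per poll, how many offsets may remain uncommitted, how many tuples may be in flight, and the spout parallelism relative to the partition count; the bottleneck is most often a downstream bolt rather than the spout itself. A sketch along the lines of the config above, with all values hypothetical:

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.kafka.spout.KafkaSpout;
import org.apache.storm.kafka.spout.KafkaSpoutConfig;
import org.apache.storm.topology.TopologyBuilder;

public class LagTuningSketch {
    public static void main(String[] args) throws Exception {
        KafkaSpoutConfig<String, String> spoutConf =
                KafkaSpoutConfig.builder("kafkaContainerName:9092", "myTopic")
                        .setProp(ConsumerConfig.GROUP_ID_CONFIG, "myGroup")
                        .setProp(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 500)  // batch size per poll
                        .setOffsetCommitPeriodMs(10_000)       // commit less often under load
                        .setMaxUncommittedOffsets(250_000)     // allow more in-flight offsets
                        .build();

        TopologyBuilder builder = new TopologyBuilder();
        // Spout executors beyond the topic's partition count sit idle; match them.
        builder.setSpout("kafka-spout", new KafkaSpout<>(spoutConf), 3);
        // ... bolts (omitted); slow bolts, not the spout, are the usual bottleneck ...

        Config conf = new Config();
        conf.setMaxSpoutPending(2_000);  // backpressure: cap in-flight tuples per spout task

        StormSubmitter.submitTopology("my-topology", conf, builder.createTopology());
    }
}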