apache-storm

On Storm 0.10.0, 2 worker processes are launched even when I set workers=1, and the UI reports workers=1

∥☆過路亽.° submitted on 2019-12-24 10:59:42

Question: I have a storm topology for which I do: setNumWorkers(1); When I look at the storm UI report on this running topology, I see Num workers set to 1. However, when I log into the node running the supervisor, I see two processes that have the same setting for -Dworker.id and for -Dworker.port. I am including the output of what ps shows me for these two processes below. My question is: why are there two processes that seem to be configured as worker processes if I only requested one (note: the

Storm UI throwing “Offset lags for kafka not supported for older versions. Please update kafka spout to latest version.”

可紊 submitted on 2019-12-24 08:14:10

Question: I have upgraded my hdp cluster to 2.5 and upgraded topology dependencies of storm-core to 1.0.1 and storm-kafka to 1.0.1. After deploying the new topology with the new 1.0.1 dependencies, everything is working as expected in the back end, but the Storm UI always shows zero for "Acked", "Emitted", "Transferred" etc. The Storm UI shows the message "Offset lags for kafka not supported for older versions. Please update kafka spout to latest version." under "Topology spouts lag error". What does it mean? Answer 1:

Resource clean up after killing storm topology

為{幸葍}努か submitted on 2019-12-24 07:57:20

Question: We have a storm topology which interacts with a MariaDB database. Our bolts implement the IRichBolt interface and override the lifecycle methods. We open a db connection in our prepare method and close it in the cleanup method. The cleanup method documentation says: "Called when an IBolt is going to be shutdown. There is no guarentee that cleanup will be called, because the supervisor kill -9's worker processes on the cluster." The one context where cleanup is guaranteed to be called is when a
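
Since cleanup() is explicitly not guaranteed (kill -9 skips it), one common pattern is to guard the connection so it is closed at most once, whichever path fires first — cleanup() or a best-effort JVM shutdown hook. This is a minimal sketch; GuardedConnection is a hypothetical name, and note that even a shutdown hook does not run on SIGKILL, so the database must still tolerate dropped connections:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical bolt-side resource holder: closes the connection at most once,
// whether the bolt's cleanup() fires or the JVM shutdown hook does.
// Neither path runs on kill -9, so server-side connection timeouts are
// still the real safety net.
public class GuardedConnection implements AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);

    public GuardedConnection() {
        // Best-effort backup for the non-guaranteed cleanup() call.
        Runtime.getRuntime().addShutdownHook(new Thread(this::close));
    }

    @Override
    public void close() {
        // compareAndSet makes close() idempotent, so calling it from both
        // cleanup() and the shutdown hook is safe.
        if (closed.compareAndSet(false, true)) {
            // real code would close the MariaDB java.sql.Connection here
        }
    }

    public boolean isClosed() {
        return closed.get();
    }
}
```

In the bolt you would call close() from cleanup() as before; the hook merely covers graceful JVM exits where cleanup() never ran.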

Storm causes dependency conflicts on Ignite log4j

此生再无相见时 submitted on 2019-12-24 07:50:07

Question: I try to run a storm topology on a Storm cluster. The topology jar is built with storm-core as a provided dependency. Inside a topology bolt I want to read data from my MyIgniteCache module, but I get the following error. I think the dependencies (:/usr/hdp/2.6.0.3-8/storm/lib/log4j-slf4j-impl-2.8.jar:/usr/hdp/2.6.0.3-8/storm/lib/log4j-core-2.8.jar:log4j) of storm-core cause a conflict with ignite-log4j. java.lang.IncompatibleClassChangeError: Implementing class at java.lang.ClassLoader.defineClass1(Native Method) ~[?:1.8.0
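
IncompatibleClassChangeError at class-definition time usually means two incompatible versions of the same logging classes are on the classpath (Storm ships log4j 2.x jars, while ignite-log4j targets log4j 1.x). Assuming the topology is built with Maven, one common fix is to exclude the log4j 1.x artifact that the Ignite side pulls in; this is a sketch, and the version property is a placeholder:

```xml
<dependency>
  <groupId>org.apache.ignite</groupId>
  <artifactId>ignite-log4j</artifactId>
  <version>${ignite.version}</version>
  <exclusions>
    <!-- drop the log4j 1.x jar so only Storm's log4j 2 jars remain -->
    <exclusion>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Keeping storm-core itself at scope `provided` (so the cluster's own jars are used rather than shaded copies) is the other half of avoiding this class of conflict.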

How can I write a Storm spout that listens to an ActiveMQ topic?

烈酒焚心 submitted on 2019-12-24 03:20:49

Question: I programmed a Storm topology that listens on a particular topic on kafka with its spouts. Now I have to migrate it to activeMQ. Is it possible to reproduce these topics with activeMQ and create spouts that listen to them as I did with kafka? I googled it, but it is not clear how I can send a message to a topic or listen to a particular topic. In kafka I do something like data = new KeyedMessage<>("topic", sms); producer.send(data); to send sms on a topic, and just create a new kafkaspout("topic") to

Running Storm nimbus and supervisor on the same physical node in cluster mode

折月煮酒 submitted on 2019-12-23 21:51:45

Question: I have a storm cluster of 2 physical nodes right now. I'm running storm nimbus on node-1 and storm supervisor on node-2. It looks like all my topologies are running on node-2 (the supervisor node) only. Should I run a supervisor on node-1 as well? Thanks. Answer 1: You could, but I wouldn't recommend it. In Storm's current design, nimbus is a single point of failure (there are plans to address this), but running a supervisor on the same node as nimbus makes it more likely that something bad might

How does Storm know when a message is “fully processed”?

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-23 17:47:15

Question: (Also a couple of questions about timeouts and maxSpoutPending.) I see a lot of references in the Storm documentation to messages being fully processed. But how does my KafkaSpout know when a message is fully processed? Hopefully it is cognizant of the way my bolts are connected, so when the final bolt in my stream acks a tuple, the spout knows my message is processed? Otherwise, I would imagine that after the timeout period expires, the acked state of a message is checked, and it is
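
Storm does not need to know the bolt wiring: the spout's message is "fully processed" when every tuple in its tuple tree has been acked, and Storm's acker tracks this with XOR bookkeeping over random 64-bit tuple ids (each id is XORed in once when the tuple is anchored and once when it is acked; the running value returns to 0 exactly when the tree is complete, at which point the spout's ack(msgId) is called; on timeout, fail(msgId) is called instead). A minimal sketch of that bookkeeping — AckerSketch is a hypothetical name, not Storm's class:

```java
import java.util.Random;

// Sketch of Storm's acker idea: every tuple in the tree gets a random
// 64-bit id. Each id is XORed into ackVal when the tuple is anchored
// (emitted) and again when it is acked, so ackVal returns to 0 exactly
// when every anchored tuple has been acked.
public class AckerSketch {
    private long ackVal = 0L;
    private final Random rng = new Random();

    // A bolt (or the spout) emits an anchored tuple.
    public long anchor() {
        long id = rng.nextLong();
        ackVal ^= id;
        return id;
    }

    // A bolt acks a tuple it finished processing.
    public void ack(long id) {
        ackVal ^= id;
    }

    // True once every anchored tuple has been acked:
    // this is when Storm calls the spout's ack(msgId).
    public boolean fullyProcessed() {
        return ackVal == 0L;
    }
}
```

Because XOR is order-independent, it does not matter in which order the bolts ack; the spout is notified only when the whole tree is done.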

storm topology: one to many (random)

最后都变了- submitted on 2019-12-23 03:16:12

Question: I'm using the KafkaSpout spout to read from all (6) partitions on a kafka topic. The first bolt in the topology has to convert the byte stream into a struct (via an IDL definition), look up a value in a db, and pass these values to a second bolt which writes it all into cassandra. There are several issues occurring: many fail(s) from the kafka spout; the first bolt reports a "capacity" of > 2.0 in the storm ui. I've tried to increase the parallelism but it appears that storm will only accept 1:1
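
The "capacity" number in the Storm UI approximates the fraction of the sampling window a bolt's executors spent inside execute(): executed count times average execute latency, divided by the window length. Anything near or above 1.0 means the bolt cannot keep up (here, > 2.0 strongly suggests the db lookup is the bottleneck) and needs more parallelism or a faster execute path. A sketch of the arithmetic, with a hypothetical class name:

```java
// Storm UI "capacity" approximates the fraction of the sampling window a
// bolt's executors spent in execute():
//   capacity = executed * executeLatencyMs / windowMs
// A value above 1.0 means the bolt cannot keep up with its input rate.
public class CapacitySketch {
    public static double capacity(long executed, double executeLatencyMs, long windowMs) {
        return executed * executeLatencyMs / windowMs;
    }
}
```

For example, 600,000 executions at 1 ms average latency over a 10-minute (600,000 ms) window is a capacity of 1.0: the executor was busy the entire window.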

How does Storm handle fields grouping when you add more nodes?

此生再无相见时 submitted on 2019-12-22 11:33:22

Question: Just reading more details on storm and came across its ability to do fields grouping. So, for example, if you were counting tweets per user and you had two tasks with a fields grouping on user-id, the same user-ids would get sent to the same tasks. So task 1 could have the following counts in memory: bob: 10, alice: 5; task 2 could have: jill: 10, joe: 4. If I added a new machine to the cluster to increase capacity and ran rebalance, what happens to my counts in memory?
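
Fields grouping routes a tuple by hashing the grouped field values modulo the number of target tasks, so the mapping from key to task is stable only for a fixed task count. A simplified sketch of the routing (Storm's real implementation hashes the list of selected field values, but the modulo consequence is the same; the class name is hypothetical):

```java
import java.util.Objects;

// Simplified fields-grouping routing: hash the grouped field modulo the
// number of target tasks. The same key always lands on the same task --
// but only while numTasks stays constant. After a rebalance changes the
// task count, the modulus changes and keys can map to different tasks,
// which is why purely in-memory per-key counts are not safe across a
// rebalance.
public class FieldsGroupingSketch {
    public static int targetTask(Object key, int numTasks) {
        // floorMod keeps the result in [0, numTasks) even for negative hashes
        return Math.floorMod(Objects.hashCode(key), numTasks);
    }
}
```

So after adding a machine and rebalancing, the counts sitting in task memory are not moved or merged for you; a key may start arriving at a different task, leaving stale partial counts behind unless state is kept externally (or via Trident/stateful abstractions).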