apache-storm

Apache Storm: Metrics log file is empty

天涯浪子 posted on 2019-12-21 06:45:45
Question: I am trying to follow the example at https://www.endgame.com/blog/storm-metrics-how. Here is my storm.yaml:
    storm.zookeeper.servers:
      - localhost
    supervisor.slots.ports:
      - 6700
      - 6701
      - 6702
      - 6703
      - 6704
    nimbus.host: localhost
    ui.port: 8080
    ui.host: localhost
    storm.log.dir: /path/to/storm/logdir
    topology.max.spout.pending: 5000
I tried running the topology in local and cluster mode. The metrics.log file is created at /path/to/storm/logdir, but the file is empty! Am I missing some

Move data from Oracle to HDFS, process and move to Teradata from HDFS

扶醉桌前 posted on 2019-12-21 04:41:07
Question: My requirement is to:
1. Move data from Oracle to HDFS
2. Process the data on HDFS
3. Move the processed data to Teradata
This entire process must also run every 15 minutes. The volume of source data may be close to 50 GB, and the processed data may be about the same size. After searching a lot on the internet, I found that I could use ORAOOP to move data from Oracle to HDFS (with the code within a shell script, scheduled to run at the required interval). Do large scale processing either by Custom

How to monitor the size of Bolt's pending queues?

℡╲_俬逩灬. posted on 2019-12-21 02:59:29
Question: My topology has a bottleneck or two. The capacity metric in the Storm UI is useful for identifying these, but I'd be much more interested in the size of each bolt's queues. My understanding is that each bolt has two queues: one for tuples pending execution, and another for tuples pending emission. Is it possible to monitor the size of these queues? I found some material online about adding an ITaskHook implementation to bolts, but it's not remotely clear how I can use this to monitor queue

Storm-Kafka multiple spouts, how to share the load?

巧了我就是萌 posted on 2019-12-20 12:38:09
Question: I am trying to share the work among multiple spouts. I am getting one tuple/message at a time from an external source, and I want to run multiple instances of a spout; the main intention is to share the load and improve performance. I can do this with a single spout, but I want to spread the load across multiple spouts. I have not been able to work out the logic for spreading the load, since the offset of messages will not be known until the particular spout
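One common way to split Kafka input among several spout instances is to give each instance its own subset of partitions, so that every instance tracks offsets only for the partitions it owns. The sketch below shows that idea in plain Java with no Storm or Kafka dependency. The class name PartitionAssignment and the method partitionsForTask are hypothetical, and the round-robin rule is an assumption chosen for illustration, not necessarily what any particular Kafka spout does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of storm-kafka: decides which Kafka
// partitions a single spout task should read.
final class PartitionAssignment {

    // Task `taskIndex` (0-based) out of `totalTasks` takes every
    // totalTasks-th partition, starting at its own index.
    static List<Integer> partitionsForTask(int taskIndex, int totalTasks, int numPartitions) {
        if (totalTasks <= 0 || taskIndex < 0 || taskIndex >= totalTasks) {
            throw new IllegalArgumentException(
                    "invalid task index " + taskIndex + " of " + totalTasks);
        }
        List<Integer> owned = new ArrayList<>();
        for (int p = taskIndex; p < numPartitions; p += totalTasks) {
            owned.add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        int totalTasks = 2;     // number of spout instances (assumed)
        int numPartitions = 5;  // number of topic partitions (assumed)
        for (int task = 0; task < totalTasks; task++) {
            System.out.println("task " + task + " -> "
                    + partitionsForTask(task, totalTasks, numPartitions));
        }
    }
}
```

Inside a spout, the two inputs would presumably come from the TopologyContext passed to open(), for example getThisTaskIndex() for the task index and the size of getComponentTasks(getThisComponentId()) for the task count. Note that with more spout tasks than partitions, the extra tasks receive an empty list and stay idle.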

Import a project into another imported project

半城伤御伤魂 posted on 2019-12-20 07:37:56
Question: I found this statement in the help of a project that I want to import, named "storm-election": "This is a simple demo app based on the storm-starter project. https://github.com/nathanmarz/storm-starter." So, I imported the storm-starter project successfully. How can I import the storm-election project? Can I import a project into another imported project?
Answer 1: You cannot create a project under another project the way you create folder(s) under folder, folder(s) under project, file(s) under project and file

In storm, how to specify specific version of python

大憨熊 posted on 2019-12-20 06:13:40
Question: I'm trying to run a topology in Storm that makes calls to Python (e.g. WordCountTopology), but I encounter errors caused by python3.5.2 being the default Python on my server (the errors are about the old/new syntax of the print command). How can I tell Storm to use python2.7 instead of python3.5? Setting a python alias to python2.7 does not change anything. Any help appreciated.
Answer 1: I guess you're using ShellSpout / ShellBolt. In the constructor you can specify the command to
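The truncated answer points at the constructor of ShellSpout / ShellBolt, which takes the command to run. Here is a minimal sketch of why that helps, in plain Java with no Storm dependency: when a process is launched from an explicit argument list, the first element names the interpreter and no shell is involved, so a shell alias for python is never consulted. The class InterpreterCommand and the script name splitsentence.py are illustrative assumptions; that Storm launches its subprocess in this way is also an assumption, not something the snippet confirms.

```java
import java.util.List;

// Illustrative only: shows how an explicit interpreter name ends up in the
// command that would be launched. "splitsentence.py" is a placeholder name.
final class InterpreterCommand {

    // Builds (but does not start) a process description whose first
    // argument is the interpreter. No shell is used, so aliases do not apply.
    static ProcessBuilder commandFor(String interpreter, String script) {
        return new ProcessBuilder(interpreter, script);
    }

    public static void main(String[] args) {
        List<String> command = commandFor("python2.7", "splitsentence.py").command();
        System.out.println(command); // prints [python2.7, splitsentence.py]
    }
}
```

Under that assumption, a bolt whose constructor calls super("python", "splitsentence.py") would presumably be changed to super("python2.7", "splitsentence.py"), provided python2.7 is on the PATH of the user running the Storm workers.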

IRichBolt Error when running topology on storm-1.0.0 and pyleus-0.3.0

不羁的心 posted on 2019-12-20 05:14:10
Question: I'm running a Storm topology with "pyleus --verbose local xyz_topology.jar" using storm-1.0.0, pyleus-0.3.0, and centos-6.6, and I get this error:
Exception in thread "main" java.lang.NoClassDefFoundError: backtype/storm/topology/IRichBolt
Running: java -client -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/local/apache-storm-1.0.0 -Dstorm.log.dir=/usr/local/apache-storm-1.0.0/logs -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= -cp /usr/local/apache-storm-1.0.0/lib

Storm command fails with NoClassDefFoundError after adding jsoup as provided dependency

≯℡__Kan透↙ posted on 2019-12-19 08:54:27
Question: I'm using JSoup in my project and I've declared the dependency in my POM file. It compiles and runs just fine, but only when I use the jar with all dependencies and set the scope of the dependency to compile. If I change the scope to provided, I can still compile, but I cannot run it: it gives me the ClassNotFoundException. I have included the necessary JAR file in the classpath and also in the path variables, but I'm still facing this problem. I can get

How to integrate Storm and Kafka [closed]

南笙酒味 posted on 2019-12-19 04:29:05
Question: I have worked with Storm and developed a basic program that uses a local text file as its input source. But now I have to work on streaming data coming continuously from external systems. For this purpose, Kafka is the best choice. The problem is how to make my spout get streaming data from Kafka. Or how to

Storm: Spout for reading data from a port

限于喜欢 posted on 2019-12-19 03:08:27
Question: I need to write a Storm spout for reading data from a port, and wanted to know whether that is logically possible. With that in mind, I designed a simple topology for this with one spout and one bolt. The spout would gather HTTP requests sent using wget, and the bolt would display the request, nothing more. My spout structure is as follows:
    public class ProxySpout extends BaseRichSpout {
        // The output collector
        SpoutOutputCollector sc;
        // The socket
        Socket clientSocket;
        // The server socket
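The socket-handling part of such a spout can be sketched in plain Java, independent of Storm. The idea shown here is to accept connections on a background thread and queue each received line, so that a spout's nextTuple() could fetch lines with a non-blocking poll instead of blocking on the socket. This does not reconstruct the truncated ProxySpout above; the names PortReader, poll and await are hypothetical, and handling one client at a time is a simplifying assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: listens on a port and queues every line it receives.
final class PortReader implements AutoCloseable {
    private final BlockingQueue<String> lines = new LinkedBlockingQueue<>();
    private final ServerSocket serverSocket;

    PortReader(int port) throws IOException {
        serverSocket = new ServerSocket(port); // port 0 picks any free port
        Thread acceptor = new Thread(this::acceptLoop, "port-reader");
        acceptor.setDaemon(true);
        acceptor.start();
    }

    int port() {
        return serverSocket.getLocalPort();
    }

    // Non-blocking: returns null when no line has arrived yet.
    String poll() {
        return lines.poll();
    }

    // Waits up to `millis` for a line; returns null on timeout.
    String await(long millis) throws InterruptedException {
        return lines.poll(millis, TimeUnit.MILLISECONDS);
    }

    // Serves one client at a time and queues each line that client sends.
    private void acceptLoop() {
        while (!serverSocket.isClosed()) {
            try (Socket client = serverSocket.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(
                         client.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    lines.add(line);
                }
            } catch (IOException e) {
                // accept() fails once the server socket is closed; the loop
                // condition then ends the thread. Other errors are ignored here.
            }
        }
    }

    @Override
    public void close() throws IOException {
        serverSocket.close();
    }

    public static void main(String[] args) throws Exception {
        try (PortReader reader = new PortReader(0);
             Socket client = new Socket(InetAddress.getLoopbackAddress(), reader.port())) {
            client.getOutputStream().write("GET / HTTP/1.1\r\n".getBytes(StandardCharsets.UTF_8));
            client.getOutputStream().flush();
            System.out.println(reader.await(5000));
        }
    }
}
```

In a spout, open() would create the reader, nextTuple() would call poll() and emit only when the result is non-null, and close() would close the reader. Because this sketch never sends an HTTP response, a client such as wget would wait until its own timeout.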