Storm topology deployment timeout

Submitted by 折月煮酒 on 2020-01-06 04:59:05

Question


I'm trying to set up Apache Storm (1.0.2) on my MacBook Pro, but I run into a timeout whenever I try to deploy a topology. The UI also hangs and shows the same exception.

3491 [main] INFO  o.a.s.StormSubmitter - Generated ZooKeeper secret payload for MD5-digest: -8915636774701640550:-6510752657961785886
3580 [main] INFO  o.a.s.s.a.AuthUtils - Got AutoCreds []
Exception in thread "main" java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Operation timed out (Connection timed out)
    at org.apache.storm.security.auth.TBackoffConnect.retryNext(TBackoffConnect.java:64)
    at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:56)
    at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:99)
    at org.apache.storm.security.auth.ThriftClient.<init>(ThriftClient.java:69)
    at org.apache.storm.utils.NimbusClient.<init>(NimbusClient.java:106)
    at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:78)
    at org.apache.storm.StormSubmitter.topologyNameExists(StormSubmitter.java:371)
    at org.apache.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:233)
    at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:311)
    at org.apache.storm.StormSubmitter.submitTopology(StormSubmitter.java:157)
Caused by: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: Operation timed out (Connection timed out)
    at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:226)
    at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:103)
    at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53)
    ... 9 more
Caused by: java.net.ConnectException: Operation timed out (Connection timed out)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:221)
    ... 12 more

I'm using the default storm.yaml configuration from the GitHub repository, without any changes, and the default zoo.cfg file for ZooKeeper as well.

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=5
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=2
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
clientPortAddress=localhost
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1

I came across similar issues, which prompted me to check my hosts file, posted below:

##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##
255.255.255.255 broadcasthost
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         ip6-localhost ip6-localhost.localdomain localhost6 localhost6.localdomain6

When I start the ZooKeeper server, I believe it starts as usual:

2017-11-27 16:05:14,314 [myid:] - INFO  [main:QuorumPeerConfig@103] - Reading configuration from: /Users/aniket.alhat/Tools/zookeeper/bin/../conf/zoo.cfg
2017-11-27 16:05:14,318 [myid:] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2017-11-27 16:05:14,318 [myid:] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2017-11-27 16:05:14,318 [myid:] - INFO  [main:DatadirCleanupManager@101] - Purge task is not scheduled.
2017-11-27 16:05:14,318 [myid:] - WARN  [main:QuorumPeerMain@113] - Either no config or no quorum defined in config, running  in standalone mode
2017-11-27 16:05:14,329 [myid:] - INFO  [main:QuorumPeerConfig@103] - Reading configuration from: /Users/aniket.alhat/Tools/zookeeper/bin/../conf/zoo.cfg
2017-11-27 16:05:14,330 [myid:] - INFO  [main:ZooKeeperServerMain@95] - Starting server
2017-11-27 16:05:14,335 [myid:] - INFO  [main:Environment@100] - Server environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2017-11-27 16:05:14,336 [myid:] - INFO  [main:Environment@100] - Server environment:host.name=10.9.157.77
2017-11-27 16:05:14,336 [myid:] - INFO  [main:Environment@100] - Server environment:java.version=1.8.0_131
2017-11-27 16:05:14,336 [myid:] - INFO  [main:Environment@100] - Server environment:java.vendor=Oracle Corporation
2017-11-27 16:05:14,336 [myid:] - INFO  [main:Environment@100] - Server environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.8.0_131.jdk/Contents/Home/jre
2017-11-27 16:05:14,336 [myid:] - INFO  [main:Environment@100] - Server environment:java.class.path=/Users/aniket.alhat/Tools/zookeeper/bin/../build/classes:/Users/aniket.alhat/Tools/zookeeper/bin/../build/lib/*.jar:/Users/aniket.alhat/Tools/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/Users/aniket.alhat/Tools/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/Users/aniket.alhat/Tools/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/Users/aniket.alhat/Tools/zookeeper/bin/../lib/log4j-1.2.16.jar:/Users/aniket.alhat/Tools/zookeeper/bin/../lib/jline-0.9.94.jar:/Users/aniket.alhat/Tools/zookeeper/bin/../zookeeper-3.4.6.jar:/Users/aniket.alhat/Tools/zookeeper/bin/../src/java/lib/*.jar:/Users/aniket.alhat/Tools/zookeeper/bin/../conf:
2017-11-27 16:05:14,336 [myid:] - INFO  [main:Environment@100] - Server environment:java.library.path=/Users/aniket.alhat/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
2017-11-27 16:05:14,336 [myid:] - INFO  [main:Environment@100] - Server environment:java.io.tmpdir=/var/folders/9c/g5cj60_j1x344r3zpd_hr99j5jwnk4/T/
2017-11-27 16:05:14,336 [myid:] - INFO  [main:Environment@100] - Server environment:java.compiler=<NA>
2017-11-27 16:05:14,337 [myid:] - INFO  [main:Environment@100] - Server environment:os.name=Mac OS X
2017-11-27 16:05:14,337 [myid:] - INFO  [main:Environment@100] - Server environment:os.arch=x86_64
2017-11-27 16:05:14,338 [myid:] - INFO  [main:Environment@100] - Server environment:os.version=10.12.6
2017-11-27 16:05:14,338 [myid:] - INFO  [main:Environment@100] - Server environment:user.name=aniket.alhat
2017-11-27 16:05:14,338 [myid:] - INFO  [main:Environment@100] - Server environment:user.home=/Users/aniket.alhat
2017-11-27 16:05:14,338 [myid:] - INFO  [main:Environment@100] - Server environment:user.dir=/Users/aniket.alhat/Tools/zookeeper-3.4.6
2017-11-27 16:05:14,344 [myid:] - INFO  [main:ZooKeeperServer@755] - tickTime set to 2000
2017-11-27 16:05:14,344 [myid:] - INFO  [main:ZooKeeperServer@764] - minSessionTimeout set to -1
2017-11-27 16:05:14,344 [myid:] - INFO  [main:ZooKeeperServer@773] - maxSessionTimeout set to -1
2017-11-27 16:05:14,361 [myid:] - INFO  [main:NIOServerCnxnFactory@94] - binding to port localhost/127.0.0.1:2181

I also don't see any errors in the Nimbus log:

2017-11-27 16:05:35.365 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:host.name=10.49.48.134
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.version=1.8.0_131
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.vendor=Oracle Corporation
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.8.0_131.jdk/Contents/Home/jre
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.class.path=/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/asm-5.0.3.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/clojure-1.7.0.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/disruptor-3.3.2.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/kryo-3.0.3.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/log4j-api-2.1.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/log4j-core-2.1.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/log4j-over-slf4j-1.6.6.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/log4j-slf4j-impl-2.1.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/minlog-1.3.0.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/objenesis-2.1.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/reflectasm-1.10.1.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/servlet-api-2.5.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/slf4j-api-1.7.7.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/storm-core-1.0.2.jar:/Users/aniket.alhat/Tools/apache-storm-1.0.2/lib/storm-rename-hack-1.0.2.jar:/Users/aniket.alhat/Tools/storm/conf
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.library.path=/usr/local/lib:/opt/local/lib:/usr/lib
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.io.tmpdir=/var/folders/9c/g5cj60_j1x344r3zpd_hr99j5jwnk4/T/
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:java.compiler=<NA>
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:os.name=Mac OS X
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:os.arch=x86_64
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:os.version=10.12.6
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:user.name=aniket.alhat
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:user.home=/Users/aniket.alhat
2017-11-27 16:05:35.373 o.a.s.s.o.a.z.ZooKeeper [INFO] Client environment:user.dir=/Users/aniket.alhat/Tools/apache-storm-1.0.2
2017-11-27 16:05:35.374 o.a.s.s.o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=localhost:2181/storm sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState@eac3a26
2017-11-27 16:05:35.397 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-11-27 16:05:35.400 o.a.s.b.FileBlobStoreImpl [INFO] Creating new blob store based in storm-local/blobs
2017-11-27 16:05:35.406 o.a.s.d.nimbus [INFO] Using default scheduler
2017-11-27 16:05:35.408 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2017-11-27 16:05:35.409 o.a.s.s.o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=localhost:2181 sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState@68868328
2017-11-27 16:05:35.411 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-11-27 16:05:35.438 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2017-11-27 16:05:35.438 o.a.s.s.o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=localhost:2181 sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState@512d6e60
2017-11-27 16:05:35.440 o.a.s.s.o.a.z.ClientCnxn [INFO] Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-11-27 16:05:35.478 o.a.s.s.o.a.z.ClientCnxn [INFO] Socket connection established to localhost/127.0.0.1:2181, initiating session
2017-11-27 16:05:35.478 o.a.s.s.o.a.z.ClientCnxn [INFO] Socket connection established to localhost/127.0.0.1:2181, initiating session
2017-11-27 16:05:35.479 o.a.s.s.o.a.z.ClientCnxn [INFO] Socket connection established to localhost/127.0.0.1:2181, initiating session
2017-11-27 16:05:35.513 o.a.s.s.o.a.z.ClientCnxn [INFO] Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15ffc4b4d950000, negotiated timeout = 20000
2017-11-27 16:05:35.513 o.a.s.s.o.a.z.ClientCnxn [INFO] Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15ffc4b4d950002, negotiated timeout = 20000
2017-11-27 16:05:35.513 o.a.s.s.o.a.z.ClientCnxn [INFO] Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15ffc4b4d950001, negotiated timeout = 20000
2017-11-27 16:05:35.517 o.a.s.s.o.a.c.f.s.ConnectionStateManager [INFO] State change: CONNECTED
2017-11-27 16:05:35.517 o.a.s.s.o.a.c.f.s.ConnectionStateManager [INFO] State change: CONNECTED
2017-11-27 16:05:35.517 o.a.s.s.o.a.c.f.s.ConnectionStateManager [INFO] State change: CONNECTED
2017-11-27 16:05:35.518 o.a.s.zookeeper [INFO] Zookeeper state update: :connected:none
2017-11-27 16:05:35.518 o.a.s.zookeeper [INFO] Zookeeper state update: :connected:none
2017-11-27 16:05:35.531 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl [INFO] backgroundOperationsLoop exiting
2017-11-27 16:05:35.534 o.a.s.s.o.a.z.ZooKeeper [INFO] Session: 0x15ffc4b4d950002 closed
2017-11-27 16:05:35.534 o.a.s.s.o.a.z.ClientCnxn [INFO] EventThread shut down
2017-11-27 16:05:35.536 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2017-11-27 16:05:35.536 o.a.s.s.o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=localhost:2181/storm sessionTimeout=20000 watcher=org.apache.storm.shade.org.apache.curator.ConnectionState@3722c145

I would really appreciate some help fixing the timeout issue.


Answer 1:


Check whether Nimbus started correctly. I faced a similar issue when a previous Nimbus instance had not been terminated correctly.

Try killing that process and restarting Nimbus.
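
One quick way to tell whether anything is actually accepting connections on the Nimbus Thrift port is a plain TCP probe. The sketch below is only an illustration: the helper name `can_connect` is hypothetical, the host `localhost` is an assumption about where Nimbus runs, and 6627 is Storm's default `nimbus.thrift.port` (the same port that appears in the ZooKeeper entry later in this thread).

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it only tests reachability and
    does not speak the Thrift protocol that Nimbus uses.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: Nimbus is expected on this machine on the default port.
    print(can_connect("localhost", 6627))
```

If this prints False for the address the client is configured to use but True for another local address, the daemon is probably running but reachable under a different address than the one the client tries.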




Answer 2:


After a lot of trial and error, I discovered that my Nimbus process starts with the IP address 10.9.157.77, while ifconfig gives me 10.49.52.97. I'm not sure why or how this happens, and I'd really appreciate it if someone could help me figure it out.

nimbus.log

2017-11-30 16:47:00.342 o.a.s.zookeeper [INFO] 10.9.157.77 gained leadership, checking if it has all the topology code locally.
2017-11-30 16:47:00.350 o.a.s.zookeeper [INFO] active-topology-ids [] local-topology-ids [] diff-topology []
2017-11-30 16:47:00.350 o.a.s.zookeeper [INFO] Accepting leadership, all active topology found localy.
2017-11-30 16:47:00.352 o.a.s.d.m.MetricsUtils [INFO] Using statistics reporter plugin:org.apache.storm.daemon.metrics.reporters.JmxPreparableReporter

ifconfig

en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
    ether xx:xx:xx:xx:xx:xx
    inet6 fe80::427:8998:bb4d:b2bd%en0 prefixlen 64 secured scopeid 0x4
    inet 10.49.52.97 netmask 0xfffffc00 broadcast 10.49.55.255

Continuing on, I found that every time I started the Nimbus process, the IP address 10.9.157.77 was somehow picked up and stored in ZooKeeper as well.

[zk: localhost:2181(CONNECTED) 15] ls /storm/nimbuses
[10.9.157.77:6627]
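
The mismatch between the address stored in ZooKeeper and the address reported by ifconfig can be checked with a few lines of code. This is a minimal sketch, not Storm's own logic: all three function names are hypothetical, and it assumes that Nimbus advertises whatever the machine's hostname resolves to, which is consistent with the logs here but not confirmed in this thread.

```python
import socket


def parse_nimbus_entry(entry):
    """Split a /storm/nimbuses child such as '10.9.157.77:6627' into (host, port)."""
    host, _, port = entry.rpartition(":")
    return host, int(port)


def resolved_hostname_addresses():
    """IPv4 addresses the OS resolver returns for this machine's hostname.

    Assumption: this approximates the address a JVM daemon would advertise.
    Returns an empty set if the hostname does not resolve at all.
    """
    try:
        infos = socket.getaddrinfo(socket.gethostname(), None, socket.AF_INET)
    except socket.gaierror:
        return set()
    return {info[4][0] for info in infos}


def is_advertised_locally(entry, interface_addrs):
    """True if the host part of a ZooKeeper entry is one of the given interface addresses."""
    host, _ = parse_nimbus_entry(entry)
    return host in interface_addrs


if __name__ == "__main__":
    # Values taken from this answer: the znode entry and the en0 address.
    print(is_advertised_locally("10.9.157.77:6627", {"10.49.52.97"}))
    print(resolved_hostname_addresses())
```

With the values from this answer, the first call reports that the advertised address is not the en0 address. If `resolved_hostname_addresses()` returns the unexpected address too, hostname resolution (DNS, DHCP, or a VPN) would be a likely place to look.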

I cleaned the /storm directory with rmr and restarted Nimbus, which created the directory again, but nothing changed.

I also tried flushing the DNS cache with sudo killall -HUP mDNSResponder.

I also observed that the mysterious IP wasn't the same after restarts; it changed to 10.49.48.134:

2017-11-30 17:28:59.630 o.a.s.zookeeper [INFO] 10.49.48.134 gained leadership, checking if it has all the topology code locally.
2017-11-30 17:28:59.646 o.a.s.zookeeper [INFO] active-topology-ids [] local-topology-ids [] diff-topology []
2017-11-30 17:28:59.646 o.a.s.zookeeper [INFO] Accepting leadership, all active topology found localy. 

Later I disconnected from Wi-Fi and started everything again, and I was able to start the Storm UI, run the storm list command, and deploy the topology locally.




Answer 3:


You can add storm.local.hostname= to your storm/conf/storm.yaml and restart. Use an IPv4 address or FQDN rather than IPv6. This worked for me (same Storm 1.0.2).

If there are still problems, you can also add nimbus.seeds= with the Nimbus host.



Source: https://stackoverflow.com/questions/47505756/storm-topology-deployment-timeout
