Question
I'm new to Storm. After submitting a topology, I found the following in the worker log file:
[ERROR] Async loop died!
java.lang.RuntimeException: org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection refused
at backtype.storm.drpc.DRPCInvocationsClient.<init>(DRPCInvocationsClient.java:23)
at backtype.storm.drpc.DRPCSpout.open(DRPCSpout.java:69)
at storm.trident.spout.RichSpoutBatchTriggerer.open(RichSpoutBatchTriggerer.java:41)
at backtype.storm.daemon.executor$fn__3985$fn__3997.invoke(executor.clj:460)
at backtype.storm.util$async_loop$fn__465.invoke(util.clj:375)
at clojure.lang.AFn.run(AFn.java:24)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.thrift7.transport.TTransportException: java.net.ConnectException: Connection refused
Supervisor log file:
supervisor [INFO] ff6460a5-aafb-44a4-a49c-2de945ffd572 still hasn't started
2015-09-15 02:00:54 supervisor [ERROR] Error when processing event
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at com.netflix.curator.ConnectionState.getZooKeeper(ConnectionState.java:72)
at com.netflix.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:74)
at com.netflix.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:353)
at com.netflix.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:149)
at com.netflix.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:138)
at com.netflix.curator.RetryLoop.callWithRetry(RetryLoop.java:85)
at com.netflix.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:134)
at com.netflix.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:125)
at com.netflix.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:34)
at backtype.storm.zookeeper$exists_node_QMARK_.invoke(zookeeper.clj:78)
at backtype.storm.zookeeper$mkdirs.invoke(zookeeper.clj:88)
at backtype.storm.cluster$mk_distributed_cluster_state$reify__1996.set_ephemeral_node(cluster.clj:54)
at backtype.storm.cluster$mk_storm_cluster_state$reify__2415.supervisor_heartbeat_BANG_(cluster.clj:300)
at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28)
This also appears in the supervisor log file:
at java.lang.Thread.run(Unknown Source)
2015-09-15 02:00:54 supervisor [INFO] ff6460a5-aafb-44a4-a49c-2de945ffd572 still hasn't started
2015-09-15 02:00:55 ClientCnxn [INFO] Client session timed out, have not heard from server in 20020ms for sessionid 0x14fce3996380015, closing socket connection and attempting reconnect
2015-09-15 02:00:58 ClientCnxn [INFO] Opening socket connection to server localhost/127.0.0.1:2181
2015-09-15 02:00:58 ClientCnxn [INFO] Socket connection established to localhost/127.0.0.1:2181, initiating session
2015-09-15 02:00:59 supervisor [INFO] ff6460a5-aafb-44a4-a49c-2de945ffd572 still hasn't started
2015-09-15 02:01:01 supervisor [INFO] ff6460a5-aafb-44a4-a49c-2de945ffd572 still hasn't started
2015-09-15 02:00:59 util [INFO] Halting process: ("Error when processing an event")
Answer 1:
There are several possible causes for this issue:
- ZooKeeper is not running.
- CPU usage peaked for a while, so no heartbeat was sent within the timeout. Nimbus then considers the supervisor dead and drops the connection.
- The worker timeout is too short. The default may be about 10 seconds; try raising it to 600 or more. This is closely related to the previous point.
- Nimbus is not working properly. Make sure it is running.
- worker.childopts is incorrect, meaning the JVM memory settings are wrong. Adjust -Xmx and -XX:MaxPermSize and try again.
- If you start Storm through WinRM or PowerShell, the default memory may be insufficient, since the default is only 1024 MB. Try a larger value such as 2048 MB.
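Before tuning timeouts or memory, it may help to confirm that the services behind the "Connection refused" and "ConnectionLoss" errors are actually listening. The following is a minimal sketch, not part of the original answer. The host and ports are assumptions: 2181 is the ZooKeeper address shown in the supervisor log, while 6627 (Nimbus Thrift) and 3773 (DRPC invocations) are the usual Storm defaults and may differ in your storm.yaml. The helper names `can_connect` and `check_endpoints` are hypothetical.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unresolvable hosts.
        return False


# Assumed endpoints; adjust to match storm.zookeeper.servers, nimbus.host
# and drpc.servers in your own storm.yaml.
ENDPOINTS = {
    "ZooKeeper": ("localhost", 2181),
    "Nimbus": ("localhost", 6627),
    "DRPC invocations": ("localhost", 3773),
}


def check_endpoints(endpoints):
    """Map each service name to whether its TCP port accepts connections."""
    return {name: can_connect(host, port) for name, (host, port) in endpoints.items()}


if __name__ == "__main__":
    for name, ok in check_endpoints(ENDPOINTS).items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

If the DRPC invocations port is not reachable, the DRPCSpout in the worker cannot open its client, which would be consistent with the worker stack trace above. A successful TCP connection only shows that something is listening on the port, not that the service is healthy.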
Source: https://stackoverflow.com/questions/32612810/exception-after-submitting-topology