Elasticsearch connector works in IDE but not on local cluster

泪湿孤枕 提交于 2019-12-11 13:36:56

问题


I am trying to write a Twitter stream into an Elasticsearch 2.3 index using the provided Elasticsearch2 connector

Running my job in IntelliJ works fine but when I run that jar job on a local cluster I get the following error:

05/09/2016 13:26:58 Job execution switched to status RUNNING.
05/09/2016 13:26:58 Source: Custom Source -> (Sink: Unnamed, Sink: Unnamed, Sink: Unnamed)(1/1) switched to SCHEDULED 
05/09/2016 13:26:58 Source: Custom Source -> (Sink: Unnamed, Sink: Unnamed, Sink: Unnamed)(1/1) switched to DEPLOYING 
05/09/2016 13:26:58 Source: Custom Source -> (Sink: Unnamed, Sink: Unnamed, Sink: Unnamed)(1/1) switched to RUNNING 
05/09/2016 13:26:59 Source: Custom Source -> (Sink: Unnamed, Sink: Unnamed, Sink: Unnamed)(1/1) switched to FAILED 
java.lang.RuntimeException: Client is not connected to any Elasticsearch nodes!
    at org.apache.flink.streaming.connectors.elasticsearch2.ElasticsearchSink.open(ElasticsearchSink.java:172)
    at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:38)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:91)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:317)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:215)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:579)
    at java.lang.Thread.run(Thread.java:745)

05/09/2016 13:26:59 Job execution switched to status FAILING.
java.lang.RuntimeException: Client is not connected to any Elasticsearch nodes!
    at org.apache.flink.streaming.connectors.elasticsearch2.ElasticsearchSink.open(ElasticsearchSink.java:172)
    at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:38)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:91)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:317)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:215)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:579)
    at java.lang.Thread.run(Thread.java:745)
05/09/2016 13:26:59 Job execution switched to status FAILED.

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed.
    at org.apache.flink.client.program.Client.runBlocking(Client.java:381)
    at org.apache.flink.client.program.Client.runBlocking(Client.java:355)
    at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:65)
    at org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:541)
    at com.pl.greeny.flink.TwitterAnalysis$.main(TwitterAnalysis.scala:69)
    at com.pl.greeny.flink.TwitterAnalysis.main(TwitterAnalysis.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:505)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:403)
    at org.apache.flink.client.program.Client.runBlocking(Client.java:248)
    at org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:860)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:327)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1187)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1238)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:807)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:753)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:753)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.RuntimeException: Client is not connected to any Elasticsearch nodes!
    at org.apache.flink.streaming.connectors.elasticsearch2.ElasticsearchSink.open(ElasticsearchSink.java:172)
    at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:38)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:91)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:317)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:215)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:579)
    at java.lang.Thread.run(Thread.java:745)

My code in scala:

val config = new java.util.HashMap[String, String]
      config.put("bulk.flush.max.actions", "1")
      config.put("cluster.name", "elasticsearch")
      config.put("node.name", "node-1")

      config.put("path.home", "/media/user/e5e05ab5-28f3-4cee-a57c-444e32b99f04/thesis/elasticsearch-2.3.2/bin")
      val transports = new util.ArrayList[InetSocketAddress]
      transports.add(new InetSocketAddress(InetAddress.getLocalHost(),9300))
    transports.add(new InetSocketAddress(InetAddress.getLoopbackAddress(),9300))
    transports.add(new InetSocketAddress(InetAddress.getByName("127.0.0.1"),9300))
    transports.add(new InetSocketAddress(InetAddress.getByName("localhost"),9300))
    stream.addSink(new ElasticsearchSink(config, transports, new ElasticSearchSinkTwitter()))

What is the difference between running that program from an IDE and the local cluster?


回答1:


Problems like this are often caused by the different ways that dependencies are managed / included by IDEs (IntelliJ, Eclipse) and Flink's job submission via fat jars.

I had the same problem the other day and the task manager log file revealed the following root cause:

java.lang.IllegalArgumentException: An SPI class of type org.apache.lucene.codecs.PostingsFormat with name 'Lucene50' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: [es090, completion090, XBloomFilter]

Searching for the error I found this answer on SO that solved the issue:

https://stackoverflow.com/a/38354027/3609571

by adding the following dependency to my pom.xml:

<dependency>
    <groupId>org.apache.lucene</groupId>
    <artifactId>lucene-core</artifactId>
    <version>5.4.1</version>
 </dependency>

Note, the order of dependencies matters in this case. It only worked when putting the lucene-core dependency on top. Adding it to the end did not work for me. So this more a "hack" than a proper fix.



来源:https://stackoverflow.com/questions/37114886/elasticsearch-connector-works-in-ide-but-not-on-local-cluster

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!