how to handle execution timeout in flink

我的梦境 提交于 2019-12-08 07:24:27

问题


        Connected to JobManager at Actor[akka.tcp://flink@localhost:6123/user/jobmanager#-1119198862] with leader session id 00000000-0000-0000-0000-000000000000.
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Couldn't retrieve the JobExecutionResult from the JobManager.
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:478)
    at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:105)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:442)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:429)
    at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
    at FileSetWordCount.main(FileSetWordCount.java:178)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
    at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
    at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
    at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
    at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Couldn't retrieve the JobExecutionResult from the JobManager.
    at org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:309)
    at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
    ... 23 more
Caused by: org.apache.flink.runtime.client.JobClientActorSubmissionTimeoutException: Job submission to the JobManager timed out. You may increase 'akka.client.timeout' in case the JobManager needs more time to configure and confirm the job submission.
    at org.apache.flink.runtime.client.JobSubmissionClientActor.handleCustomMessage(JobSubmissionClientActor.java:119)
    at org.apache.flink.runtime.client.JobClientActor.handleMessage(JobClientActor.java:251)
    at org.apache.flink.runtime.akka.FlinkUntypedActor.handleLeaderSessionID(FlinkUntypedActor.java:89)
    at org.apache.flink.runtime.akka.FlinkUntypedActor.onReceive(FlinkUntypedActor.java:68)
    at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
    at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
    at akka.dispatch.Mailbox.run(Mailbox.scala:220)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
No JobSubmissionResult returned, please make sure you called ExecutionEnvironment.execute()

This is an exception threw by Flink. How to handle the client timeout exception? This Flink application is going to run in the local envirionment. The application is used for about 1TB files processing.


回答1:


I have figure it out. It is necessary to modify the 'flink-conf.yaml' file with adding a new line "akka.client.timeout: xx s"(e.g. "akka.client.timeout: 600 s").



来源:https://stackoverflow.com/questions/47283186/how-to-handle-execution-timeout-in-flink

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!