apache-flink

How to handle execution timeout in Flink

我的梦境 submitted on 2019-12-08 07:24:27
Question: Job submission fails after connecting to the JobManager:

    Connected to JobManager at Actor[akka.tcp://flink@localhost:6123/user/jobmanager#-1119198862] with leader session id 00000000-0000-0000-0000-000000000000.
    org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Couldn't retrieve the JobExecutionResult from the JobManager.
        at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:478)
        at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:105)
        at org
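A common remedy for "Couldn't retrieve the JobExecutionResult" is to raise the client-side Akka timeouts, since the client otherwise gives up waiting for the JobManager's answer on a slow or busy cluster. A minimal sketch for a Flink 1.x local environment; the keys come from the Flink 1.x configuration docs, and the 600-second values are illustrative, not a recommendation:

    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    Configuration conf = new Configuration();
    // Give the client more time to wait for the JobExecutionResult.
    conf.setString("akka.client.timeout", "600 s");
    conf.setString("akka.ask.timeout", "600 s");

    StreamExecutionEnvironment env =
        StreamExecutionEnvironment.createLocalEnvironment(1, conf);

On a standalone cluster the same keys can be set in conf/flink-conf.yaml instead.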

Simple Scala API for CEP example doesn't show any output

旧街凉风 submitted on 2019-12-08 05:25:05
Question: I'm writing a simple example to test the new Scala API for CEP in Flink, using the latest GitHub version of 1.1-SNAPSHOT. The pattern is only a check on one value, and it outputs a single String for each matched pattern. The code is as follows:

    val pattern : Pattern[(String, Long, Int), _] = Pattern.begin("start").where(_._3 < 4)
    val cepEventAlert = CEP.pattern(streamingAlert, pattern)
    def selectFn(pattern : mutable.Map[String, (String, Long, Int)]): String = {
      val startEvent =
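When a CEP job produces no output, the usual suspects are a missing env.execute() call at the end of the program, or running on event time without assigning timestamps and watermarks, so the pattern never fires. For reference, a minimal Java sketch of the same pattern against the later stable CEP API (Flink 1.4+); streamingAlert and the tuple layout are taken from the question, everything else is an assumption:

    import java.util.List;
    import java.util.Map;

    import org.apache.flink.api.java.tuple.Tuple3;
    import org.apache.flink.cep.CEP;
    import org.apache.flink.cep.PatternSelectFunction;
    import org.apache.flink.cep.PatternStream;
    import org.apache.flink.cep.pattern.Pattern;
    import org.apache.flink.cep.pattern.conditions.SimpleCondition;
    import org.apache.flink.streaming.api.datastream.DataStream;

    Pattern<Tuple3<String, Long, Integer>, ?> pattern =
        Pattern.<Tuple3<String, Long, Integer>>begin("start")
            .where(new SimpleCondition<Tuple3<String, Long, Integer>>() {
                @Override
                public boolean filter(Tuple3<String, Long, Integer> event) {
                    return event.f2 < 4; // same check as _._3 < 4 in the Scala version
                }
            });

    PatternStream<Tuple3<String, Long, Integer>> cepEventAlert =
        CEP.pattern(streamingAlert, pattern);

    DataStream<String> alerts = cepEventAlert.select(
        new PatternSelectFunction<Tuple3<String, Long, Integer>, String>() {
            @Override
            public String select(Map<String, List<Tuple3<String, Long, Integer>>> match) {
                return "matched: " + match.get("start").get(0);
            }
        });

    alerts.print();
    env.execute("cep-example"); // easy to forget; without it the pipeline never runs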

Apache Flink Using Windows to induce a delay before writing to Sink

廉价感情. submitted on 2019-12-08 05:13:17
Question: I am wondering whether it is possible, with Flink windowing, to induce a 10-minute delay from when data enters the pipeline until it is written to a table in Cassandra. My initial intention was to write each transaction to a Cassandra table and query it using a range key at the web layer, but due to the volume of data I am looking at options to delay the write by N seconds. This means my table will only ever contain data that is at least 10 minutes old. The small diagram below shows 10
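One way to get this behavior is to buffer records in a processing-time window and only emit them when the window fires, so nothing reaches the sink until the window closes. A sketch against the Flink 1.x-era API, under assumptions: Transaction, getAccountId, and cassandraSink are hypothetical stand-ins for the question's types. Note the caveat that the per-record delay ranges from 0 to 10 minutes (a record arriving just before the window closes is barely delayed); an exact per-record delay would need a ProcessFunction with timers instead.

    import org.apache.flink.api.java.functions.KeySelector;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
    import org.apache.flink.util.Collector;

    DataStream<Transaction> delayed = transactions
        .keyBy(new KeySelector<Transaction, String>() {
            @Override
            public String getKey(Transaction t) {
                return t.getAccountId(); // hypothetical key field
            }
        })
        .timeWindow(Time.minutes(10)) // processing-time windows when no event time is configured
        .apply(new WindowFunction<Transaction, Transaction, String, TimeWindow>() {
            @Override
            public void apply(String key, TimeWindow window,
                              Iterable<Transaction> input, Collector<Transaction> out) {
                // Emit everything buffered in this window only when it fires.
                for (Transaction t : input) {
                    out.collect(t);
                }
            }
        });

    delayed.addSink(cassandraSink); // hypothetical sink instance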

AWS SDK conflicts in Apache Flink environment

若如初见. submitted on 2019-12-08 04:03:44
Question: I'm trying to deploy my job to a Flink environment and always get this error:

    java.lang.NoSuchMethodError: com.amazonaws.AmazonWebServiceRequest.putCustomQueryParameter(Ljava/lang/String;Ljava/lang/String;)

I've tried including and excluding aws-sdk in my jar, but it didn't help. Does anyone know how to resolve these conflicts?

Answer 1: Apache Flink loads many classes into its classpath by default, so your problem is simply a version conflict. Please read the last section of this article: https://ci
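A NoSuchMethodError like this usually means an older aws-sdk class somewhere on Flink's classpath shadows the one bundled in the job jar. Before shading or relocating anything, it can help to confirm which jar actually wins at runtime; a small diagnostic sketch (the class name comes from the error above, the rest is an assumption):

    // Print the jar from which the conflicting class was actually loaded,
    // e.g. at the start of main() in the job.
    Class<?> clazz = com.amazonaws.AmazonWebServiceRequest.class;
    System.out.println(clazz.getProtectionDomain().getCodeSource().getLocation());

If it points at a jar in Flink's lib directory rather than your fat jar, relocating the com.amazonaws packages with the Maven Shade plugin (or removing the stale jar from lib) is the usual fix.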

Apache Flink Streaming window WordCount

扶醉桌前 submitted on 2019-12-08 03:30:54
Question: I have the following code to count words from socketTextStream. Both cumulative word counts and time-windowed word counts are needed. The program has an issue: cumulateCounts is always the same as the windowed counts. Why does this happen, and what is the correct way to compute cumulative counts based on the windowed counts?

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    final HashMap<String, Integer> cumulateCounts = new HashMap<String, Integer>();
    final DataStream
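A shared HashMap captured in a closure is a poor accumulator here: Flink serializes the function (and the map) into each parallel task, so every task mutates its own private copy and nothing is checkpointed. A sketch of the usual pattern: keep the windowed counts as one stream and derive cumulative counts from it with a second keyed running sum, letting Flink manage the totals as state (the host, port, and window size are illustrative):

    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.util.Collector;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    DataStream<Tuple2<String, Integer>> windowCounts = env
        .socketTextStream("localhost", 9999)
        .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
                for (String word : line.split("\\s+")) {
                    out.collect(new Tuple2<>(word, 1));
                }
            }
        })
        .keyBy(0)
        .timeWindow(Time.seconds(5))
        .sum(1);

    // Cumulative counts: a second keyed running sum over the windowed results.
    DataStream<Tuple2<String, Integer>> cumulativeCounts = windowCounts
        .keyBy(0)
        .sum(1);

    windowCounts.print();
    cumulativeCounts.print();
    env.execute("windowed-and-cumulative-wordcount");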

Apache Flink - how to send and consume POJOs using AWS Kinesis

那年仲夏 submitted on 2019-12-08 02:51:51
Question: I want to consume POJOs arriving from Kinesis with Flink. Is there a standard way to correctly send and deserialize the messages? Thanks

Answer 1: I resolved it with:

    DataStream<SamplePojo> kinesis = see.addSource(new FlinkKinesisConsumer<>(
        "my-stream",
        new POJODeserializationSchema(),
        kinesisConsumerConfig));

and

    public class POJODeserializationSchema extends AbstractDeserializationSchema<SamplePojo> {
        private ObjectMapper mapper;
        @Override
        public SamplePojo deserialize(byte[] message)
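The answer's schema is cut off above; one plausible way to finish it, assuming the messages are JSON and SamplePojo is a Jackson-mappable bean (the import path of AbstractDeserializationSchema varies across Flink versions; the one below matches 1.4+, and the lazy ObjectMapper initialization avoids shipping the mapper with the serialized function):

    import java.io.IOException;

    import com.fasterxml.jackson.databind.ObjectMapper;
    import org.apache.flink.api.common.serialization.AbstractDeserializationSchema;

    public class POJODeserializationSchema extends AbstractDeserializationSchema<SamplePojo> {
        private transient ObjectMapper mapper;

        @Override
        public SamplePojo deserialize(byte[] message) throws IOException {
            if (mapper == null) {
                mapper = new ObjectMapper(); // created per task, after the function is deserialized
            }
            return mapper.readValue(message, SamplePojo.class);
        }
    }

On the producing side the mirror image would be mapper.writeValueAsBytes(pojo) before putting the record onto the stream.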

Flink: How to read from multiple Kafka clusters using the same StreamExecutionEnvironment

半世苍凉 submitted on 2019-12-08 02:46:20
Question: I want to read data from multiple Kafka clusters in Flink, but kafkaMessageStream ends up reading only from the first cluster. I can read from both clusters only if I create two separate streams, which is not what I want. Is it possible to attach multiple sources to a single reader? Sample code:

    public class KafkaReader<T> implements Reader<T> {
        private StreamExecutionEnvironment executionEnvironment;
        public StreamExecutionEnvironment
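A single Kafka source only ever talks to the one cluster named in its bootstrap.servers property, so reading from two clusters means two sources; they can still feed one logical stream by unioning them, which is usually what "multiple sources attached to a single reader" amounts to in practice. A sketch, where env is the question's StreamExecutionEnvironment and the cluster addresses, topic names, group ids, and 0.10 connector version are assumptions:

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

    Properties propsA = new Properties();
    propsA.setProperty("bootstrap.servers", "cluster-a:9092");
    propsA.setProperty("group.id", "reader-a");

    Properties propsB = new Properties();
    propsB.setProperty("bootstrap.servers", "cluster-b:9092");
    propsB.setProperty("group.id", "reader-b");

    DataStream<String> streamA = env.addSource(
        new FlinkKafkaConsumer010<>("topic-a", new SimpleStringSchema(), propsA));
    DataStream<String> streamB = env.addSource(
        new FlinkKafkaConsumer010<>("topic-b", new SimpleStringSchema(), propsB));

    // One downstream pipeline over both clusters.
    DataStream<String> kafkaMessageStream = streamA.union(streamB);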

Python + Beam + Flink

匆匆过客 submitted on 2019-12-08 00:40:47
Question: I've been trying to get the Apache Beam Portability Framework working with Python and Apache Flink, and I can't seem to find a complete set of instructions for getting the environment running. Are there any references with a complete list of prerequisites and steps to get a simple Python pipeline working?

Answer 1: Overall, for the local portable runner (ULR), see the wiki; quoting from there:

Run a Python-SDK pipeline:

    Compile the container as a local build: ./gradlew :beam-sdks-python-container:docker
    Start ULR

Flink program cannot submit when I follow Flink 1.4's quickstart and use "./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000"

拥有回忆 submitted on 2019-12-07 19:26:51
Question: Flink 1.4 quickstart address: https://ci.apache.org/projects/flink/flink-docs-release-1.4/quickstart/setup_quickstart.html. Following the quickstart, I start Flink with "./bin/start-local.sh", check http://localhost:8081/ to make sure everything is running, and then submit the jar with "./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000". I get the following output and the submission fails:

    ---------------------------------------------------------
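The error output is cut off above, but with this exact example a very common cause is that nothing is listening on port 9000 yet: the quickstart starts "nc -l 9000" before submitting the job, and without it the socket source cannot connect. If netcat is unavailable, a hypothetical stand-in that listens on the port and pumps stdin lines to the connected job:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.ServerSocket;
    import java.net.Socket;

    // Hypothetical replacement for `nc -l 9000`: accept one connection
    // and forward every stdin line to it.
    public class LinePump {
        public static void main(String[] args) throws IOException {
            try (ServerSocket server = new ServerSocket(9000);
                 Socket client = server.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                String line;
                while ((line = in.readLine()) != null) {
                    out.println(line);
                }
            }
        }
    }

Start it first, then run the ./bin/flink run command from the question.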
