apache-flink

Flink streaming: how to control the execution time

Submitted by 本小妞迷上赌 on 2020-01-06 05:55:40
Question: Spark Streaming provides an API for termination, awaitTermination(). Is there any similar API available to gracefully shut down a Flink streaming job after some t seconds?

Answer 1: Your driver program (i.e. the main method) in Flink doesn't stay running while the streaming job executes. Your program should define a dataflow, call execute, and then terminate. In Spark, the driver program stays running (AFAIK), and awaitTermination relates to that. Note that a Flink streaming dataflow continues to execute until the job is cancelled or its sources finish.
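
Building on that: a Flink streaming job finishes on its own once every source returns from its run() method, at which point execute() returns in the driver. A minimal sketch of a source that stops after a fixed time, letting the whole job shut down gracefully (class name, emit rate, and runtime are illustrative assumptions, not from the original answer):

    import org.apache.flink.streaming.api.functions.source.SourceFunction
    import org.apache.flink.streaming.api.scala._

    // Hypothetical bounded source: emits counters until a wall-clock
    // deadline passes, then returns, which ends the streaming job.
    class TimedSource(runtimeMillis: Long) extends SourceFunction[Long] {
      @volatile private var running = true

      override def run(ctx: SourceFunction.SourceContext[Long]): Unit = {
        val deadline = System.currentTimeMillis() + runtimeMillis
        var counter = 0L
        while (running && System.currentTimeMillis() < deadline) {
          ctx.collect(counter)
          counter += 1
          Thread.sleep(100) // illustrative emit rate
        }
      }

      override def cancel(): Unit = { running = false }
    }

    object TimedJob {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        env.addSource(new TimedSource(60 * 1000L)).print() // run for ~60 seconds
        env.execute("timed-streaming-job") // returns once the source finishes
      }
    }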

Apache Beam with Flink backend throws NoSuchMethodError on calls to protobuf-java library methods

Submitted by 纵饮孤独 on 2020-01-06 03:41:07
Question: I'm trying to run a simple pipeline on a local cluster, using Protocol Buffers to pass data between Beam functions. com.google.protobuf:protobuf-java is included in the fat JAR. Everything works fine if I run it with:

    java -jar target/dataflow-test-1.0-SNAPSHOT.jar \
      --runner=org.apache.beam.runners.flink.FlinkRunner \
      --input=/tmp/kinglear.txt --output=/tmp/wordcounts.txt

But it fails when I try to run it on a Flink cluster:

    flink run target/dataflow-test-1.0-SNAPSHOT.jar \
      --runner=org.apache.beam.runners.flink.FlinkRunner …

Flink 1.7.0 Dashboard does not show Task Statistics

Submitted by 两盒软妹~` on 2020-01-05 07:18:09
Question: I use the Flink 1.7 dashboard and select a streaming job. This should show me some metrics, but it just keeps loading. I deployed the same job on a Flink 1.5 cluster, and there I can see the metrics. Flink is running in Docker Swarm; if I run Flink 1.7 with docker-compose (not in the swarm), it works. I can make it work by deleting the hostname entry in the docker-compose.yaml file:

    version: "3"
    services:
      jobmanager17:
        image: flink:1.7.0-hadoop27-scala_2.11
        hostname: "{{.Node.Hostname}}"
        ports:
          - "8081:8081"
          - "9254…

Flink X-Pack Elasticsearch 5 ElasticsearchSecurityException: missing authentication

Submitted by 天涯浪子 on 2020-01-05 05:37:08
Question: Good morning, everyone. I am trying to use the Flink Elasticsearch connector with Elasticsearch 5.2.1, and I have problems with X-Pack authentication.

Answer 1: Using a different transport client is currently (March 2017, Flink 1.2) not supported in Flink. However, I've filed a JIRA to add the feature: FLINK-6065 (Make TransportClient for ES5 pluggable). Until this has been implemented in Flink, I recommend overriding the ElasticsearchSink and using a different call bridge that creates the PreBuiltXPackTransportClient.

Source: https…
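
The sink/call-bridge override itself depends on Flink connector internals, but the client the answer points to can be sketched: constructing an X-Pack-aware TransportClient for Elasticsearch 5.x (cluster name, credentials, host, and port are placeholder assumptions; the class comes from the x-pack-transport artifact):

    import java.net.InetAddress

    import org.elasticsearch.common.settings.Settings
    import org.elasticsearch.common.transport.InetSocketTransportAddress
    import org.elasticsearch.xpack.client.PreBuiltXPackTransportClient

    // Hypothetical sketch: an authenticated ES 5.x transport client that a
    // custom call bridge could return instead of the default client.
    val settings = Settings.builder()
      .put("cluster.name", "my-cluster")              // placeholder cluster name
      .put("xpack.security.user", "elastic:changeme") // placeholder credentials
      .build()

    val client = new PreBuiltXPackTransportClient(settings)
      .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("es-host"), 9300))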

How to specify a log file different from the daemon log file when submitting a Flink job to a standalone Flink cluster

Submitted by 十年热恋 on 2020-01-05 05:10:28
Question: When I start a Flink standalone cluster, it writes the daemon logs to the file configured in conf/log4j.properties, and when I submit a Flink job to that cluster, it uses the same properties file and writes the application logs into the same log file on the task managers. I want a separate log file for each application submitted to that standalone cluster. Is there any way to achieve that?

Answer 1: When you submit the job using the ./bin/flink shell script, use the following environment…
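
The answer is cut off before it names the environment variable, so the following is an illustration only, not a reconstruction of the original advice: Flink's launch scripts read FLINK_CONF_DIR (which conf directory, and thus which log4j.properties, to use) and FLINK_LOG_DIR (where to write logs), so pointing a submission at a per-application configuration is one way to separate the client-side logs:

    # Hypothetical sketch: a per-application conf directory containing its
    # own log4j.properties, used only for this submission.
    export FLINK_CONF_DIR=/path/to/myapp-conf
    ./bin/flink run myapp.jar

Note this affects the submitting client; redirecting per-job output on the task managers themselves would still require adjusting the task managers' own log4j configuration.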

Can anyone share a Flink Kafka example in Scala?

Submitted by 谁说胖子不能爱 on 2020-01-03 16:47:09
Question: Can anyone share a working example of Flink with Kafka (mainly receiving messages from Kafka) in Scala? I know there is a KafkaWordCount example in Spark. I just need to print out the Kafka messages in Flink. It would be really helpful.

Answer 1: The following code shows how to read from a Kafka topic using Flink's Scala DataStream API:

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer082
    import org.apache.flink.streaming.util.serialization.SimpleStringSchema
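
The answer's code is cut off after the imports. A minimal runnable sketch consistent with them (broker, ZooKeeper address, group id, topic, and job name are placeholders; FlinkKafkaConsumer082 is the Kafka 0.8.2 consumer from the era of this answer):

    import java.util.Properties

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer082
    import org.apache.flink.streaming.util.serialization.SimpleStringSchema

    object KafkaPrint {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        val properties = new Properties()
        properties.setProperty("bootstrap.servers", "localhost:9092") // placeholder broker
        properties.setProperty("zookeeper.connect", "localhost:2181") // needed by the 0.8.x consumer
        properties.setProperty("group.id", "flink-test")              // placeholder group id

        // Read strings from the topic and print each message to stdout.
        env
          .addSource(new FlinkKafkaConsumer082[String]("my-topic", new SimpleStringSchema(), properties))
          .print()

        env.execute("print-kafka-messages")
      }
    }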

Apache Flink DataStream API doesn't have a mapPartition transformation

Submitted by 允我心安 on 2020-01-03 08:50:15
Question: Spark's DStream has a mapPartition API, while Flink's DataStream API doesn't. Could anyone help explain the reason? What I want to do is implement an API similar to Spark's reduceByKey on Flink.

Answer 1: Flink's stream processing model is quite different from Spark Streaming, which is centered around mini-batches. In Spark Streaming, each mini-batch is executed like a regular batch program on a finite set of data, whereas Flink DataStream programs continuously process records. In Flink's…
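
The answer is cut off, but the reduceByKey part of the question can be sketched: on an unbounded stream, a per-key reduction has to be scoped to something finite, typically a window. A minimal sketch (the element type, 5-second window, and job name are illustrative):

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.windowing.time.Time

    object ReduceByKeySketch {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        env
          .fromElements(("a", 1), ("b", 2), ("a", 3))
          .keyBy(_._1)                           // group by key, like reduceByKey
          .timeWindow(Time.seconds(5))           // bound the infinite stream
          .reduce((x, y) => (x._1, x._2 + y._2)) // per-key, per-window sum
          .print()

        env.execute("reduce-by-key-sketch")
      }
    }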

What happens to state in a Flink Task Manager when it crashes?

Submitted by 懵懂的女人 on 2020-01-03 05:23:13
Question: May I know what happens to the state stored in a Flink task manager when that task manager crashes? Say the state storage is RocksDB; would that data be transferred to the other running task managers so that the complete state is available for data processing?

Answer 1: Flink does not (yet) support dynamic rescaling of state, so the failed task manager must be recovered, and the job will be restarted from a checkpoint. Exactly what that involves depends on how your cluster is configured, and whether the job failed…
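
Recovery after a task manager crash therefore presupposes checkpoints written to durable storage that outlives any single machine. A minimal configuration sketch (the 10-second interval and HDFS path are placeholders; RocksDBStateBackend comes from the flink-statebackend-rocksdb dependency):

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend
    import org.apache.flink.streaming.api.scala._

    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Snapshot state periodically so a restarted job can resume from the
    // latest completed checkpoint after a task manager failure.
    env.enableCheckpointing(10000L)

    // Working state lives in local RocksDB instances; checkpoints go to
    // durable storage that survives the loss of the machine.
    env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"))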

Flink: possible to delete Queryable state after X time?

Submitted by 廉价感情. on 2020-01-03 02:56:07
Question: In my case, I use Flink's queryable state only. In particular, I do not care about checkpoints. Upon an event, I query the queryable state only after a maximum of X minutes. Ideally, I would delete the "old" state to save space. That's why I wonder: can I tell Flink's state to clear itself after some time? Through configuration? Through specific event signals? How?

Answer 1: One way to clear state is to explicitly call clear() on the state object (e.g., a ValueState object) when you no longer…
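
The answer is truncated; besides calling clear() by hand (for example from a registered timer), Flink's state TTL, a separate feature available since Flink 1.6, can expire state automatically after a fixed time. A minimal sketch (the state name, String type, and 10-minute TTL are illustrative):

    import org.apache.flink.api.common.state.{StateTtlConfig, ValueStateDescriptor}
    import org.apache.flink.api.common.time.Time

    // Expire entries 10 minutes after they were created or last written;
    // expired values are never returned, even before they are cleaned up.
    val ttlConfig = StateTtlConfig
      .newBuilder(Time.minutes(10))
      .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
      .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
      .build()

    val descriptor = new ValueStateDescriptor[String]("my-state", classOf[String])
    descriptor.enableTimeToLive(ttlConfig)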