flink-streaming

Flink KafkaProducer sends duplicate messages in exactly-once mode when restoring from a checkpoint

喜夏-厌秋 posted on 2021-02-08 07:27:17
Question: I am writing a test case for Flink's two-phase commit; an overview follows. "sink kafka" is an exactly-once Kafka producer. "sink step" is a MySQL sink that extends the two-phase commit sink. "sink compare" is also a MySQL sink that extends the two-phase commit sink, and it occasionally throws an exception to simulate a failed checkpoint. When a checkpoint fails and the job restores, the MySQL two-phase commit works fine, but the Kafka consumer reads from the offset of the last successful checkpoint and the Kafka producer produces messages again, even though it had already produced them before

Apache Flink Dashboard not showing metrics

喜夏-厌秋 posted on 2021-02-07 21:50:02
Question: I have the following very simple Apache Flink pipeline, for which I would like to get some metrics via the Apache Flink Dashboard, as explained in the Apache Flink documentation:

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
    public

Apache Flink CEP: how to detect if an event did not occur within x seconds?

扶醉桌前 posted on 2021-02-07 09:39:40
Question: For example, A should be followed by B within 10 seconds. I know how to track whether this DID occur (.next, .within), but I want to send an alert if B never happened within the window.

    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // checkpointing is required for exactly-once or at-least-once guarantees
        // env.enableCheckpointing(1000);
        final RMQConnectionConfig connectionConfig = new

When to use CoProcessFunction in Flink?

 ̄綄美尐妖づ posted on 2021-02-05 09:36:33
Question: I am trying to understand when to use a CoProcessFunction in Flink. An explanation with an example would help me understand the concept better.
Answer 1: A CoProcessFunction is similar to a RichCoFlatMapFunction, but it can also use timers. Timers are useful for expiring state for stale keys, or for raising alarms when keep-alive messages fail to arrive, for example. A CoProcessFunction allows you to use one stream to influence how another is processed, or
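
To make "one stream influences how another is processed" concrete, here is a minimal plain-Java model of the idea. It is not Flink code: the class name `ControlledFilter` is hypothetical, the `HashSet` field stands in for Flink state, and timers are left out. Only the method names `processElement1` and `processElement2` mirror the two callbacks of a real `CoProcessFunction`.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java model (not Flink code) of a control stream steering a data stream. */
public class ControlledFilter {
    // Stands in for Flink state; a real job would use keyed state instead of a field.
    private final Set<String> blocked = new HashSet<>();

    /** Models processElement1: each control record blocks one word. */
    public void processElement1(String control) {
        blocked.add(control);
    }

    /** Models processElement2: forward a data record unless it has been blocked. */
    public void processElement2(String data, List<String> out) {
        if (!blocked.contains(data)) {
            out.add(data);
        }
    }

    public static void main(String[] args) {
        ControlledFilter filter = new ControlledFilter();
        List<String> out = new ArrayList<>();
        filter.processElement2("spam", out); // forwarded: nothing is blocked yet
        filter.processElement1("spam");      // the control stream blocks "spam"
        filter.processElement2("spam", out); // dropped
        filter.processElement2("ham", out);  // forwarded
        System.out.println(out);             // prints [spam, ham]
    }
}
```

In an actual job the two inputs would be connected with `connect(...)`, and the same logic would live in a `CoProcessFunction` (or `KeyedCoProcessFunction`), where a timer could additionally expire blocked entries after some time.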

Call an HTTP REST API from Flink using Asynchronous I/O

南楼画角 posted on 2021-01-29 13:42:53
Question: I have to fetch data from a REST API inside a Flink process function, for every element that arrives on the stream. How can I achieve that? I couldn't find enough material on calling a REST service asynchronously. Please point me to some sample articles.
Answer 1: All the work happens inside the asyncInvoke method of the RichAsyncFunction. So, to call a REST service, you need to use an async HTTP client (technically it could be a synchronous client, but that doesn't make sense). An example of async http
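
As a rough sketch of the non-blocking call the answer describes, the helper below uses the JDK 11+ `java.net.http.HttpClient` rather than any particular third-party client. The class name `AsyncRestLookup`, the `{baseUrl}/{key}` URL layout, and the 5-second timeout are assumptions made for illustration; the Flink wiring is shown only as a comment.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;

/** Non-blocking REST lookup built on the JDK HttpClient (hypothetical helper). */
public class AsyncRestLookup {
    private final HttpClient client = HttpClient.newHttpClient();

    /** Builds GET {baseUrl}/{key}; this URL layout is an assumption, not a real service. */
    public static HttpRequest buildRequest(String baseUrl, String key) {
        String encoded = URLEncoder.encode(key, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + encoded))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
    }

    /** Sends the request without blocking; the future completes with the response body. */
    public CompletableFuture<String> fetch(String baseUrl, String key) {
        return client
                .sendAsync(buildRequest(baseUrl, key), HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body);
    }

    // Inside RichAsyncFunction#asyncInvoke(input, resultFuture) one would then write roughly:
    //   fetch(baseUrl, input).whenComplete((body, err) -> {
    //       if (err != null) resultFuture.completeExceptionally(err);
    //       else resultFuture.complete(Collections.singleton(body));
    //   });
}
```

The point of returning a `CompletableFuture` is that `asyncInvoke` can return immediately and hand the result to Flink from the callback, so the operator thread is never blocked waiting on the HTTP response.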

How to check whether a DataStream in Flink is empty or has data

╄→гoц情女王★ posted on 2021-01-29 10:33:01
Question: I am new to Apache Flink. I have a DataStream with a process function: if certain conditions are met, the record is valid, and if not, I write it to a side output. I am able to print the DataStream. Is it possible to check whether the DataStream is empty or null? I tried datastream.equals(null), but it does not work. Please suggest how to tell whether a DataStream is empty.
Answer 1: By "empty", I assume you mean that no data is flowing. What are

About StateTtlConfig

假装没事ソ posted on 2021-01-29 10:02:24
Question: I'm configuring a StateTtlConfig for a MapState. My goal is for objects in the state to live for, say, 3 hours, then disappear from the state and be handed to the GC so that memory is released; I think the checkpoints should shrink as well. I had this configuration before, and it did not seem to work because the checkpoints were always growing:

    private final StateTtlConfig ttlConfig = StateTtlConfig.newBuilder(org.apache.flink.api