apache-flink

Error in sinking Kafka stream in Flink 1.2

情到浓时终转凉″ submitted on 2019-12-25 08:56:42
Question: What I did was to read in a message from Kafka in JSON format, e.g. {"a":1,"b":2}. Then I applied a filter to this message to make sure the value corresponding to a is 1 and the value of b is 2. Finally, I want to output the result stream to a downstream Kafka topic. However, I don't know why the compiler says type mismatch. My code is as follows:

    val kafkaConsumer = new FlinkKafkaConsumer010(
      params.getRequired("input-topic"),
      new JSONDeserializationSchema(),
      params.getProperties)
    val messageStream =
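For what it's worth, a minimal Java sketch of the same pipeline (the question's code is Scala) against the Flink 1.2 Kafka 0.10 connector. JSONDeserializationSchema produces ObjectNode, so a mismatch between the stream's element type and the producer's SerializationSchema is one plausible source of the compiler error; mapping back to String before the sink lets SimpleStringSchema line up. Topic names and connection properties are placeholders.

```java
import java.util.Properties;

import com.fasterxml.jackson.databind.node.ObjectNode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer010;
import org.apache.flink.streaming.util.serialization.JSONDeserializationSchema;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class JsonFilterJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();  // placeholder connection settings
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "demo");

        env.addSource(new FlinkKafkaConsumer010<>(
                "input-topic", new JSONDeserializationSchema(), props))
           // keep only messages with a == 1 and b == 2
           .filter(node -> node.get("a").asInt() == 1 && node.get("b").asInt() == 2)
           // ObjectNode -> String so the producer's SimpleStringSchema matches
           .map(ObjectNode::toString)
           .addSink(new FlinkKafkaProducer010<>(
                "output-topic", new SimpleStringSchema(), props));

        env.execute("json filter");
    }
}
```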

Flink 1.2 does not start in HA Cluster mode

守給你的承諾、 submitted on 2019-12-25 04:59:05
Question: I've installed Flink 1.2 in HA cluster mode (2 JobManagers, 1 TaskManager) locally, and it keeps refusing to actually start in this mode, printing "Starting cluster." instead of "Starting HA cluster with 2 masters and 1 peers in ZooKeeper quorum." Apparently bin/config.sh reads the configuration like this:

    # High availability
    if [ -z "${HIGH_AVAILABILITY}" ]; then
        HIGH_AVAILABILITY=$(readFromConfig ${KEY_HIGH_AVAILABILITY} "" "${YAML_CONF}")
        if [ -z "${HIGH_AVAILABILITY}" ]; then
            # Try
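For reference, a sketch of the configuration that code path resolves, assuming ZooKeeper-backed HA as described in the Flink 1.2 docs; hosts, ports, and the storage path are placeholders:

```
# conf/flink-conf.yaml -- keys read via readFromConfig above
high-availability: zookeeper
high-availability.zookeeper.quorum: localhost:2181
high-availability.zookeeper.storageDir: hdfs:///flink/ha/

# conf/masters -- one JobManager per line, matching "2 masters"
localhost:8081
localhost:8082
```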

Apache Flink read Avro byte[] from Kafka

女生的网名这么多〃 submitted on 2019-12-25 04:12:22
Question: In reviewing examples I see a lot of this:

    FlinkKafkaConsumer08<Event> kafkaConsumer =
        new FlinkKafkaConsumer08<>("myavrotopic", avroSchema, properties);

I see that here they already know the schema. I do not know the schema until I read the byte[] into a GenericRecord and then get the schema (as it may change from record to record). Can someone point me to a FlinkKafkaConsumer08 that reads from byte[] into a map filter, so that I can remove some leading bits, then load that byte[] into a
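A minimal Java sketch of the pass-through idea: give Flink the raw Kafka payload, then strip the framing and decode a GenericRecord in a map. The 5-byte offset and lookupSchema(...) are hypothetical stand-ins for whatever leading bits and per-record schema resolution the topic actually uses.

```java
import java.io.IOException;
import java.util.Arrays;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DecoderFactory;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08;
import org.apache.flink.streaming.util.serialization.AbstractDeserializationSchema;

// Pass-through schema: hand the consumer's raw Kafka payload downstream untouched.
public class RawBytesSchema extends AbstractDeserializationSchema<byte[]> {
    @Override
    public byte[] deserialize(byte[] message) {
        return message;
    }
}

// Usage inside a job ("env" and "properties" as in the question's snippet):
DataStream<byte[]> raw = env.addSource(
    new FlinkKafkaConsumer08<>("myavrotopic", new RawBytesSchema(), properties));

DataStream<GenericRecord> records = raw.map(new MapFunction<byte[], GenericRecord>() {
    @Override
    public GenericRecord map(byte[] bytes) throws IOException {
        Schema schema = lookupSchema(bytes);                          // hypothetical resolver
        byte[] payload = Arrays.copyOfRange(bytes, 5, bytes.length);  // drop leading bits
        return new GenericDatumReader<GenericRecord>(schema)
                .read(null, DecoderFactory.get().binaryDecoder(payload, null));
    }
});
```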

Is it possible to get the same behavior as CoFlatMapFunction using other basic operators?

强颜欢笑 submitted on 2019-12-25 03:44:42
Question: Basically, I am using CoFlatMapFunction (https://ci.apache.org/projects/flink/flink-docs-stable/api/java/org/apache/flink/streaming/api/functions/co/CoFlatMapFunction.html) to filter a stream and change the filter parameters at runtime, and I am using Flink for that. I want to do the same using Apache Edgent TStream (https://edgent.incubator.apache.org/javadoc/latest/org/apache/edgent/topology/TStream.html), but it does not have CoFlatMapFunction. If I use Union it will not work, because the
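For context, a minimal sketch of the Flink-side pattern the question starts from, with illustrative Integer streams: one input carries the data, the other carries new filter parameters, and the shared field is the state that connects the two paths.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

// "data" and "control" are DataStream<Integer>; types are illustrative
DataStream<Integer> filtered = data.connect(control)
    .flatMap(new CoFlatMapFunction<Integer, Integer, Integer>() {
        private int threshold = 0;  // last filter parameter received

        @Override
        public void flatMap1(Integer value, Collector<Integer> out) {
            if (value > threshold) {
                out.collect(value);      // data path: apply the current filter
            }
        }

        @Override
        public void flatMap2(Integer newThreshold, Collector<Integer> out) {
            threshold = newThreshold;    // control path: update the parameter
        }
    });
```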

How to assign groups of messages to windows by detecting the first message of a group?

元气小坏坏 submitted on 2019-12-25 03:34:46
Question: I have the following problem: I receive messages which have to be grouped, and each group of messages has to be processed. I can only detect the first message of each group; after that specific first message, the following messages belong to that group until the first message of the next group is detected. My approach to solving this was to write a custom trigger that returns FIRE_AND_PURGE when it detects the first message of a group (by overriding onElement). My goal was to assign
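A minimal sketch of that trigger idea over a GlobalWindow; isGroupStart(...) is a hypothetical stand-in for the actual first-message detection. One caveat any solution must handle: onElement is invoked after the element has already been added to the window state, so the next group's first message lands in the pane being fired rather than in the new group.

```java
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;

public class GroupStartTrigger<T> extends Trigger<T, GlobalWindow> {

    @Override
    public TriggerResult onElement(T element, long timestamp,
                                   GlobalWindow window, TriggerContext ctx) {
        // Fire and purge the collected group when the next group's first
        // message shows up; otherwise keep accumulating.
        return isGroupStart(element) ? TriggerResult.FIRE_AND_PURGE
                                     : TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, GlobalWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;  // no processing-time timers in this sketch
    }

    @Override
    public TriggerResult onEventTime(long time, GlobalWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;  // no event-time timers in this sketch
    }

    @Override
    public void clear(GlobalWindow window, TriggerContext ctx) {
        // no timers or trigger state registered, so nothing to clean up
    }

    private boolean isGroupStart(T element) {
        return false;  // hypothetical: replace with the real first-message test
    }
}
```

It would be attached via input.keyBy(...).window(GlobalWindows.create()).trigger(new GroupStartTrigger<>()).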

Apache Flink (stable version 1.6.2) does not work

纵饮孤独 submitted on 2019-12-25 03:26:47
Question: Recently, the stable version (1.6.2) of Apache Flink was released. I read these instructions. But when I run the following command:

    ./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000

I get the following error:

    The program finished with the following exception:
    org.apache.flink.client.program.ProgramInvocationException: Job failed. (JobID: 264564a337d4c6705bde681b34010d28)
        at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
        at org
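One common cause of this particular failure, for what it's worth: SocketWindowWordCount reads from a socket, so something must already be listening on the port before the job is submitted. A sketch of the usual sequence (netcat flags vary by flavor):

```
nc -l 9000      # terminal 1: start a listener first (some netcats: nc -lk 9000)
./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000   # terminal 2
```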

Missing Dependencies in Eclipse IDE with Flink Quickstart

一笑奈何 submitted on 2019-12-25 03:12:08
Question: I have cloned the Flink Training repo and followed the instructions on building and deploying from here in order to get familiar with Apache Flink. However, there are errors in the projects after building and importing into the Eclipse IDE. In the Flink Training Exercises project I find errors in the pom: "Plugin execution not covered by lifecycle configuration: net.alchim31.maven:scala-maven-plugin:3.1.4:testCompile". There are also errors in the flink-quickstart-java project. Some dependencies are
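For context, the "Plugin execution not covered by lifecycle configuration" message comes from m2e, Eclipse's Maven integration, which has no mapping for that scala-maven-plugin goal in its build lifecycle. One common workaround, sketched here and scoped to the goals named in the error, is to tell m2e to ignore them via pluginManagement in the pom (this affects only Eclipse, not command-line builds):

```xml
<pluginManagement>
  <plugins>
    <plugin>
      <groupId>org.eclipse.m2e</groupId>
      <artifactId>lifecycle-mapping</artifactId>
      <version>1.0.0</version>
      <configuration>
        <lifecycleMappingMetadata>
          <pluginExecutions>
            <pluginExecution>
              <pluginExecutionFilter>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <versionRange>[3.1.4,)</versionRange>
                <goals>
                  <goal>compile</goal>
                  <goal>testCompile</goal>
                </goals>
              </pluginExecutionFilter>
              <action>
                <ignore/>
              </action>
            </pluginExecution>
          </pluginExecutions>
        </lifecycleMappingMetadata>
      </configuration>
    </plugin>
  </plugins>
</pluginManagement>
```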

Flink Kafka - Custom Class Data is always null

一个人想着一个人 submitted on 2019-12-25 02:21:43
Question: Custom class Person:

    class Person {
        private Integer id;
        private String name;
        //getters and setters
    }

Kafka Flink connector:

    TypeInformation<Person> info = TypeInformation.of(Person.class);
    TypeInformationSerializationSchema schema =
        new TypeInformationSerializationSchema(info, new ExecutionConfig());
    DataStream<Person> input = env.addSource(
        new FlinkKafkaConsumer08<>("persons", schema, getKafkaProperties()));

Now if I send the below JSON

    { "id" : 1, "name": Synd }

through the Kafka console
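A likely explanation, with a minimal sketch of one fix: TypeInformationSerializationSchema expects records in Flink's own binary serialization format, so JSON text written by the console producer will not deserialize into the POJO's fields. Deserializing the bytes as JSON instead (Jackson below) matches what is actually on the topic. Note also that "name": Synd is not valid JSON; the value needs quotes.

```java
import java.io.IOException;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08;
import org.apache.flink.streaming.util.serialization.AbstractDeserializationSchema;

public class PersonSchema extends AbstractDeserializationSchema<Person> {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    @Override
    public Person deserialize(byte[] message) throws IOException {
        return MAPPER.readValue(message, Person.class);  // JSON bytes -> Person POJO
    }
}

// Usage in place of the TypeInformationSerializationSchema:
DataStream<Person> input = env.addSource(
    new FlinkKafkaConsumer08<>("persons", new PersonSchema(), getKafkaProperties()));
```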

Check serialization method

你。 submitted on 2019-12-24 20:45:56
Question: I am running a program on Apache Flink. I got this error:

    Caused by: java.io.IOException: Thread 'SortMerger Reading Thread' terminated due to an exception:
    Serializer consumed more bytes than the record had. This indicates broken serialization.
    If you are using custom serialization types (Value or Writable), check their serialization methods.
    If you are using a Kryo-serialized type, check the corresponding Kryo serializer.

How can I check the serialization method of an object in Scala/Java?
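One way to answer the closing question, as a sketch: build the type's TypeInformation and ask it for a serializer. The printed classes show whether Flink treats the type as a POJO, a Value/Writable type, or falls back to Kryo (a GenericTypeInfo). MyRecord is a placeholder for the type under suspicion.

```java
import org.apache.flink.api.common.ExecutionConfig;
import org.apache.flink.api.common.typeinfo.TypeInformation;

public class SerializerCheck {
    public static void main(String[] args) {
        TypeInformation<MyRecord> info = TypeInformation.of(MyRecord.class);
        System.out.println(info);  // e.g. PojoType<...> vs GenericType<...> (Kryo fallback)
        System.out.println(info.createSerializer(new ExecutionConfig()).getClass());
    }
}
```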

An exponentially decaying moving average over a hopping window in Flink SQL: Casting time

大兔子大兔子 submitted on 2019-12-24 19:40:38
Question: Now that we have SQL with fancy windowing in Flink, I'm trying to implement the decaying moving average referred to by "what will be possible in future Flink releases for both the Table API and SQL" in their March 2017 SQL roadmap/preview post:

    table
      .window(Slide over 1.hour every 1.second as 'w)
      .groupBy('productId, 'w)
      .select(
        'w.end,
        'productId,
        ('unitPrice * ('rowtime - 'w.start).exp() / 1.hour).sum /
        (('rowtime - 'w.start).exp() / 1.hour).sum)

Here is my attempt (inspired as well by the Calcite
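For reference, the weighted average that snippet expresses, written out as a formula (a sketch of the intent, not of Flink's exact windowing semantics): each price is weighted by an exponential of the element's offset into the window, so elements further from the window start carry exponentially more weight.

$$\mathrm{avg}(w) \;=\; \frac{\sum_{i \in w} p_i \, e^{(t_i - \mathrm{start}(w))/1\mathrm{h}}}{\sum_{i \in w} e^{(t_i - \mathrm{start}(w))/1\mathrm{h}}}$$

where $p_i$ is the unitPrice and $t_i$ the rowtime of element $i$ in window $w$.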