apache-flink

how to run first example of Apache Flink

Submitted by 半腔热情 on 2020-05-17 06:31:52
Question: I am trying to run the first example from the O'Reilly book "Stream Processing with Apache Flink" and from the Flink project. Each gives a different error. The example from the book gives a NoClassDefFound error; the example from the Flink project gives java.net.ConnectException: Connection refused (Connection refused) but does create a Flink job, see screenshot. Details below. Book example: java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: scala/runtime/java8/JFunction1$mcVI$sp at io.github
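The class scala/runtime/java8/JFunction1$mcVI$sp only exists in the Scala 2.12 runtime, so this NoClassDefFoundError usually points at a Scala binary-version mismatch (for example, a job compiled against Scala 2.12 but run with 2.11 artifacts, or no scala-library on the classpath at all). A possible fix, sketched as a Maven fragment (the version number is illustrative, not taken from the original post):

```xml
<!-- Illustrative only: align every Scala artifact on the same 2.12 binary version -->
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>2.12.8</version>
</dependency>
```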

Cannot launch flink from local host when trying to run it with webUI

Submitted by 天大地大妈咪最大 on 2020-05-17 05:53:12
Question: I'm trying to debug my Flink job from IntelliJ using the Flink UI. The problem is that it sometimes fails to launch, throwing java.net.BindException: Could not start rest endpoint on any port in port range 8081. My piece of code that should start the Flink UI (on Windows) is:

String osName = System.getProperty("os.name");
if (osName.toLowerCase().contains("win")) {
    Configuration conf = new Configuration();
    conf.setBoolean(ConfigConstants.LOCAL_START_WEBSERVER, true);
    env = StreamExecutionEnvironment
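The BindException means something is already listening on port 8081, often a previous debug run that never shut down or a standalone Flink cluster started earlier. As far as I know, Flink's RestOptions.BIND_PORT can be set to a range such as "8081-8099" so the local web UI picks a free port. As a Flink-free sketch, you can also check whether the default port is free before launching (PortCheck and isPortFree are hypothetical names, not part of any API):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortCheck {
    /** Returns true if the given TCP port can currently be bound on this host. */
    public static boolean isPortFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;   // bind succeeded, so nobody is listening here
        } catch (IOException e) {
            return false;  // same condition that makes Flink's REST endpoint fail
        }
    }

    public static void main(String[] args) {
        // 8081 is Flink's default REST / web UI port
        System.out.println("port 8081 free: " + isPortFree(8081));
    }
}
```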

Flink - Why should I create my own RichSinkFunction instead of just open and close my PostgreSql connection?

Submitted by 不羁岁月 on 2020-05-14 11:25:48
Question: I would like to know why I really need to create my own RichSinkFunction or use JDBCOutputFormat to connect to the database, instead of just creating my connection, performing the query, and closing the connection with the traditional PostgreSQL driver inside my SinkFunction. I found many articles telling me to do that, but none explain why. What is the difference? Code example using JDBCOutputFormat:

JDBCOutputFormat jdbcOutput = JDBCOutputFormat.buildJDBCOutputFormat()
    .setDrivername("org
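The reason is lifecycle: a plain SinkFunction has no hook that runs once per parallel instance, so a JDBC connection created inside invoke() is opened and closed once per record, while RichSinkFunction's open()/close() let you create the connection once and reuse it for every record (JDBCOutputFormat does the same internally, plus batching). A minimal plain-Java sketch of the two lifecycles (NaiveSink and RichStyleSink are hypothetical stand-ins, not Flink classes):

```java
// Plain-Java stand-ins for the two sink lifecycles; hypothetical classes, not the Flink API.
class NaiveSink {
    int connectionsOpened = 0;
    // With no per-instance setup hook, the connection has to be
    // opened (and closed) inside invoke(), once per record.
    void invoke(String record) {
        connectionsOpened++;
    }
}

class RichStyleSink {
    int connectionsOpened = 0;
    void open() { connectionsOpened++; } // runs once per parallel instance, like RichSinkFunction.open()
    void invoke(String record) { }       // reuses the already-open connection
    void close() { }                     // releases it once, at shutdown
}

public class SinkLifecycleDemo {
    public static void main(String[] args) {
        NaiveSink naive = new NaiveSink();
        RichStyleSink rich = new RichStyleSink();
        rich.open();
        for (int i = 0; i < 1000; i++) {
            String record = "row-" + i;
            naive.invoke(record);
            rich.invoke(record);
        }
        rich.close();
        // 1000 connections versus 1 connection for the same stream of records
        System.out.println(naive.connectionsOpened + " vs " + rich.connectionsOpened);
    }
}
```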

flink - how to use state as cache

Submitted by ◇◆丶佛笑我妖孽 on 2020-04-30 15:13:31
Question: I want to read history from state. If the state is null, I read HBase, update the state, and use onTimer to set a state TTL. The problem is how to batch-read HBase, because reading a single record at a time from HBase is not efficient. Answer 1: In general, if you want to cache/mirror state from an external database in Flink, the most performant approach is to stream the database mutations into Flink -- in other words, turn Flink into a replication endpoint for the database's change data capture (CDC) stream,
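Short of full CDC, the pattern the question describes (check state, fall back to HBase on a miss, expire via a timer) can be sketched without Flink as a map-based cache with per-key expiry; in a real job the maps would be keyed ValueState, the loader an HBase lookup, and the expiry an onTimer callback. TtlCache and all its names are hypothetical, not any library's API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of "state as cache with TTL"; not Flink's ValueState/onTimer API. */
public class TtlCache {
    private final long ttlMillis;
    private final Map<String, String> values = new HashMap<>();  // stands in for keyed ValueState
    private final Map<String, Long> expiresAt = new HashMap<>(); // stands in for a registered timer

    public TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached value, calling loader (the "HBase read") only on a miss or expiry. */
    public String get(String key, long now, Function<String, String> loader) {
        Long deadline = expiresAt.get(key);
        if (deadline == null || deadline <= now) { // state is null or the TTL timer has fired
            values.put(key, loader.apply(key));    // fall back to the backing store
            expiresAt.put(key, now + ttlMillis);   // re-arm the expiry
        }
        return values.get(key);
    }

    public static void main(String[] args) {
        TtlCache cache = new TtlCache(60_000);     // one-minute TTL
        Function<String, String> loader = k -> "history-for-" + k;
        System.out.println(cache.get("user-1", 0L, loader));
    }
}
```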

“Stream Processing with Apache Flink” how to run book code from IntelliJ?

Submitted by 醉酒当歌 on 2020-04-18 05:48:58
Question: As described in this post, I have been unable to successfully run any code from the book "Stream Processing with Apache Flink", including the precompiled jar. It is not my practice to use an IDE, but I thought I would try IntelliJ, as Chapter 3, "Run and Debug Flink Applications in an IDE", describes how to do that specifically for the code for this book. The book describes a project import process that I have not found a way to use. It describes setting options on import, for example select

Can I have multiple subtasks of an operator in the same slot, in Flink?

Submitted by 瘦欲@ on 2020-04-17 14:24:07
Question: I have been exploring Apache Flink for a few days, and I have some doubts about the concept of a Task Slot. Although several questions have been asked about it, there is a point I don't get. I am using a toy application for testing, running a local cluster, and I have disabled operator chaining. I know from the docs that slots allow for memory isolation but not CPU isolation, and reading the docs it seems that a Task Slot is a Java thread. 1) When I deploy my application with parallelism=1, all the

Testing Flink with embedded Kafka

Submitted by 百般思念 on 2020-04-16 05:45:10
Question: I have a simple Flink application, which sums up the events with the same id and timestamp within the last minute:

DataStream<String> input = env
    .addSource(consumerProps)
    .uid("app");

DataStream<Event> events = input.map(record -> mapper.readValue(record, Event.class));

pixels
    .assignTimestampsAndWatermarks(new TimestampsAndWatermarks())
    .keyBy("id")
    .timeWindow(Time.minutes(1))
    .sum("constant")
    .addSink(simpleNotificationServiceSink);

env.execute(jobName);

private static class