Convert an Apache Flink DataStream to a DataStream that makes tumbling windows of 2 events and sums a value

Posted by 删除回忆录丶 on 2019-12-11 16:59:42

Question


I have a Flink Table with the following columns: final String[] hNames = {"mID", "dateTime", "mValue", "unixDateTime", "mType"}; I want to create a DataStream in Apache Flink that builds tumbling windows of 2 events each and calculates the average mValue for each window. Below I've used the SUM function, since there doesn't seem to be an AVG function. These windows must be grouped on the mID (an Integer) or dateTime column. I key the windows by the mType column, since its values represent specific groups of data.
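To make the intended result concrete, here is a minimal plain-Java sketch (no Flink involved) of what "tumbling windows of 2 events per mType, averaged on mValue" would compute. The `Reading` class and the `averages` method are hypothetical names introduced only for this illustration, and the sample values are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration (no Flink): tumbling count windows per key,
// emitting the average of the values in each full window.
final class CountWindowAverageSketch {

    /** Hypothetical minimal record: only the two columns the window needs. */
    static final class Reading {
        final String mType;
        final double mValue;

        Reading(String mType, double mValue) {
            this.mType = mType;
            this.mValue = mValue;
        }
    }

    /** Returns "mType=average" strings, one per completed window, in emission order. */
    static List<String> averages(List<Reading> input, int windowSize) {
        Map<String, List<Double>> open = new HashMap<>(); // currently open window per key
        List<String> out = new ArrayList<>();
        for (Reading r : input) {
            List<Double> window = open.computeIfAbsent(r.mType, k -> new ArrayList<>());
            window.add(r.mValue);
            if (window.size() == windowSize) {
                double sum = 0;
                for (double v : window) {
                    sum += v;
                }
                out.add(r.mType + "=" + (sum / windowSize));
                window.clear(); // tumbling: the next window for this key starts empty
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Reading> rows = new ArrayList<>();
        rows.add(new Reading("A", 1.0));
        rows.add(new Reading("B", 10.0));
        rows.add(new Reading("A", 3.0));
        rows.add(new Reading("B", 20.0));
        rows.add(new Reading("A", 5.0)); // third "A" row: its window never fills, so nothing is emitted
        System.out.println(averages(rows, 2)); // prints [A=2.0, B=15.0]
    }
}
```

Note that the fifth row produces no output: a count window only emits once it holds the full number of elements for its key.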

Another issue is that the data I use in this app comes from a CSV file, so it is not real-time data. The problem is that Flink processes the rows in a seemingly random order. I want the data to be sorted in ascending order on the mID or dateTime column.
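The ordering asked for here is an ordinary ascending sort. As a plain-Java sketch (no Flink involved), a bounded input such as the rows of a CSV file could be sorted like this before being handed to the windowing step. `Measurement` and `sortedAscending` are hypothetical names, and whether sorting up front is practical depends on the size of the file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java illustration (no Flink): sort bounded input ascending before processing it.
final class SortBeforeWindowingSketch {

    /** Hypothetical minimal record holding only the two candidate sort columns. */
    static final class Measurement {
        final int mID;
        final long unixDateTime;

        Measurement(int mID, long unixDateTime) {
            this.mID = mID;
            this.unixDateTime = unixDateTime;
        }
    }

    /** Returns a copy sorted ascending by mID, with ties broken by unixDateTime. */
    static List<Measurement> sortedAscending(List<Measurement> rows) {
        List<Measurement> copy = new ArrayList<>(rows); // leave the caller's list untouched
        copy.sort(Comparator
                .comparingInt((Measurement m) -> m.mID)
                .thenComparingLong(m -> m.unixDateTime));
        return copy;
    }

    public static void main(String[] args) {
        List<Measurement> rows = Arrays.asList(
                new Measurement(3, 300L),
                new Measurement(1, 100L),
                new Measurement(2, 200L));
        for (Measurement m : sortedAscending(rows)) {
            System.out.println(m.mID + " " + m.unixDateTime);
        }
    }
}
```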

The code below does not print anything. What am I doing wrong here? The strange thing is that when I replace the countWindow() call with countWindowAll(), I do get output.

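// Imports the snippet relies on (package names assume Flink 1.9; they differ in other versions)
import java.sql.Timestamp;
import org.apache.flink.api.java.tuple.Tuple5;
import org.apache.flink.api.java.typeutils.TupleTypeInfo;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.Types;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.table.sources.CsvTableSource;
import org.apache.flink.table.sources.TableSource;
import org.apache.flink.types.Row;

final String[] hColumnNames = {"mID", "dateTime", "mValue", "unixDateTime", "mType"};

// fsEnv is used below, so the environment has to be assigned to a variable
StreamExecutionEnvironment fsEnv = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tableEnv = StreamTableEnvironment.create(fsEnv);

// Describe the CSV file: one typed field per column, in file order
TableSource csvSource = CsvTableSource.builder()
        .path("path")
        .fieldDelimiter(";")
        .field(hColumnNames[0], Types.INT())
        .field(hColumnNames[1], Types.SQL_TIMESTAMP())
        .field(hColumnNames[2], Types.DOUBLE())
        .field(hColumnNames[3], Types.LONG())
        .field(hColumnNames[4], Types.STRING())
        .build();

// Register the TableSource
tableEnv.registerTableSource("H", csvSource);
Table HTable = tableEnv.scan("H");
tableEnv.registerTable("HTable", HTable);

DataStream<Row> stream = tableEnv.toAppendStream(HTable, Row.class);

// Convert the table to a stream of typed tuples (same field order as the columns)
TupleTypeInfo<Tuple5<Integer, Timestamp, Double, Long, String>> tupleType = new TupleTypeInfo<>(
        Types.INT(),
        Types.SQL_TIMESTAMP(),
        Types.DOUBLE(),
        Types.LONG(),
        Types.STRING());
DataStream<Tuple5<Integer, Timestamp, Double, Long, String>> dsTuple =
        tableEnv.toAppendStream(HTable, tupleType);

// What is going wrong below?
// Note: field index 4 is mType; field index 3 is unixDateTime, while mValue is field index 2
DataStream<Tuple5<Integer, Timestamp, Double, Long, String>> dsTuple1 = dsTuple
        .keyBy(4)
        .countWindow(2)
        .sum(3);

try {
    fsEnv.execute();
} catch (Exception e) {
    e.printStackTrace();
}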
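One difference between the two calls that may be relevant: a keyed count window counts elements per key, while countWindowAll() counts across the whole stream, and in both cases a window that never fills up is never emitted. The plain-Java sketch below (no Flink involved) only counts how many full windows each approach would produce for a given sequence of keys. `keyedWindows` and `globalWindows` are hypothetical helper names, and whether this explains the missing output depends on the contents of the CSV file, which the question does not show.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration (no Flink): number of full count windows, per key vs. across all elements.
final class KeyedVsGlobalCountWindowSketch {

    /** Full windows when counting across all elements, regardless of key. */
    static int globalWindows(List<String> keys, int size) {
        return keys.size() / size;
    }

    /** Full windows when every key has its own independent counter. */
    static int keyedWindows(List<String> keys, int size) {
        Map<String, Integer> counts = new HashMap<>();
        for (String k : keys) {
            counts.merge(k, 1, Integer::sum); // count elements per key
        }
        int windows = 0;
        for (int c : counts.values()) {
            windows += c / size; // leftover elements stay in an unfinished window
        }
        return windows;
    }

    public static void main(String[] args) {
        // Four rows, each with a different (made-up) mType value
        List<String> distinct = Arrays.asList("A", "B", "C", "D");
        System.out.println(globalWindows(distinct, 2)); // prints 2
        System.out.println(keyedWindows(distinct, 2));  // prints 0

        // Four rows spread over two mType values
        List<String> repeated = Arrays.asList("A", "B", "A", "B");
        System.out.println(keyedWindows(repeated, 2));  // prints 2
    }
}
```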

Source: https://stackoverflow.com/questions/58487582/convert-apache-flink-datastream-to-a-datastream-that-makes-tumbling-windows-of-2
