Print Kafka Stream Input out to console?

Posted by 北战南征 on 2019-11-28 07:39:57
Matthias J. Sax

If you use Kafka Streams, you need to apply functions/operators to your data streams. In your case, you create a KStream object, so you want to apply an operator to source.

Depending on what you want to do, there are operators that apply a function to each record in the stream independently (e.g., map()), and other operators that apply a function to multiple records together (e.g., aggregateByKey()). You should have a look at the documentation: http://docs.confluent.io/3.0.0/streams/developer-guide.html#kafka-streams-dsl and the examples: https://github.com/confluentinc/kafka-streams-examples

Thus, with Kafka Streams you do not assign records to local variables as you show in your example above; instead, you embed all logic in operators/functions that get chained together.

For example, if you want to print all input records to stdout, you could do:

KStream<String, String> source = builder.stream(stringSerde, stringSerde, "in-stream");
source.foreach(new ForeachAction<String, String>() {
    // apply() must be public because it implements an interface method
    @Override
    public void apply(String key, String value) {
        System.out.println(key + ": " + value);
    }
});

Thus, after you start your application via streams.start(), it will consume the records from your input topic, and for each record of the topic, apply(...) is called, which prints the record to stdout.

Of course, a more native way of printing the stream to the console would be to use source.print() (which internally is basically the same as the foreach() operator shown above, with a predefined ForeachAction).

For your example with assigning the string to a local variable, you would need to put your code into apply(...) and do your regex-stuff etc. there to "extract the 3 letter keywords".

The best way to express this, however, would be via a combination of flatMapValues() and print() (i.e., source.flatMapValues(...).print()). flatMapValues() is called for each input record (in your case, I assume the key will be null, so you can ignore it). Within your flatMapValues() function, you apply your regex, and for each match, you add the match to a list of values that you finally return.

source.flatMapValues(new ValueMapper<String, Iterable<String>>() {
    @Override
    public Iterable<String> apply(String value) {
        ArrayList<String> keywords = new ArrayList<String>();

        // apply regex to value and for each match add it to keywords

        return keywords;
    }
});

The output of flatMapValues() will be a KStream again, containing one record for each keyword found (i.e., the output stream is a "union" over all lists you return in ValueMapper#apply()). Finally, you just print your result to the console via print(). (Of course, you could also use a single foreach() instead of flatMapValues() + print(), but this would be less modular.)
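As a rough sketch of the regex step that would go inside ValueMapper#apply(), here is a plain-Java helper that does not depend on Kafka. The class name KeywordExtractor, the method extractKeywords, and the sample records are hypothetical, and the pattern assumes a "keyword" is a standalone run of exactly three ASCII letters; adjust it to whatever your keywords actually look like.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Kafka Streams.
final class KeywordExtractor {

    // Assumption: a keyword is exactly three ASCII letters delimited by word boundaries.
    private static final Pattern THREE_LETTER_WORD = Pattern.compile("\\b[A-Za-z]{3}\\b");

    // Returns every match found in value, in order of appearance.
    // This is the logic that could replace the comment in ValueMapper#apply().
    static List<String> extractKeywords(String value) {
        List<String> keywords = new ArrayList<>();
        if (value == null) {
            return keywords; // nothing to extract from a null value
        }
        Matcher matcher = THREE_LETTER_WORD.matcher(value);
        while (matcher.find()) {
            keywords.add(matcher.group());
        }
        return keywords;
    }

    public static void main(String[] args) {
        // Simulates the flattening: one output line per keyword,
        // taken over the lists returned for all input values.
        List<String> values = Arrays.asList("the quick fox", "jumps over", "a lazy dog");
        for (String value : values) {
            for (String keyword : extractKeywords(value)) {
                System.out.println(keyword);
            }
        }
    }
}
```

Within the stream, apply() would then simply return KeywordExtractor.extractKeywords(value). A value without any match yields an empty list and therefore contributes no output records.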
