How do I use MapElements and KV together in Apache Beam?

Posted by 末鹿安然 on 2019-12-08 03:18:56

Question


I wanted to do something like:

PCollection<String> a = whatever;
PCollection<KV<String, User>> b = a.apply(
        MapElements.into(TypeDescriptor.of(KV<String, User>.class))
        .via(s -> KV.of(s, new User(s))));

Here User is a custom data type with an Avro coder and a constructor that takes a String.

However, I get the following error:

Cannot select from parameterized type

I tried to change it to TypeDescriptor.of(KV.class) instead, but then I get:

Incompatible types; Required PCollection<KV<String, User>> but 'apply' was inferred to OutputT: no instance(s) of type variable(s) exists so that PCollection<KV> conforms to PCollection<KV<String, User>>

So how am I supposed to use KV with MapElements?

I know that what I want is doable with ParDo, where I could state the types explicitly by declaring a new DoFn<String, KV<String, User>>, but ParDo does not support lambda functions. Since we are using Java 8, that seems less elegant.


Answer 1:


Because of type erasure in Java, KV<String, User>.class is not a valid expression: only the raw KV.class exists, and at runtime KV.class does not carry enough information to infer a coder, since the type variables have been erased.
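To make the erasure point concrete, here is a minimal sketch using only the Java standard library (no Beam involved). The class and method names are made up for illustration. It shows that two differently parameterized lists share a single runtime class, which is why a class literal cannot tell them apart.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of Beam.
final class ErasureDemo {
    // Compares the runtime classes of two lists. The type arguments
    // (String, Integer, ...) are erased, so only the raw class remains.
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        // Both objects report the same raw class at runtime.
        System.out.println(sameRuntimeClass(strings, numbers));
        System.out.println(strings.getClass() == ArrayList.class);
    }
}
```

The same reasoning applies to KV: a KV<String, User> and a KV<Long, Long> are both just KV at runtime.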

To get around this limitation, you need to use a mechanism which preserves type information after compilation. For example you could use:

TypeDescriptors.kvs(TypeDescriptors.strings(), TypeDescriptor.of(User.class))

which is the same as providing your own anonymous class:

new TypeDescriptor<KV<String, User>>() {}

Providing anonymous classes with type variables bound is one of the ways to get around type erasure in Java currently.
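The anonymous-class trick can be sketched with the standard library alone. The TypeToken class below is a hypothetical stand-in written for this example, not Beam's TypeDescriptor, and Map<String, Integer> is used in place of KV<String, User> so the code has no external dependencies. It relies on the fact that a subclass records its generic superclass in class metadata, which reflection can read back.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Map;

// Hypothetical stand-in for a type-descriptor class; not Beam's TypeDescriptor.
abstract class TypeToken<T> {
    private final Type type;

    protected TypeToken() {
        // An anonymous subclass declares "extends TypeToken<Something>", and that
        // declaration is kept in the class file, so the type argument is readable.
        ParameterizedType superclass =
            (ParameterizedType) getClass().getGenericSuperclass();
        this.type = superclass.getActualTypeArguments()[0];
    }

    Type getType() {
        return type;
    }
}

final class TypeTokenDemo {
    public static void main(String[] args) {
        // The trailing "() {}" creates the anonymous subclass that binds T.
        Type captured = new TypeToken<Map<String, Integer>>() {}.getType();
        ParameterizedType parameterized = (ParameterizedType) captured;
        System.out.println(parameterized.getRawType());
        for (Type argument : parameterized.getActualTypeArguments()) {
            System.out.println(argument);
        }
    }
}
```

Note that the token must be created with the braces: without "{}" there is no subclass, and an abstract class like this one could not be instantiated at all.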



Source: https://stackoverflow.com/questions/53235553/how-do-i-use-mapelements-and-kv-in-together-in-apache-beam
