BigQuery writeTableRows Always writing to buffer

。_饼干妹妹 提交于 2019-12-01 14:42:42

The answer is you need to include "withNumFileShards" like so (Can be 1 to 1000).

        p.begin()
            .apply("Input", PubsubIO.readAvros(DataStructure.class).fromTopic("topicName"))
            .apply("Transform", ParDo.of(new CustomTransformFunction()))
            .apply("Write", BigQueryIO.writeTableRows()
                    .to(table)
                    .withSchema(schema)
                    .withTimePartitioning(timePartitioning)
            .withMethod(Method.FILE_LOADS)
            .withTriggeringFrequency(org.joda.time.Duration.standardMinutes(2))
            .withNumFileShards(1000)
            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
            .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
        p.run().waitUntilFinish();

I can't find this documented anywhere to say that withNumFileShards is mandatory however there is a Jira ticket for this which I found after the fix.

https://issues.apache.org/jira/browse/BEAM-3198

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!