I wanted to take advantage of the new BigQuery functionality of time partitioned tables, but am unsure this is currently possible in the 1.6 version of the Dataflow SDK.
As Pavan says, it is definitely possible to write to partition tables with Dataflow. Are you using the DataflowPipelineRunner
operating in streaming mode or batch mode?
The solution you proposed should work. Specifically, if you pre-create a table with date partitioning set up, then you can use a BigQueryIO.Write.toTableReference
lambda to write to a date partition. For example:
/**
* A Joda-time formatter that prints a date in format like {@code "20160101"}.
* Threadsafe.
*/
private static final DateTimeFormatter FORMATTER =
DateTimeFormat.forPattern("yyyyMMdd").withZone(DateTimeZone.UTC);
// This code generates a valid BigQuery partition name:
Instant instant = Instant.now(); // any Joda instant in a reasonable time range
String baseTableName = "project:dataset.table"; // a valid BigQuery table name
String partitionName =
String.format("%s$%s", baseTableName, FORMATTER.print(instant));