Apache beam parsing data flow pub/sub into a dictionary
问题 I am running a streaming pipeline using beam / dataflow. I am reading my input from pub/sub as converting into a dict as below: raw_loads_dict = (p | 'ReadPubsubLoads' >> ReadFromPubSub(topic=PUBSUB_TOPIC_NAME).with_output_types(bytes) | 'JSONParse' >> beam.Map(lambda x: json.loads(x)) ) Since this is done on each element of a high throughput pipeline I am worried that this is not the most efficient way to do this? What is the best practice in this case, considering I am then manipulating the