Dataflow: No Worker Activity

Submitted by 大憨熊 on 2019-12-11 17:00:03

Question


I'm having a few problems running a relatively vanilla Dataflow job from an AI Platform Notebook (the job is meant to read data from BigQuery > cleanse and prep it > write it out as a CSV in GCS):

import apache_beam as beam

# Pipeline options (the staging/temp paths below are placeholders as posted)
options = {'staging_location': '/staging/location/',
           'temp_location': '/temp/location/',
           'job_name': 'dataflow_pipeline_job',
           'project': PROJECT,
           'teardown_policy': 'TEARDOWN_ALWAYS',
           'max_num_workers': 3,
           'region': REGION,
           'subnetwork': 'regions/<REGION>/subnetworks/<SUBNETWORK>',
           'no_save_main_session': True}
opts = beam.pipeline.PipelineOptions(flags=[], **options)

# Read from BigQuery, convert each row to a CSV line, write out to GCS
# (`selquery` and `to_csv` are defined elsewhere in the notebook)
p = beam.Pipeline('DataflowRunner', options=opts)
(p
 | 'read' >> beam.io.Read(beam.io.BigQuerySource(query=selquery, use_standard_sql=True))
 | 'csv' >> beam.FlatMap(to_csv)
 | 'out' >> beam.io.Write(beam.io.WriteToText('OUTPUT_DIR/out.csv')))
p.run()
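
For reference, here's the same configuration spelled out through Beam's typed option classes, which makes it easier to spot a mis-set field. This is only a sketch: the gs:// paths are hypothetical placeholders for ours, and note that on the DataflowRunner both staging_location and temp_location have to be gs:// URIs.

import apache_beam as beam
from apache_beam.options.pipeline_options import (
    PipelineOptions, GoogleCloudOptions, SetupOptions, WorkerOptions)

pipeline_options = PipelineOptions(flags=[])

gcp_opts = pipeline_options.view_as(GoogleCloudOptions)
gcp_opts.project = PROJECT
gcp_opts.region = REGION
gcp_opts.job_name = 'dataflow_pipeline_job'
gcp_opts.staging_location = 'gs://my-bucket/staging'  # hypothetical bucket; must be a gs:// URI
gcp_opts.temp_location = 'gs://my-bucket/temp'        # hypothetical bucket; must be a gs:// URI

worker_opts = pipeline_options.view_as(WorkerOptions)
worker_opts.max_num_workers = 3
worker_opts.subnetwork = 'regions/<REGION>/subnetworks/<SUBNETWORK>'

# Roughly the intent of 'no_save_main_session': don't pickle the notebook's globals
pipeline_options.view_as(SetupOptions).save_main_session = False

p = beam.Pipeline('DataflowRunner', options=pipeline_options)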

Error returned from Stackdriver:

Workflow failed. Causes: The Dataflow job appears to be stuck because no worker activity has been seen in the last 1h. You can get help with Cloud Dataflow at https://cloud.google.com/dataflow/support.

Along with the following warning:

S01:eval_out/WriteToText/Write/WriteImpl/DoOnce/Read+out/WriteToText/Write/WriteImpl/InitializeWrite failed.

Unfortunately there isn't much else to go on. Other things to note:

  • The job ran locally without any error
  • The network is the default network, although it's running in custom (rather than auto) subnet mode
  • Python Version == 3.5.6
  • Python Apache Beam version == 2.16.0
  • The AI Platform Notebook is in fact a GCE instance with a Deep Learning VM image deployed on top (running a Container-Optimized OS); we then use port forwarding to access the Jupyter environment
  • The service account submitting the job (the Compute Engine default service account) has the permissions required to run it (a quick write check is sketched after this list)
  • Notebook instance, dataflow job, GCS bucket are all in europe-west1
  • I've also tried running this on a standard AI Platform Notebook and hit the same problem.
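
One way I've been verifying the service-account claim above, sketched with a hypothetical bucket name (substitute the real staging bucket), is to write and delete a test object from inside the notebook with the google-cloud-storage client:

from google.cloud import storage

# Runs as the notebook's service account (the Compute Engine default SA here)
client = storage.Client(project=PROJECT)
bucket = client.bucket('my-staging-bucket')  # hypothetical bucket name
blob = bucket.blob('staging/permission_check.txt')
blob.upload_from_string('ok')  # raises Forbidden if the SA can't write
print('write OK:', blob.exists())
blob.delete()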

Any help would be much appreciated! Please let me know if there's any other info I can provide that would help.


UPDATE: I've realised that my error is the same as the one in the following question:

Why do Dataflow steps not start?

The reason my job gets stuck is that the write-to-GCS step runs first, even though it's meant to run last. Any ideas on how to fix this?
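
For context, the failing stage (WriteImpl/DoOnce/InitializeWrite) appears to be WriteToText's setup step, which creates a temporary directory under the output path before any data is written, so it running early may be expected; what it does need is an output location the workers can actually reach. Here's a sketch of the same sink with a fully-qualified gs:// path (the bucket name is a hypothetical placeholder, and `p`, `selquery`, `to_csv` are the objects from the pipeline above):

# Hypothetical bucket; WriteToText adds shard numbering itself, so the
# extension usually goes in file_name_suffix rather than the prefix
(p
 | 'read' >> beam.io.Read(beam.io.BigQuerySource(query=selquery, use_standard_sql=True))
 | 'csv' >> beam.FlatMap(to_csv)
 | 'out' >> beam.io.WriteToText('gs://my-bucket/output/out',
                                file_name_suffix='.csv',
                                num_shards=1))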

Source: https://stackoverflow.com/questions/58827530/dataflow-no-worker-activity
