Dataflow: streaming Windmill RPC errors for a stream

Submitted by 两盒软妹~` on 2021-02-11 12:35:38

Question


My Beam Dataflow pipeline tries to read data from GCS and write it to Pub/Sub.

However, the pipeline hangs with the following error:

{
  job: "2019-11-04_03_53_38-5223486841492484115"   
  logger: "org.apache.beam.runners.dataflow.worker.windmill.GrpcWindmillServer"   
  message: "20 streaming Windmill RPC errors for a stream, last was: org.apache.beam.vendor.grpc.v1p21p0.io.grpc.StatusRuntimeException: ABORTED: The operation was aborted. with status Status{code=ABORTED, description=The operation was aborted., cause=null}"   
  thread: "36"   
  worker: "gcs-to-pubsub-job14-11040353-a72j-harness-xrg3"   
 }

What causes this error, and how can it be fixed?

The firewall rule is configured as follows:

gcloud compute firewall-rules create data-flow-test-firewall \
    --network dataflow-test \
    --action allow \
    --direction ingress \
    --target-tags dataflow \
    --source-tags dataflow \
    --priority 0 \
    --rules tcp:12345-12346
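
For reference, the rule created above can be inspected to confirm that its tags and port range match what the Dataflow workers expect. This is a sketch, assuming the `gcloud` CLI is authenticated against the same project:

```shell
# Describe the rule created above and check that the target/source tags
# and the TCP port range 12345-12346 (used for worker-to-worker traffic)
# are what was intended.
gcloud compute firewall-rules describe data-flow-test-firewall \
    --format="yaml(name, network, targetTags, sourceTags, allowed)"
```

Dataflow worker VMs are automatically given the `dataflow` network tag, so a tag-based rule like this one should apply to them without extra configuration.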

and the Dataflow start parameters are:

-Dexec.mainClass=com.beam.test.beamPubSubV2 -Dexec.args="--project=pid  
--runner=DataflowRunner --stagingLocation=gs://bucket/stage/ 
--tempLocation=gs://bucket/temp/ --jobName=gcs-to-pubsub-job14 
--network=dataflow-test  --enableStreamingEngine --maxNumWorkers=15 
--autoscalingAlgorithm=THROUGHPUT_BASED" -Pdataflow-runner
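
For context, these `-Dexec.*` flags are normally passed to the Maven exec plugin; the full invocation presumably looks like the sketch below (the `...` stands for the pipeline arguments shown above, which are not repeated here):

```shell
# Standard Beam quickstart-style launch via the Maven exec plugin,
# activating the dataflow-runner profile from the project's pom.xml.
mvn compile exec:java \
    -Dexec.mainClass=com.beam.test.beamPubSubV2 \
    -Dexec.args="--project=pid --runner=DataflowRunner ..." \
    -Pdataflow-runner
```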

Beam version: 2.16.0

Source: https://stackoverflow.com/questions/58693680/dataflow-streaming-windmill-rpc-errors-for-a-stream
