ImportError: No module named numpy on spark workers

抹茶落季 2020-12-05 03:12

Launching pyspark in client mode: bin/pyspark --master yarn-client --num-executors 60. Importing numpy in the shell works fine, but it fails in the kmeans. Someho

6 answers
  •  温柔的废话
    2020-12-05 03:51

    You need numpy installed on every worker node, and on the master as well (depending on where your components are placed). The import succeeds in the shell because it runs on the driver, while the kmeans code runs on the executors.

    Also make sure to run pip install numpy from a root account (sudo does not suffice) after setting the umask to 022 (umask 022), so that the installed files are readable by the Spark (or Zeppelin) user.
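One way to find out which nodes are affected is to run a small probe job from the pyspark shell. The following is only a minimal sketch: the function names `probe_numpy` and `hosts_missing_numpy` and the `num_slices` default are made up for illustration, and `sc` is assumed to be the SparkContext that the shell already provides. Spark does not guarantee that a task lands on every executor, so an empty result suggests, but does not prove, that all workers are fine.

```python
import socket


def probe_numpy(_partition):
    """Yield this host's name and its numpy version (None if the import fails)."""
    try:
        import numpy
        version = numpy.__version__
    except ImportError:
        version = None
    yield (socket.gethostname(), version)


def hosts_missing_numpy(sc, num_slices=200):
    """Return the sorted hostnames on which a task could not import numpy.

    `sc` is an existing SparkContext; `num_slices` is a hypothetical default,
    chosen larger than the executor count so tasks spread across the cluster.
    """
    results = (
        sc.parallelize(range(num_slices), num_slices)
        .mapPartitions(probe_numpy)  # one probe result per partition
        .distinct()                  # collapse repeats from the same host
        .collect()
    )
    return sorted(host for host, version in results if version is None)
```

In the shell, `hosts_missing_numpy(sc)` would then list the workers that still need the pip install described above.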
