pickle.PicklingError: args[0] from __newobj__ args has the wrong class with hadoop python

死守一世寂寞 2021-01-13 08:44

I am trying to delete stop words via Spark. The code is as follows:

from nltk.corpus import stopwords
from pyspark.context import SparkContext
fro         


        
3 answers
  •  萌比男神i
    2021-01-13 09:18

    This has to do with how the stopwords module is uploaded. As a workaround, import the stopwords library inside the function itself. Please see the similar issue linked below. I had the same issue and this workaround fixed the problem.

        def stopwords_delete(word_list):
            # Import inside the function rather than at module level.
            from nltk.corpus import stopwords
            filtered_words = []
            print(word_list)
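
Since the snippet above is cut off, here is a minimal sketch of how the deferred-import workaround might look in full. The filtering logic, the "english" word list, and the optional stop_words parameter are assumptions added for illustration and are not part of the original answer. It also assumes the NLTK stopwords corpus is available on every worker node.

```python
def stopwords_delete(word_list, stop_words=None):
    """Return word_list without stop words (illustrative sketch)."""
    if stop_words is None:
        # Deferred import: nltk is only imported when the function runs,
        # so the module-level object is never pickled with the closure.
        from nltk.corpus import stopwords
        # Assumption: English stop words; requires the NLTK corpus data.
        stop_words = set(stopwords.words("english"))
    # Case-insensitive match against the stop word set.
    return [w for w in word_list if w.lower() not in stop_words]
```

With an RDD of token lists this could be applied as rdd.map(stopwords_delete). Passing stop_words explicitly, for example stopwords_delete(["The", "cat"], stop_words={"the"}), skips the NLTK import entirely.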
    

    Similar Issue

    As a permanent fix, I would recommend StopWordsRemover (from pyspark.ml.feature import StopWordsRemover).
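
A rough sketch of the StopWordsRemover route, assuming the data is already a DataFrame with a column of token arrays. The helper name remove_stop_words and the column names "words" and "filtered" are hypothetical and not taken from the original post.

```python
def remove_stop_words(df, input_col="words", output_col="filtered"):
    """Add output_col holding input_col with stop words removed (sketch)."""
    # Imported here so the helper can be defined without pyspark installed.
    from pyspark.ml.feature import StopWordsRemover

    # input_col must be an array<string> column; with no stopWords argument
    # the transformer falls back to its default English list.
    remover = StopWordsRemover(inputCol=input_col, outputCol=output_col)
    return remover.transform(df)
```

Because the filtering runs inside Spark's own transformer, no NLTK objects need to be serialized and sent to the workers.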
