How to get word details from TF Vector RDD in Spark ML Lib?
Question: I have created term frequencies using HashingTF in Spark, and I get the term frequency for each word via tf.transform. But the results come back in this format:

([<hashIndexOfHashBucketOfWord1>, <hashIndexOfHashBucketOfWord2>, ...], [termFrequencyOfWord1, termFrequencyOfWord2, ...])

e.g. (1048576, [105, 3116], [1.0, 2.0])

I am able to get a word's index in the hash bucket using tf.indexOf("word"). But how can I get the word back from the index?

Answer 1: Well, you can't. Since hashing is non-injective, there is no inverse function: an unbounded number of distinct tokens can map to the same bucket, so the index alone cannot tell you which word produced it.
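What you can do, if you still have access to the original corpus, is hash every known word yourself and build a reverse lookup from bucket index to the word(s) that land there. The sketch below illustrates the idea in plain Python; `index_of` is a hypothetical stand-in for `tf.indexOf` (the real HashingTF uses a different hash function), and the set-valued buckets make the collision problem visible:

```python
import hashlib
from collections import defaultdict

NUM_FEATURES = 1 << 20  # 1048576, the HashingTF default vector size

def index_of(term, num_features=NUM_FEATURES):
    # Stand-in for tf.indexOf(term): any deterministic hash mod num_features.
    # (HashingTF's actual hash differs; this only illustrates the mechanism.)
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_features

def build_reverse_index(vocabulary):
    # Map each hash-bucket index back to the word(s) that hash into it.
    # A bucket can hold more than one word (collisions), which is exactly
    # why the index -> word mapping cannot be inverted in general.
    reverse = defaultdict(set)
    for word in vocabulary:
        reverse[index_of(word)].add(word)
    return reverse

vocab = ["spark", "hashing", "term", "frequency"]
reverse = build_reverse_index(vocab)
print(reverse[index_of("spark")])  # contains 'spark', plus any colliding words
```

Note this only recovers words you already know about; an unseen token that hashed into the same bucket stays invisible. If you need an exact, invertible index-to-word mapping, a vocabulary-based vectorizer (which stores its vocabulary explicitly) is the usual alternative to feature hashing.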