Find top n terms with highest TF-IDF score per class

南笙 2021-01-23 05:59

Let's suppose that I have a pandas dataframe with two columns, resembling the following:

    text                                label
0         


        
3 answers
  •  刺人心 (original poster)
    2021-01-23 06:13

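    import pandas as pd

    # term_doc_mat is assumed to be a DataFrame of TF-IDF scores:
    # one row per document, one column per term.
    top_terms = pd.DataFrame(columns=range(1, 6))

    for i in term_doc_mat.index:
        # Sort this document's scores in descending order and keep the 5 best terms.
        top_terms.loc[len(top_terms)] = term_doc_mat.loc[i].sort_values(ascending=False).index[:5].tolist()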

    This will give you the top 5 terms for each document. To return a different number of terms, change the 5 in the slice and the matching column range.
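
The loop above ranks terms per document, whereas the question asks for the top terms per class. One possible way to bridge the gap, sketched here under assumptions, is to average each term's TF-IDF score over the documents of a class and then rank the averages. The matrix values, the `labels` series, and the helper name `top_terms_per_class` are hypothetical and exist only for illustration; the original data is not shown in the thread.

```python
import pandas as pd

# Hypothetical TF-IDF matrix: one row per document, one column per term.
# The scores are made up for illustration only.
term_doc_mat = pd.DataFrame(
    {
        "cat":   [0.9, 0.7, 0.0, 0.0],
        "dog":   [0.1, 0.3, 0.0, 0.2],
        "stock": [0.0, 0.0, 0.8, 0.6],
        "bank":  [0.0, 0.1, 0.5, 0.7],
    }
)
# Hypothetical class label of each document, aligned with the matrix rows.
labels = pd.Series(["pets", "pets", "finance", "finance"], name="label")


def top_terms_per_class(tfidf, labels, n=5):
    """Return {class label: the n terms with the highest mean TF-IDF score}."""
    # Average each term's score over the documents belonging to each class.
    class_scores = tfidf.groupby(labels).mean()
    # For each class, keep the n highest-scoring terms, best first.
    return {
        label: row.nlargest(n).index.tolist()
        for label, row in class_scores.iterrows()
    }


result = top_terms_per_class(term_doc_mat, labels, n=2)
print(result)
```

Averaging is only one choice of aggregation. Summing the scores, or concatenating all documents of a class into one text before computing TF-IDF, would also give a per-class ranking and may order the terms differently.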
