pandas read_csv() for multiple delimiters

日久生厌 2020-12-16 17:36

I have a file with data like the following:

1000000 183:0.6673;2:0.3535;359:0.304;363:0.1835
1000001 92:1.0
1000002 112:1.0
1000003 154435:0.746;30:0.3902;220:0.         


        
1 answer
  • 2020-12-16 17:51

    Based on the question "Handling Variable Number of Columns with Pandas - Python", one workaround for "pandas.errors.ParserError: Expected 29 fields in line 11, saw 45." is to tell read_csv in advance how many columns to expect, by passing explicit column names.

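    import pandas as pd

    # filepath is the directory prefix from the asker's own setup
    my_cols = [str(i) for i in range(45)]  # placeholder column names "0".."44"
    df_user_key_word_org = pd.read_csv(filepath + "user_key_word.txt",
                                       sep=r"\s+|;|:",  # split on whitespace, ";" or ":"
                                       names=my_cols,
                                       header=None,
                                       engine="python")  # regex separators need the python engine
    # I tested with s = StringIO(text_from_OP) on my computer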
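
If you would rather not hard-code 45, here is a minimal self-contained sketch of the same idea. It assumes the data fits in memory and uses only the first three sample rows from the question (the fourth row is cut off in the post). The names `text`, `SEP` and `n_cols` are made up for this illustration; the column count is taken from the widest row rather than guessed.

```python
import re
from io import StringIO

import pandas as pd

# First three sample rows from the question (the fourth is cut off in the post).
text = (
    "1000000 183:0.6673;2:0.3535;359:0.304;363:0.1835\n"
    "1000001 92:1.0\n"
    "1000002 112:1.0\n"
)

SEP = r"\s+|;|:"  # whitespace, ";" or ":" all act as delimiters

# The widest row decides how many column names read_csv needs.
n_cols = max(len(re.split(SEP, line.strip())) for line in text.splitlines())
my_cols = [str(i) for i in range(n_cols)]

df = pd.read_csv(StringIO(text), sep=SEP, names=my_cols, header=None, engine="python")

print(df.shape)  # (3, 9)
```

Rows shorter than the widest one are padded with NaN, so columns "3" through "8" are empty for the second and third rows. For a real file, the same counting pass could be done by iterating over the open file once before calling read_csv.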
    

    Hope this works.
