How to read UTF-8 files with Pandas?

傲寒 2020-12-14 08:32

I have a UTF-8 file with Twitter data and I am trying to read it into a pandas DataFrame, but the text columns come back with dtype 'object' instead of unicode strings:

3 Answers
  •  情话喂你
    2020-12-14 09:16

    As the other poster mentioned, you might try:

    df = pd.read_csv('1459966468_324.csv', encoding='utf8')
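A minimal round-trip sketch of the call above, assuming a small made-up file in place of the real Twitter export (the file name `tweets.csv`, the column names, and the sample rows are all hypothetical). It writes UTF-8 text to disk, reads it back with an explicit encoding, and checks that each value comes back as an ordinary Python `str`:

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the Twitter export.
rows = "user,text\nanna,café ☕\n傲寒,你好\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tweets.csv")
    # Write the file as UTF-8 so the read below has something real to decode.
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(rows)
    # 'utf-8' and 'utf8' are aliases for the same codec.
    df = pd.read_csv(path, encoding="utf-8")

# The column dtype may be reported as 'object', but each value is a str.
first_text = df.loc[0, "text"]
print(type(first_text))
print(first_text)
```

If the non-ASCII characters survive the round trip, the encoding was handled correctly, whatever dtype the column reports.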
    

    However, this can still show 'object' when you print the dtypes, because pandas has traditionally stored Python strings in object columns. To confirm that the values are unicode strings, run this line after reading the CSV:

    df.apply(lambda x: pd.api.types.infer_dtype(x.values))
    

    Example output (from Python 2; on Python 3 the text columns are reported as 'string' rather than 'unicode'):

    args            unicode
    date         datetime64
    host            unicode
    kwargs          unicode
    operation       unicode
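The check can also be written against the public `pandas.api.types.infer_dtype` function. The sketch below is only an illustration: the data is made up, the column names are borrowed from the example output, and the `date` column is assumed to be parsed with `parse_dates`.

```python
import io

import pandas as pd
from pandas.api.types import infer_dtype

# Hypothetical log-style data; column names echo the example output above.
csv_text = (
    "date,host,operation\n"
    "2016-04-06,ünïcode-host,读取\n"
    "2016-04-07,example.org,write\n"
)

# Ask pandas to parse the date column so it is not left as plain text.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# infer_dtype inspects the values themselves, not just the column dtype.
inferred = {col: infer_dtype(df[col]) for col in df.columns}
print(inferred)
```

On Python 3 the text columns should come back as `string`, which is the counterpart of the `unicode` label in the older output.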
    
