pandas split list into columns with regex

礼貌的吻别 2020-12-11 17:40

I have a string list:

content
01/09/15, 10:07 - message1
01/09/15, 10:32 - message2
01/09/15, 10:44 - message3

I want a data frame, like:

2 Answers
  • 2020-12-11 17:52

    You can use str.extract; named groups in the pattern become the column names.

    In [5827]: df['content'].str.extract(r'(?P<date>[\s\S]+) - (?P<message>[\s\S]+)',
                                         expand=True)
    Out[5827]:
                  date   message
    0  01/09/15, 10:07  message1
    1  01/09/15, 10:32  message2
    2  01/09/15, 10:44  message3
    

    Details

    In [5828]: df
    Out[5828]:
                          content
    0  01/09/15, 10:07 - message1
    1  01/09/15, 10:32 - message2
    2  01/09/15, 10:44 - message3
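
    As a self-contained sketch of the same str.extract approach, the snippet below rebuilds the sample frame and uses a stricter, anchored pattern. The stricter pattern is an assumption, not part of the answer above: it supposes every row starts with a "dd/dd/dd, dd:dd" timestamp, so the date group cannot swallow part of a message that itself contains " - ".

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "content": [
        "01/09/15, 10:07 - message1",
        "01/09/15, 10:32 - message2",
        "01/09/15, 10:44 - message3",
    ]
})

# Assumed pattern: two-digit fields in the timestamp, then " - ", then the rest.
# Named groups (date, message) become the output column names.
pattern = r"^(?P<date>\d{2}/\d{2}/\d{2}, \d{2}:\d{2}) - (?P<message>.*)$"

parsed = df["content"].str.extract(pattern, expand=True)
print(parsed)
```

    Rows that do not match the pattern come back as NaN in both columns, which makes malformed lines easy to spot.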
    
  • 2020-12-11 17:57

    Use str.split with the regex \s+-\s+, where \s+ matches one or more whitespace characters:

    df[['date','message']] = df['content'].str.split(r'\s+-\s+', expand=True)
    print(df)
                          content             date   message
    0  01/09/15, 10:07 - message1  01/09/15, 10:07  message1
    1  01/09/15, 10:32 - message2  01/09/15, 10:32  message2
    2  01/09/15, 10:44 - message3  01/09/15, 10:44  message3
    

    If you also need to remove the content column, use DataFrame.pop:

    df[['date','message']] = df.pop('content').str.split(r'\s+-\s+', expand=True)
    
    print(df)
                  date   message
    0  01/09/15, 10:07  message1
    1  01/09/15, 10:32  message2
    2  01/09/15, 10:44  message3
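
    One case the split above does not cover is a message that itself contains " - ": the split would then produce more than two pieces and the two-column assignment would fail. A minimal sketch of one way to guard against that, assuming only the first separator should count, is to pass n=1. The third row below is hypothetical and was added only to show the effect.

```python
import pandas as pd

df = pd.DataFrame({
    "content": [
        "01/09/15, 10:07 - message1",
        "01/09/15, 10:32 - message2",
        # Hypothetical row, not from the question: the message contains " - ".
        "01/09/15, 10:44 - see you - later",
    ]
})

# n=1 limits the split to the first match of the separator,
# so each row yields exactly two pieces: date and message.
df[["date", "message"]] = df.pop("content").str.split(r"\s+-\s+", n=1, expand=True)
print(df)
```

    With n=1 the remainder of the line, including any later " - ", stays in the message column.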
    