regex to extract mentions in Twitter

逝去的感伤 2020-12-22 06:04

I need to write a regex in Python to extract mentions from tweets.

My attempt:

regex=re.compile(r"(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Z


        
1 Answer
  •  半阙折子戏
    2020-12-22 06:51

    Add an underscore to the last set like this:

    (?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9_]+)
    

    Regex101 Demo
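
As a rough illustration of how the corrected pattern above might be used from Python, here is a minimal sketch. The helper name `extract_mentions` and the sample tweet are made up for this example. Note that the pattern requires a handle to start with a letter and to be at least two characters long.

```python
import re

# Pattern copied from the answer above. A mention must sit at the start of the
# string or follow a character outside [a-zA-Z0-9-_.]; the handle must begin
# with a letter and continue with letters, digits, or underscores.
MENTION_RE = re.compile(r"(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9_]+)")


def extract_mentions(text):
    """Return the handles (without the leading @) found in text.

    Hypothetical helper, not part of the original answer.
    """
    return MENTION_RE.findall(text)


sample = "@john_doe hi @alice99, mail foo@bar.com"  # made-up sample text
print(extract_mentions(sample))  # ['john_doe', 'alice99']
```

The `@` inside `foo@bar.com` is skipped because it directly follows a letter, which is the job of the lookbehind.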

    As a side note, Twitter's handle rules also allow usernames that start with digits or underscores. So a regex for extracting Twitter handles could be as simple as @\w{1,15}, which allows letters, digits, and underscores and enforces the 15-character limit. Depending on where the regex is used, it will need some additional lookaheads or lookbehinds.
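
One possible way to wrap `@\w{1,15}` in lookarounds is sketched below. The specific lookbehind and lookahead are assumptions chosen for illustration, not something the answer prescribes, and `extract_handles` is a hypothetical helper name. The `re.ASCII` flag is used because in Python 3 `\w` otherwise also matches non-ASCII letters and digits.

```python
import re

# Assumed lookarounds (not from the answer):
#   (?<![\w.-])  rejects an @ that follows a word character, '.' or '-',
#                which skips things like e-mail addresses;
#   (?!\w)       rejects candidates longer than 15 characters instead of
#                silently truncating them.
# re.ASCII limits \w to [A-Za-z0-9_].
HANDLE_RE = re.compile(r"(?<![\w.-])@(\w{1,15})(?!\w)", re.ASCII)


def extract_handles(text):
    """Return the handles (without the leading @) found in text."""
    return HANDLE_RE.findall(text)


sample = "@_under9 and @123abc, mail foo@bar.com"  # made-up sample text
print(extract_handles(sample))  # ['_under9', '123abc']
```

Without the trailing `(?!\w)`, a string such as `@waytoolonghandle_exceeds` would yield its first 15 characters as a handle; with it, the overlong candidate is dropped entirely. Which behavior is preferable depends on the use case.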
