How to strip color codes used by mIRC users?

小鲜肉 2020-12-16 06:15

I'm writing an IRC bot in Python using irclib and I'm trying to log the messages on certain channels.
The issue is that some mIRC users and some bots write using color

6 answers
  •  温柔的废话
    2020-12-16 07:11

    Regular expressions are your cleanest bet in my opinion. If you haven't used them before, this is a good resource. For the full details on Python's regex library, go here.

    import re
    # Raw string so that \d reaches the regex engine untouched; re itself understands \x03.
    regex = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?", re.UNICODE)
    

    The regex searches for ^C (which is \x03 in ASCII; you can confirm this by evaluating chr(3) in the Python interpreter), then optionally one or two [0-9] characters, optionally followed by a comma and another one or two [0-9] characters.

    (?: ... ) groups without storing what was matched inside the parentheses (we don't need to backreference it), ? means to match 0 or 1 of the preceding item, and {n,m} means to match n to m of the preceding item. Finally, \d matches a digit: [0-9] for our purposes, although with re.UNICODE it also matches other Unicode decimal digits.

    The rest can be decoded using the links I refer to above.
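    To make the pieces described above easier to see, here is a sketch of the same pattern written with re.VERBOSE, which lets each part carry its own comment. The name verbose_regex is chosen just for this illustration; it should behave the same as the one-line version.

```python
import re

# Same pattern as the one-liner, spread out so each piece can be annotated.
# `verbose_regex` is a name made up for this sketch.
verbose_regex = re.compile(
    r"""
    \x03              # ^C, the control character that starts a colour code
    (?:               # non-capturing group: the colour numbers
        \d{1,2}       #   foreground colour, one or two digits
        (?:           #   non-capturing group: the background part
            ,\d{1,2}  #     a comma, then one or two digits
        )?            #   the background part is optional
    )?                # the numbers are optional too, so a bare ^C also matches
    """,
    re.VERBOSE,
)

sample = "blabla \x035,12to be colored text and background\x03 blabla"
cleaned = verbose_regex.sub("", sample)
```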

    >>> regex.sub("", "blabla \x035,12to be colored text and background\x03 blabla")
    'blabla to be colored text and background blabla'
    

    chaos' solution is similar, but it may end up eating more than the maximum of two digits, and it will not remove any loose ^C characters that may be hanging about (such as the one that closes the colour command).
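    The two edge cases mentioned here (text that begins with digits right after a colour code, and a loose ^C) can be checked with a small helper. This is only a sketch: COLOR_RE and strip_colors are names invented for the example, not part of irclib.

```python
import re

# Hypothetical helper for a logging bot; not an irclib API.
COLOR_RE = re.compile(r"\x03(?:\d{1,2}(?:,\d{1,2})?)?")

def strip_colors(message):
    """Return message with mIRC colour codes removed."""
    return COLOR_RE.sub("", message)

# Colour 04 followed by text that itself starts with digits:
# only the first two digits are treated as the colour number.
dated = strip_colors("\x03" + "04" + "2020-12-16 log entry")

# A bare ^C that closes a colour run is removed as well.
closed = strip_colors("\x034red\x03 plain")
```

    In a bot, a function like this would be applied to each incoming message just before it is written to the log.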
