Python IMAP: =?utf-8?Q? in subject string

忘掉有多难 2020-12-17 10:36

I am displaying new email with IMAP, and everything looks fine, except for one message subject shows as:

=?utf-8?Q?Subject?=

How ca

5 Answers
  • 2020-12-17 11:02

    A high-level IMAP library may be useful here: imap_tools

    from imap_tools import MailBox, AND
    
    # get list of email subjects from INBOX folder
    with MailBox('imap.mail.com').login('test@mail.com', 'pwd', 'INBOX') as mailbox:
        subjects = [msg.subject for msg in mailbox.fetch()]
    
    • Parsed email message attributes
    • Query builder for searching emails
    • Actions with emails: copy, delete, flag, move, seen
    • Actions with folders: list, set, get, create, exists, rename, delete, status
    • No dependencies
  • 2020-12-17 11:05

    In MIME terminology, those encoded chunks are called encoded-words. You can decode them like this:

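    import email.header  # Python 3 name; email.Header was the Python 2 spelling
    # For an encoded-word, Python 3 returns the text as bytes plus its charset name.
    text, encoding = email.header.decode_header('=?utf-8?Q?Subject?=')[0]
    subject = text.decode(encoding)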
    

    Check out the docs for email.header for more details.
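
    As a minimal sketch of the same idea, the standard library's make_header can reassemble the (value, charset) pairs from decode_header into one string, so the bytes/str distinction does not need to be handled by hand. The function name decode_subject is only an illustrative helper, not part of any library.

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Decode a header value that may contain MIME encoded-words."""
    # decode_header splits the value into (value, charset) pairs;
    # make_header rebuilds a Header object, and str() yields the text.
    return str(make_header(decode_header(raw_subject)))


print(decode_subject('=?utf-8?Q?Subject?='))
```

    A header without any encoded-words passes through unchanged, so the helper can be applied to every subject.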

  • 2020-12-17 11:13

    This is a MIME encoded-word. You can parse it with email.header:

    import email.header
    
    def decode_mime_words(s):
        return u''.join(
            word.decode(encoding or 'utf8') if isinstance(word, bytes) else word
            for word, encoding in email.header.decode_header(s))
    
    print(decode_mime_words(u'=?utf-8?Q?Subject=c3=a4?=X=?utf-8?Q?=c3=bc?='))
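
    Real-world mail sometimes declares a charset that Python does not know, or contains bytes that are invalid for the declared charset. The following is a hedged sketch of a more forgiving variant of the function above; the name decode_mime_words_safe and the choice of UTF-8 with replacement characters as the fallback are assumptions, not something the email package prescribes.

```python
import email.header


def decode_mime_words_safe(s, fallback='utf-8'):
    """Decode MIME encoded-words, tolerating unknown or wrong charsets."""
    parts = []
    for word, charset in email.header.decode_header(s):
        if isinstance(word, bytes):
            try:
                # errors='replace' keeps going on invalid byte sequences.
                word = word.decode(charset or fallback, errors='replace')
            except LookupError:
                # The declared charset is not known to Python.
                word = word.decode(fallback, errors='replace')
        parts.append(word)
    return ''.join(parts)


print(decode_mime_words_safe('=?x-unknown-charset?Q?abc?='))
```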
    
  • 2020-12-17 11:21

    Try Imbox, because imaplib is a very low-level library and returns results that are hard to work with.

    Installation

    pip install imbox

    Usage

    from imbox import Imbox
    
    with Imbox('imap.gmail.com',
            username='username',
            password='password',
            ssl=True,
            ssl_context=None,
            starttls=False) as imbox:
    
        all_inbox_messages = imbox.messages()
        for uid, message in all_inbox_messages:
            print(message.subject)  # subject as parsed by Imbox
    
  • 2020-12-17 11:28

    In Python 3.3+, the parsing classes and functions in email.parser automatically decode "encoded words" in headers if their policy argument is set to policy.default.

    >>> import email
    >>> from email import policy
    
    >>> msg = email.message_from_file(open('message.txt'), policy=policy.default)
    >>> msg['from']
    'Pepé Le Pew <pepe@example.com>'
    

    The parsing classes and functions are:

    • email.parser.BytesParser
    • email.parser.Parser
    • email.message_from_bytes
    • email.message_from_binary_file
    • email.message_from_string
    • email.message_from_file

    Confusingly, up to at least Python 3.8, the default policy for these parsing functions is not policy.default, but policy.compat32, which does not decode "encoded words".

    >>> msg = email.message_from_file(open('message.txt'))
    >>> msg['from']
    '=?utf-8?q?Pep=C3=A9?= Le Pew <pepe@example.com>'
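
    Since imaplib hands back message data as bytes, email.message_from_bytes is the natural entry point for the question's IMAP case. Below is a small self-contained sketch comparing the two policies; the RAW message and the subject_of helper are made up for illustration and stand in for whatever a fetch call returns.

```python
import email
from email import policy

# Hypothetical raw message, standing in for bytes fetched over IMAP.
RAW = b"Subject: =?utf-8?Q?Caf=C3=A9?= menu\r\n\r\nbody\r\n"


def subject_of(raw_bytes, decode=True):
    """Return the Subject header, decoded or as stored in the message."""
    # policy.default decodes encoded-words; policy.compat32 leaves them as-is.
    chosen = policy.default if decode else policy.compat32
    msg = email.message_from_bytes(raw_bytes, policy=chosen)
    return str(msg['subject'])


print(subject_of(RAW))                # decoded subject
print(subject_of(RAW, decode=False))  # raw encoded-word form
```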
    