Parsing email contents from poplib with the email module (Python)

Submitted by 依然范特西╮ on 2019-12-08 03:25:40

Question


Python version: 3.5

Code:

import getpass, poplib, email
Mailbox = poplib.POP3_SSL('pop.googlemail.com', '995')
Mailbox.user("email_here@gmail.com")
Mailbox.pass_('password_here')
numMessages = len(Mailbox.list()[1])
for i in range(numMessages):
    info  = b" ".join(Mailbox.retr(i+1)[1])
    msg = email.message_from_bytes(info)
    print(msg.keys())

Output:

['MIME-Version']
['MIME-Version']
['MIME-Version']
['Delivered-To']
['Delivered-To']
['Delivered-To']
['Delivered-To']
['Delivered-To']
['Delivered-To']
['Delivered-To']
['Delivered-To']

The output is not what I expected: each message should contain many more header fields than just "MIME-Version" or "Delivered-To".

As I understand it, email.message_from_bytes() parses the contents of a byte string. Is the value I pass to it not a byte string?

The docs recommend this:

M = poplib.POP3('localhost')
M.user(getpass.getuser())
M.pass_(getpass.getpass())
numMessages = len(M.list()[1])
for i in range(numMessages):
    for j in M.retr(i+1)[1]:
        print(j)

Is there a way to parse the retrieved message with the email module, so that details such as the sender, body, and headers can be stored?


Answer 1:


The answer turned out to be fairly easy:

import poplib, email

Mailbox = poplib.POP3_SSL('pop.googlemail.com', 995)  # port as an integer
Mailbox.user("email_here@gmail.com")
Mailbox.pass_('password_here')
numMessages = len(Mailbox.list()[1])
for i in range(numMessages):
    # retr() returns (response, lines, octets); the lines carry no line endings
    raw_email = b"\n".join(Mailbox.retr(i+1)[1])
    parsed_email = email.message_from_bytes(raw_email)
    print(parsed_email.keys())
Mailbox.quit()  # end the POP3 session cleanly

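Instead of joining the lines with a space, join them with \n. retr() returns the message as a list of lines with the line endings stripped, so a space-joined message collapses into one long line and the parser sees only the first header. Joined with \n, the email module can parse the fields correctly.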
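To see the effect of the separator without connecting to a mail server, here is a minimal sketch. The `lines` list is a made-up stand-in for what `Mailbox.retr(n)[1]` returns; the addresses and header values are assumptions for illustration only.

```python
import email

# Hypothetical sample of what Mailbox.retr(n)[1] returns: a list of
# byte strings, one per line, with the line endings already stripped.
lines = [
    b"Delivered-To: someone@example.com",
    b"From: Alice <alice@example.com>",
    b"Subject: Hello",
    b"",
    b"Body text.",
]

# Joined with a space, the whole message becomes a single header line.
space_joined = email.message_from_bytes(b" ".join(lines))

# Joined with a newline, every header sits on its own line again.
newline_joined = email.message_from_bytes(b"\n".join(lines))

print(space_joined.keys())    # only the first header is recognised
print(newline_joined.keys())  # all three headers are recognised
```

In the space-joined case everything after "Delivered-To:" ends up as the value of that one header, which matches the single-key output shown in the question.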

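Another nice thing about the email module: email.message_from_bytes() returns an email.message.Message object, which is not a dict but supports dict-style access to the headers.

So you can access a field like this (replace "header" with a real header name such as "Subject"):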

raw_email  = b"\n".join(Mailbox.retr(i+1)[1])
parsed_email = email.message_from_bytes(raw_email)
print(parsed_email["header"])

But what if the field doesn't exist?

raw_email  = b"\n".join(Mailbox.retr(i+1)[1])
parsed_email = email.message_from_bytes(raw_email)
print(parsed_email["non-existent field"])

The code above prints None rather than raising a KeyError.
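The question also asked about storing the sender and the body. One possible way to do that with the same Message object is sketched below. The helper name `summarize`, the dictionary keys, the fallback strings, and the sample message are all assumptions made for this example, not part of the original answer. It only looks for the first text/plain part and does not decode RFC 2047 encoded header values.

```python
import email

def summarize(lines):
    """Return sender, subject and plain-text body for one retrieved message.

    `lines` is expected to look like Mailbox.retr(n)[1]: a list of bytes.
    """
    msg = email.message_from_bytes(b"\n".join(lines))
    body = ""
    # walk() yields the message itself and, for multipart mail, each sub-part.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # bytes, transfer encoding undone
            charset = part.get_content_charset() or "utf-8"
            body = payload.decode(charset, errors="replace")
            break
    return {
        # get() takes a fallback value, used when the header is missing
        "sender": msg.get("From", "(unknown)"),
        "subject": msg.get("Subject", "(no subject)"),
        "reply_to": msg.get("Reply-To", "(none)"),
        "body": body,
    }

# Hypothetical message used only to exercise the helper.
sample_lines = [
    b"From: Alice <alice@example.com>",
    b"Subject: Hello",
    b"Content-Type: text/plain; charset=utf-8",
    b"",
    b"First line of the body.",
    b"Second line.",
]

result = summarize(sample_lines)
print(result["sender"])
print(result["subject"])
print(result["reply_to"])
print(result["body"])
```

Inside the loop from the answer, this would be called as `summarize(Mailbox.retr(i+1)[1])`. Using `get()` with a fallback avoids having to check for None after every header lookup.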



Source: https://stackoverflow.com/questions/35679338/parsing-email-contents-from-poplib-with-email-module-python
