Can we deduplicate emails retrieved through IMAP by hash?

Posted by 邮差的信 on 2019-12-12 00:01:27

Question


I'm trying to achieve at-most-once processing of email messages retrieved over IMAP. (I asked a related question about it.)

Is it reliable to deduplicate MIME messages retrieved over IMAP by computing a cryptographic hash of each one?

In other words, why would the same email produce different content when retrieved over IMAP multiple times? Can an email change its contents, for example when it's moved across folders, marked as read, or for some other reason?

I'm using hMailServer on Windows with MailKit (.NET) as the client, though I'm not sure that matters.


Answer 1:


Many mailing lists append a footer, so a mail sent both to me and to a list arrives as two copies with different content, and therefore two different hashes.

Most people consider this to be one message.
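To illustrate the footer problem, here is a minimal Python sketch (not MailKit code). The raw messages, the address, and the footer text are all made up for the example; it only shows that hashing the full raw bytes treats the two copies as different messages.

```python
import hashlib

# Hypothetical raw message as delivered directly to the recipient.
direct_copy = (
    b"Message-ID: <abc123@example.org>\r\n"
    b"Subject: Hello\r\n"
    b"\r\n"
    b"Body text.\r\n"
)

# The same mail as delivered via a (hypothetical) list that appends a footer.
list_copy = direct_copy + b"-- \r\nYou are subscribed to example-list.\r\n"


def sha256_hex(raw: bytes) -> str:
    """Return the SHA-256 digest of the raw message bytes as hex."""
    return hashlib.sha256(raw).hexdigest()


# True when the two copies hash differently, i.e. a hash-based
# deduplicator would process both of them.
hashes_differ = sha256_hex(direct_copy) != sha256_hex(list_copy)
print(hashes_differ)
```

Any byte-level difference between copies defeats a whole-message hash, even when a human reader would call them the same mail.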

I suggest using the Message-ID header field for at-most-once processing. As far as I can tell, it has been reliably unique for the last ten years (the last collision I saw was from around 2000).
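A rough sketch of Message-ID based deduplication, written in Python with the standard `email` module rather than MailKit. The class name `MessageIdDeduplicator`, the in-memory set, and the fallback to a SHA-256 hash for messages that lack a Message-ID are all assumptions made for illustration, not something the answer prescribes.

```python
import hashlib
from email import message_from_bytes


def dedup_key(raw: bytes) -> str:
    """Return the Message-ID if present, else a hash of the raw bytes.

    The hash fallback is an assumption for this sketch; a real system
    might prefer to reject or quarantine messages without a Message-ID.
    """
    msg = message_from_bytes(raw)
    value = msg.get("Message-ID")  # header lookup is case-insensitive
    if value is None or not str(value).strip():
        return "sha256:" + hashlib.sha256(raw).hexdigest()
    return "mid:" + str(value).strip()


class MessageIdDeduplicator:
    """Hypothetical helper that remembers which keys were already seen."""

    def __init__(self) -> None:
        self._seen = set()

    def should_process(self, raw: bytes) -> bool:
        """Return True the first time a key is seen, False afterwards."""
        key = dedup_key(raw)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


# Usage with made-up messages: a direct copy and a list copy with a footer.
direct_copy = b"Message-ID: <abc123@example.org>\r\nSubject: Hi\r\n\r\nBody.\r\n"
list_copy = direct_copy + b"-- \r\nList footer.\r\n"

dedup = MessageIdDeduplicator()
first = dedup.should_process(direct_copy)
second = dedup.should_process(list_copy)
print(first, second)
```

The set here lives only in memory. For real at-most-once processing the seen keys would have to be stored durably and recorded before the message is processed, so that a crash cannot lead to a second run.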



Source: https://stackoverflow.com/questions/37774853/can-we-deduplicate-emails-retrieved-through-imap-by-hash
