Changing charset when retrieving messages from mail server!

时间秒杀一切 提交于 2019-12-24 11:36:04

问题


i'm currently creating a little mail client and facing a problem with charset. I use indy's TIdIMAP4 component to retrieve data from mail-server. When i try to retrieve mail bodies then accent letters like ä, ü etc are converted to =E4, =FC respectively as it is using charset ISO-8859-1.

Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: quoted-printable

How can i make server to send me data in another charset, like utf-8? What would be the best solution for that problem?

Thanks in advance!


回答1:


It is not the charset that is producing strings like =E4 and =FC, it is the Content-Transfer-Encoding instead. $E4 and $FC are the binary representations of ä and ü in ISO-8859-1, but they are 8-bit values. Email is still largely a 7-bit environment. Unless both clients and servers negotiate 8-bit transfers during their communications, then byte octets above $7F have to be encoded in a 7-bit compatible manner to pass through email gateways safely, especially legacy ones that still exist. quoted-printable is a commonly used 7-bit byte encoding in email for textual content. base64 is another one, but it is not human-readible so it tends to be used for binary data instead of textual data (though it can be used for text).

In any case, you cannot make the server deliver the email data to you in another encoding. The server is merely delivering the original email data as-is that was originally delivered to it by the sender. If you want the data in UTF-8, then you have to re-encode it yourself after downloading it. Indy will handle the decoding for you.



来源:https://stackoverflow.com/questions/5584204/changing-charset-when-retrieving-messages-from-mail-server

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!