“missing word in phrase: charset not supported”, when using the mail package

走远了吗. Posted on 2019-12-01 00:52:47

I hope this helps someone who may be considering Go for processing email (i.e., developing client apps). It seems the Go standard library is not mature enough for email processing: it doesn't handle multipart messages, different character sets, etc. After almost a day of trying different hacks and packages, I decided to throw the Go code away and use a good old JavaMail solution.

Alexey Vasiliev's MIT-licensed http://github.com/le0pard/go-falcon/ includes a parser package that applies whichever encoding package is needed to decode the headers (the meat is in utils.go).

package main

import (
        "bufio"
        "bytes"
        "fmt"
        "net/textproto"
        "github.com/le0pard/go-falcon/parser"
)

var msg = []byte(`Subject: =?gb18030?B?u9i4tKO6ILvYuLSjulBhbGFjZSBXZXN0bWluc3Rl?=
 =?gb18030?B?cjogMDEtMDctMjAxNCAtIDA0LTA3LTIwMTQ=?=

`)


func main() {
        tpr := textproto.NewReader(bufio.NewReader(bytes.NewBuffer(msg)))
        mh, err := tpr.ReadMIMEHeader()
        if err != nil {
                panic(err)
        }
        for name, vals := range mh {
                for _, val := range vals {
                        val = parser.MimeHeaderDecode(val)
                        fmt.Print(name, ": ", val, "\n")
                }
        }
}

It looks like the package also uses parser.FixEncodingAndCharsetOfPart to decode and convert body content, though at the cost of a couple of extra allocations from converting the []byte body to and from a string. If the API doesn't work for you, you can at least read the code to see how it can be done.

Found via godoc.org's "...and is imported by 3 packages" link from encoding/simplifiedchinese -- hooray godoc.org!

I've been using github.com/jhillyerd/enmime, which seems to have no trouble with this. It parses both headers and body content. Given an io.Reader r:

// Parse the message. Errors are ignored here for brevity; check them in real code.
env, _ := enmime.ReadEnvelope(r)
// Headers can be retrieved via Envelope.GetHeader(name).
fmt.Printf("From: %v\n", env.GetHeader("From"))
// Address-type headers can be parsed into a list of decoded mail.Address structs.
alist, _ := env.AddressList("To")
for _, addr := range alist {
  fmt.Printf("To: %s <%s>\n", addr.Name, addr.Address)
}
fmt.Printf("Subject: %v\n", env.GetHeader("Subject"))

// The plain text body is available as env.Text.
fmt.Printf("Text Body: %v chars\n", len(env.Text))

// The HTML body is stored in env.HTML.
fmt.Printf("HTML Body: %v chars\n", len(env.HTML))

// env.Inlines is a slice of inlined attachments.
fmt.Printf("Inlines: %v\n", len(env.Inlines))

// env.Attachments contains the non-inline attachments.
fmt.Printf("Attachments: %v\n", len(env.Attachments))