Aparently, encoding japanese emails is somewhat challenging, which I am slowly discovering myself. In case there are any experts (even those with limited experience will do)
something like this should get the job done in python:
#!/usr/bin/python
# -*- mode: python; coding: utf-8 -*-
import smtplib
from email.MIMEText import MIMEText
from email.Header import Header
from email.Utils import formatdate
def send_from_gmail( from_addr, to_addr, subject, body, password, encoding="iso-2022-jp" ):
msg = MIMEText(body.encode(encoding), 'plain', encoding)
msg['Subject'] = Header(subject.encode(encoding), encoding)
msg['From'] = from_addr
msg['To'] = to_addr
msg['Date'] = formatdate()
s = smtplib.SMTP('smtp.gmail.com', 587)
s.ehlo(); s.starttls(); s.ehlo()
s.login(from_addr, password)
s.sendmail(from_addr, to_addr, msg.as_string())
s.close()
return "Sent mail to: %s" % to_addr
if __name__ == "__main__":
import sys
for n,item in enumerate(sys.argv):
sys.argv[n] = sys.argv[n].decode("utf8")
if len(sys.argv)==6:
print send_from_gmail( sys.argv[1], sys.argv[2], sys.argv[3], sys.argv[4], sys.argv[5] )
elif len(sys.argv)==7:
print send_from_gmail( sys.argv[1], sys.argv[2], sys.argv[3], sys.argv[4], sys.argv[5], encoding=sys.argv[6] )
else:
raise "SYNTAX: %s <from_addr> <to_addr> <subject> <body> <password> [encoding]"
**blatantly stolen/adapted from:
http://mtokyo.blog9.fc2.com/blog-entry-127.html
I have some experience composing and sending email in japanese...Normally you have to beware what encoding used for operating system and how you store your japanese strings! My Mail objects are normally encoded as follows:
string s = "V‚µ‚¢ŠwK–@‚Ì‚²’ñˆÄ"; // Our japanese are shift-jis encoded, so it appears like garbled
MailMessage message = new MailMessage();
message.BodyEncoding = Encoding.GetEncoding("iso-2022-jp");
message.SubjectEncoding = Encoding.GetEncoding("iso-2022-jp");
message.Subject = s.ToEncoding(Encoding.GetEncoding("Shift-Jis")); // Change the encoding to whatever your source is
message.Body = s.ToEncoding(Encoding.GetEncoding("Shift-Jis")); // Change the encoding to whatever your source is
Then i have an extension method to which does the conversion for me:
public static string ToEncoding(this string s, Encoding targetEncoding)
{
return s == null ? null : targetEncoding.GetString(Encoding.GetEncoding(1252).GetBytes(s)); //1252 is the windows OS codepage
}
Introduction of Japanese encoding to e-mail happened at JUNET(UUCP based nation-wide network) in early 90's.
At that time, RFC1468 was defined. If you follow RFC1468 in plain text mail, there would be no problem.
If you want to handle html mail, RFC1468 is useless except for header parts.
Here's what I use to send Japanese emails. Subject line looks fine in Outlook 2010, gmail and on iPhone.
Encoding encoding = Encoding.GetEncoding("iso-2022-jp");
byte[] bytes = encoding.GetBytes(subject);
string uuEncoded = Convert.ToBase64String(bytes);
subject = "=?iso-2022-jp?B?" + uuEncoded + "?=";
// not sure this is actually necessary...
mailMessage.SubjectEncoding = Encoding.GetEncoding("iso-2022-jp");
<?php
function sendMail($to, $subject, $body, $from_email,$from_name)
{
$headers = "MIME-Version: 1.0 \n" ;
$headers .= "From: " .
"".mb_encode_mimeheader (mb_convert_encoding($from_name,"ISO-2022-JP","AUTO")) ."" .
"<".$from_email."> \n";
$headers .= "Reply-To: " .
"".mb_encode_mimeheader (mb_convert_encoding($from_name,"ISO-2022-JP","AUTO")) ."" .
"<".$from_email."> \n";
$headers .= "Content-Type: text/plain;charset=ISO-2022-JP \n";
/* Convert body to same encoding as stated
in Content-Type header above */
$body = mb_convert_encoding($body, "ISO-2022-JP","AUTO");
/* Mail, optional parameters. */
$sendmail_params = "-f$from_email";
mb_language("ja");
$subject = mb_convert_encoding($subject, "ISO-2022-JP","AUTO");
$subject = mb_encode_mimeheader($subject);
$result = mail($to, $subject, $body, $headers, $sendmail_params);
return $result;
}
First of all you should be using:
Encoding.GetEncoding("ISO-2022-JP")
to convert your subject line into bytes that will be processed by Convert.ToBase64String().
=?ISO-2022-JP?B?TEXTTEXT...?= tells the receiving mail client which encoding was used on the sender's side to convert japanese "letters" into a byte stream.
Currently you're using UTF-16 to encode, but specifying ISO-2022-JP to decode. These are obviously two different encodings, I guess, just like ISO-8859-1 is different from Unicode (most extended western-europe chars are represented by one byte in ISO-XXX, but two bytes in Unicode).
I'm not sure what you mean about UTF-8 being second-class citizen. As long as the receiving mail client understands UTF-8 and is able to convert it to the current japanese locale, everything is fine.