Python imaplib: displaying non-ASCII characters correctly
I am using Python 3.5 and imaplib to fetch an e-mail from GMail and print its body. The body contains non-ASCII characters. These are "encoded" in a strange way and I cannot find out how to fix this.
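import email
import imaplib

# Connect to GMail over IMAP and open the inbox.
c = imaplib.IMAP4_SSL('imap.gmail.com')
c.login('example@gmail.com', 'password')
c.select('Inbox')

# Fetch the complete raw message (RFC822) as bytes.
_, data = c.fetch(b'12345', '(RFC822)')
mail = data[0][1]

# Parse the bytes into a message object; for a multipart message,
# get_payload() returns a list of parts.
message = email.message_from_bytes(mail)
payload = message.get_payload()
body = payload[0].as_string()
print(body)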
gives
>> ... Mit freundlichen Gr=C3=BC=C3=9Fen ...
instead of the desired
>> ... Mit freundlichen Grüßen ...
It looks to me like this is not an issue of encoding but one of conversion. But how do I tell Python to convert the characters correctly? Is there a more convenient library?
The text is encoded with quoted-printable encoding, which is a way to represent non-ASCII characters in ASCII text. You can decode it using Python's quopri module.
>>> import quopri
>>> bs = b'Gr=C3=BC=C3=9Fen'
>>> # Decode quoted-printable to raw bytes.
>>> utf8 = quopri.decodestring(bs)
>>> # Decode bytes to text.
>>> s = utf8.decode('utf-8')
>>> print(s)
Grüßen
You may find that quoted-printable is the value of the email's Content-Transfer-Encoding header.
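As an alternative to calling quopri by hand, the email package can undo the transfer encoding itself. The following is a minimal sketch, not the only way to do it: the helper name decoded_text, the sample message bytes, and the UTF-8 fallback when no charset is declared are all assumptions made for illustration.

```python
import email


def decoded_text(message):
    """Return the first text/plain part of a message as str (hypothetical helper)."""
    for part in message.walk():
        if part.get_content_type() == 'text/plain':
            # decode=True reverses the Content-Transfer-Encoding
            # (quoted-printable or base64) and returns raw bytes.
            raw = part.get_payload(decode=True)
            # Assumption: fall back to UTF-8 if the part declares no charset.
            charset = part.get_content_charset() or 'utf-8'
            return raw.decode(charset, errors='replace')
    return None


# Made-up sample message standing in for the bytes fetched via imaplib.
raw_mail = (
    b'MIME-Version: 1.0\r\n'
    b'Content-Type: text/plain; charset="utf-8"\r\n'
    b'Content-Transfer-Encoding: quoted-printable\r\n'
    b'\r\n'
    b'Mit freundlichen Gr=C3=BC=C3=9Fen\r\n'
)

message = email.message_from_bytes(raw_mail)
print(message['Content-Transfer-Encoding'])
print(decoded_text(message))
```

With the code from the question, passing the parsed message to decoded_text should give the readable body, because get_payload(decode=True) handles whichever transfer encoding the part declares.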