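I need to get the actual string rather than the encoded one. Most subjects arrive as plain strings, but a few arrive in this encoded format, and I don't know how to handle them.
How can I decode the string and print the decoded subject?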
import email
import imaplib

FROM_EMAIL = "[email protected]"
FROM_PWD = "my Password"
SMTP_SERVER = "imap.gmail.com"  # an IMAP host, despite the variable name
SMTP_PORT = 993
l=['Developer','Architect','NEED','Internship','Urgent']

def get_body(msg):
    # Walk down to the first non-multipart part and return its decoded payload.
    if msg.is_multipart():
        return get_body(msg.get_payload(0))
    else:
        return msg.get_payload(None,True)

def readmail():
    mail = imaplib.IMAP4_SSL(SMTP_SERVER)
    mail.login(FROM_EMAIL,FROM_PWD)
    mail.select('inbox')
    typ, data = mail.search(None, '(SINCE "20-May-2020" BEFORE "26-May-2020")')
    mail_ids = data[0]
    id_list = mail_ids.split()
    id_list=id_list[::-1]  # newest first
    first_email_id = id_list[0]
    latest_email_id = id_list[-1]
    for byte_obj in id_list:
        typ, data = mail.fetch(byte_obj, '(RFC822)' )
        raw=email.message_from_bytes(data[0][1])
        msg=get_body(raw)
        s=''
        s=raw['SUBJECT']
        s1=raw['Date']
        # Prints the raw header value, so encoded subjects appear as =?UTF-8?B?...?=
        print(s)

readmail()
Output:
Winner announcement! Amazon Kindle Oasis.
[FREE WEBINAR] Natural Language Processing for Beginners
Godrej 24 | Get Rs. 2 Lakh Gold Voucher | 2 & 3 BHK at Rs. 83 Lakh*
=?UTF-8?B?TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw=?=
=?UTF-8?B?b3cgYXMg4oK5NDU1?=
Panda just uploaded a video
Vernix Gamerz just uploaded a video
Most of your question has already been answered here:
Find, decode and replace all base64 values in text file
Regarding your example:
Some of the subject lines are base64-encoded.
Take the following part of your string s=raw['SUBJECT'] as an example:
=?UTF-8?B?TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw=?=
=?UTF-8?B?b3cgYXMg4oK5NDU1?=
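The structure is as follows:
First you have the prefix, which names the charset (UTF-8) and the encoding (B for base64):
=?UTF-8?B?
Then comes the encoded string, including its trailing = padding:
TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw=
followed by the terminator:
?=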
Decoding the encoded string from base64 and reading the result as UTF-8 gives the text:
Last day to save! Popular courses as l
You can verify this at https://www.base64decode.org/
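The same check can be done locally in Python. This is only a small sketch using the standard base64 module; the variable names are illustrative, and the trailing = is kept because it is base64 padding that belongs to the payload.

```python
import base64

# Payload of the first encoded word from the subject, including its "=" padding.
encoded = "TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw="

# base64 yields bytes; those bytes are then interpreted as UTF-8 text.
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)  # Last day to save! Popular courses as l
```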
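So what I suggest is: first split the string apart, then take the encoded parts and decode them.
You could start by splitting your string s=raw['SUBJECT'] into the parts that are encoded and the parts that are not. You can do that with a regular expression like this: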
(?:=\?UTF-8\?B\?)([^?]*)(?:\?=)
Then decode the encoded parts and put everything back together.
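The steps described here could be sketched roughly as follows. The function names decode_subject_regex and decode_subject_stdlib are made up for illustration. The regex version assumes every encoded part is UTF-8 with base64 (B) encoding, that each part closes on ?=, and that no multi-byte character is split across two parts. The second function is a different technique, not the regex approach: it uses email.header.decode_header from the standard library, which also copes with other charsets and with Q (quoted-printable) encoding.

```python
import base64
import re
from email.header import decode_header

# One encoded word: =?UTF-8?B?<base64 payload>?=  (assumption: UTF-8 + base64 only)
ENCODED_WORD = re.compile(r"=\?UTF-8\?B\?([^?]*)\?=", re.IGNORECASE)


def decode_subject_regex(subject):
    """Decode UTF-8/base64 encoded words in a subject, leaving plain text as is."""
    # Whitespace (including a folded line break) between two adjacent
    # encoded words is only a separator, so drop it before decoding.
    joined = re.sub(r"\?=\s+=\?", "?==?", subject)
    return ENCODED_WORD.sub(
        lambda m: base64.b64decode(m.group(1)).decode("utf-8"), joined
    )


def decode_subject_stdlib(subject):
    """Alternative: let the standard library parse the encoded words."""
    parts = []
    for value, charset in decode_header(subject):
        if isinstance(value, bytes):
            # charset is None for unencoded fragments; fall back to ASCII.
            value = value.decode(charset or "ascii", errors="replace")
        parts.append(value)
    return "".join(parts)


# The folded subject from the question, as it appears in the raw header.
subject = (
    "=?UTF-8?B?TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw=?=\r\n"
    " =?UTF-8?B?b3cgYXMg4oK5NDU1?="
)
print(decode_subject_regex(subject))
print(decode_subject_stdlib(subject))
```

In readmail(), either function could be applied to s before printing, for example print(decode_subject_stdlib(s)). Subjects that contain no encoded words pass through unchanged.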