Trouble decoding ?UTF-8?B?ZnVjayDwn5CO?= style email subjects using IMAP and Python

Question  Votes: 0  Answers: 1

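I need the actual subject text instead of this encoded string. Most subjects come back as plain strings, but a few come back in this encoded format, and I don't know how to handle them.

How can I decode the string and print the decoded subject?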


import email
import imaplib

FROM_EMAIL  = "[email protected]"
FROM_PWD    = "my Password"
SMTP_SERVER = "imap.gmail.com"
SMTP_PORT   = 993
l=['Developer','Architect','NEED','Internship','Urgent']

def get_body(msg):
    # Descend into the first part until a non-multipart part is reached,
    # then return its payload with the transfer encoding decoded.
    if msg.is_multipart():
        return get_body(msg.get_payload(0))
    else:
        return msg.get_payload(None,True)

def readmail():
    mail = imaplib.IMAP4_SSL(SMTP_SERVER)
    mail.login(FROM_EMAIL,FROM_PWD)
    mail.select('inbox')
    typ, data = mail.search(None, '(SINCE "20-May-2020" BEFORE "26-May-2020")')
    # data[0] is a space-separated byte string of message ids; reverse its order
    id_list = data[0].split()[::-1]
    for byte_obj in id_list:
        typ, data = mail.fetch(byte_obj, '(RFC822)')
        raw = email.message_from_bytes(data[0][1])
        msg = get_body(raw)
        s = raw['SUBJECT']   # some subjects come back as =?UTF-8?B?...?=
        s1 = raw['Date']
        print(s)

readmail()

Output:

Winner announcement!  Amazon Kindle Oasis.

[FREE WEBINAR] Natural Language Processing for Beginners

Godrej 24 | Get Rs. 2 Lakh Gold Voucher | 2 & 3 BHK at Rs. 83 Lakh*

=?UTF-8?B?TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw=?=
        =?UTF-8?B?b3cgYXMg4oK5NDU1?=

Panda just uploaded a video

Vernix Gamerz just uploaded a video
python python-3.x email gmail imap
1 Answer
0
votes

Most of your question has already been answered here:

Find, decode and replace all base64 values in text file

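Regarding your example:

Some of the subject lines are RFC 2047 encoded-words: the text is base64-encoded (that is what the B in ?B? stands for) and the charset is UTF-8.

Take this part of your string s=raw['SUBJECT'] as an example: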

=?UTF-8?B?TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw=?= =?UTF-8?B?b3cgYXMg4oK5NDU1?=

The structure is as follows:

First you have the prefix:

=?UTF-8?B?

then the encoded string:

TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw=

followed by the terminator:

?=

Decoding the encoded string from base64 and reading the resulting bytes as UTF-8 gives the text:

Last day to save! Popular courses as l

You can verify this at https://www.base64decode.org/
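If you would rather check this in Python than on a website, here is a minimal sketch using the standard-library base64 module. The variable names are only for illustration. Note that the trailing = is base64 padding and belongs to the payload, so it has to be kept.

```python
import base64

# Payloads copied from the two encoded-words in the subject above.
first_payload = "TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw="
second_payload = "b3cgYXMg4oK5NDU1"

# b64decode returns bytes; the charset named in the header (UTF-8) turns them into text.
first = base64.b64decode(first_payload).decode("utf-8")
second = base64.b64decode(second_payload).decode("utf-8")

print(first)           # Last day to save! Popular courses as l
print(second)          # ow as ₹455
print(first + second)  # Last day to save! Popular courses as low as ₹455
```

The subject was simply split across two encoded-words because of its length, which is why the first piece stops in the middle of "low".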

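So what I suggest is: first split the string, then take the encoded parts and decode them.

You could start by splitting the string s=raw['SUBJECT'] into its encoded and unencoded parts. A regular expression like this one can do that (the payload runs up to the closing ?=, so it may include = padding):

(?:=\?UTF-8\?B\?)([^?]*)(?:\?=)

Then convert the encoded parts and put everything back together.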
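The steps above could be sketched roughly like this. The function names decode_subject_regex and decode_subject_stdlib are made up for this example. The regex version assumes every encoded part is UTF-8 with base64 (?B?), as in your output, and it drops whitespace that sits only between two encoded-words so that the pieces join back into one string. For comparison, the second function uses decode_header and make_header from the standard-library email.header module, which is not the approach described above but also handles other charsets and the ?Q? encoding.

```python
import base64
import re
from email.header import decode_header, make_header

# One UTF-8/base64 encoded-word: =?UTF-8?B?<payload>?=
ENCODED_WORD = re.compile(r"=\?UTF-8\?B\?([^?]*)\?=", re.IGNORECASE)
# Whitespace (including the folded line break) between two encoded-words
GAP = re.compile(r"(\?=)\s+(=\?)")


def decode_subject_regex(subject):
    """Decode UTF-8/base64 encoded-words and leave plain text untouched."""
    joined = GAP.sub(r"\1\2", subject)
    return ENCODED_WORD.sub(
        lambda m: base64.b64decode(m.group(1)).decode("utf-8"), joined
    )


def decode_subject_stdlib(subject):
    """Same result via the standard library's header decoder."""
    return str(make_header(decode_header(subject)))


# The folded subject from the question, as raw['SUBJECT'] returns it.
subject = (
    "=?UTF-8?B?TGFzdCBkYXkgdG8gc2F2ZSEgUG9wdWxhciBjb3Vyc2VzIGFzIGw=?=\n"
    "        =?UTF-8?B?b3cgYXMg4oK5NDU1?="
)

print(decode_subject_regex(subject))   # Last day to save! Popular courses as low as ₹455
print(decode_subject_stdlib(subject))  # Last day to save! Popular courses as low as ₹455
print(decode_subject_stdlib("Panda just uploaded a video"))  # unchanged
```

In readmail() you would then print the decoded value, for example print(decode_subject_stdlib(s)), instead of print(s).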
