How to avoid printing emojis from comments or submissions when using PRAW


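I get an error message when I try to print a comment or submission that contains emojis. How can I ignore the emojis and print only letters and numbers?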

I'm using PRAW for web scraping:

top_posts2 = page.top(limit=25)
for post in top_posts2:
    outputFile.write(post.title)
    outputFile.write('   ')
    outputFile.write(str(post.score))
    outputFile.write('\n')
    outputFile.write(post.selftext)
    outputFile.write('\n')

    submissions = reddit.submission(id=post.id)

    comment_page = submissions.comments
    top_comment = comment_page[0]  # by default, this will be the best comment of the post

    commentBody = top_comment.body

    outputFile.write(top_comment.body)
    outputFile.write('\n')

I only want to output letters and numbers, and maybe some (or all) special characters.

python praw
1 Answer

There are several ways to do this. I'd suggest creating a kind of "text cleaning" function:

def cleanText(text):
    new_text = ""
    for c in text:       # for each character in the text
        if c.isalnum():  # check if it is either a letter or number (alphanumeric)
            new_text += c
    return new_text
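
One thing to watch out for: the loop above also drops spaces and line breaks, because they are not alphanumeric, and str.isalnum() returns True for non-ASCII letters as well. If that is not what you want, here is a minimal sketch of a variant that keeps whitespace and only ASCII letters and digits. The name clean_text_keep_spaces is made up for this example, and str.isascii() needs Python 3.7 or newer.

```python
def clean_text_keep_spaces(text):
    # Keep ASCII letters and digits plus any whitespace; drop everything else,
    # which includes emojis and punctuation.
    return "".join(
        c for c in text
        if (c.isascii() and c.isalnum()) or c.isspace()
    )


print(clean_text_keep_spaces("Great post 👍 100%"))
```

Building the result with "".join(...) also avoids repeated string concatenation, which can matter for long comment bodies.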

Or, if you want to include specific non-alphanumeric characters:

def cleanText(text):
    valid_symbols = "!@#$%^&*()"    # <-- add whatever symbols you want here
    new_text = ""
    for c in text:       # for each character in the text
        if c.isalnum() or c in valid_symbols:  # check if alphanumeric or a valid symbol
            new_text += c
    return new_text
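
If you would rather allow all of the usual ASCII special characters instead of listing them by hand, the standard string module already has them. This is only a sketch under that assumption; ALLOWED and clean_text_ascii are names invented for the example, and the whitespace characters in the allow-list are my own choice.

```python
import string

# ASCII letters, digits, punctuation, plus a few whitespace characters
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation + " \n\t")


def clean_text_ascii(text):
    # Keep only characters that appear in the allow-list above
    return "".join(c for c in text if c in ALLOWED)


print(clean_text_ascii("Nice! 😀 (really) #1"))
```

Using a set for the allow-list keeps each membership check cheap.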

Then you can do something like this in your script:

commentBody = cleanText(top_comment.body)
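
The question does not say which error is raised, so this part is a guess: if it is a UnicodeEncodeError coming from write(), the file's default encoding may be the real problem. The sketch below shows two possible ways around that, either writing the file as UTF-8 so the emojis are kept, or dropping everything outside ASCII with encode/decode. The sample string, the file path and the name strip_non_ascii are made up for illustration.

```python
import os
import tempfile


def strip_non_ascii(text):
    # Encode to ASCII and silently drop any character that cannot be represented
    return text.encode("ascii", errors="ignore").decode("ascii")


sample = "Top comment 😀 with emoji"  # stand-in for top_comment.body

# Option 1: keep the emoji by opening the output file as UTF-8
path = os.path.join(tempfile.gettempdir(), "praw_output_example.txt")  # hypothetical path
with open(path, "w", encoding="utf-8") as output_file:
    output_file.write(sample + "\n")

# Option 2: remove anything outside ASCII before writing or printing
print(strip_non_ascii(sample))
```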