Python script (BeautifulSoup) returns NoneType


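I'm trying to parse the page source of Merriam-Webster thesaurus entries to get synonyms and antonyms. Here is an example of the page source, using the word "kind": view-source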

I'm trying to extract the lists of synonyms and antonyms from this meta tag (line 18 of the page source):

<meta name="description" content="Synonyms for KIND: type, sort, genre, variety, stripe, breed, nature, like; Antonyms of KIND: thoughtless, unthinking, unkind, inconsiderate, heedless, uncaring, inattentive, inhospitable">

The meta tag lookup seems to return NoneType. My guess is that this is because it can't find the "content" child...? Not sure.

Minimal reproducible example of the code:

import requests
from bs4 import BeautifulSoup

# Get the HTML content of the Merriam-Webster thesaurus page for the word "kind"
response = requests.get('https://www.merriam-webster.com/thesaurus/kind')
soup = BeautifulSoup(response.content, 'html.parser')

# Extract the synonyms and antonyms for the word "kind"
# Find the meta tag containing synonyms and antonyms and extract its content
meta_tag = soup.find("meta", attrs={"name": "description"})
content = meta_tag["content"]

# Convert the content to a list of synonyms and antonyms
syn_ant_list = content.split(";")
synonyms_str = syn_ant_list[0].split(":")[1].strip()
antonyms_str = syn_ant_list[1].split(":")[1].strip()

# Print the results
print(f"Synonyms: {synonyms_str}")
print(f"Antonyms: {antonyms_str}")
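The splitting step above indexes [1] on both halves, so it would raise IndexError for a description that has no antonym part. A minimal sketch of a more tolerant parser follows, assuming every description uses the "Synonyms for X: ...; Antonyms of X: ..." layout shown in the meta tag; parse_description is a hypothetical helper name, not part of any library.

```python
def parse_description(content):
    """Split a description string into (synonyms, antonyms) lists.

    Assumes the "Synonyms for X: a, b; Antonyms of X: c, d" layout.
    Either part may be missing, in which case its list stays empty.
    """
    synonyms, antonyms = [], []
    for part in content.split(";"):
        label, sep, words = part.partition(":")
        if not sep:
            continue  # no colon, so there is nothing to parse in this part
        items = [w.strip() for w in words.split(",") if w.strip()]
        if "synonym" in label.lower():
            synonyms = items
        elif "antonym" in label.lower():
            antonyms = items
    return synonyms, antonyms


content = (
    "Synonyms for KIND: type, sort, genre, variety, stripe, breed, nature, like; "
    "Antonyms of KIND: thoughtless, unthinking, unkind, inconsiderate, heedless, "
    "uncaring, inattentive, inhospitable"
)
synonyms, antonyms = parse_description(content)
print(synonyms)
print(antonyms)
```

Returning lists rather than one comma-joined string makes later membership checks straightforward.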

Here is the full code, although it is supposed to do other things as well.

import openpyxl
import requests
from bs4 import BeautifulSoup

# Open the input file and select the "input" sheet
input_wb = openpyxl.load_workbook('input2.xlsx')
input_ws = input_wb['Sheet1']

# Get the list of comparisons to search for
comparisons = [row[1] for row in input_ws.iter_rows(min_row=2, values_only=True)]

# Iterate over the comparisons and extract the synonyms and antonyms for each one
for i, comparison in enumerate(comparisons, start=2):
    # Initialize lists to store the total synonyms and antonyms for all responses
    all_synonyms = []
    all_antonyms = []

    # Get the HTML content of the Merriam-Webster thesaurus page for the comparison word
    response = requests.get(f'https://www.merriam-webster.com/thesaurus/{comparison}')
    soup = BeautifulSoup(response.content, 'html.parser')

    # Extract the synonyms and antonyms for the comparison word
    # Find the meta tag containing synonyms and antonyms and extract its content
    meta_tag = soup.find("meta", attrs={"name": "description"})
    content = meta_tag["content"]

    # Convert the content to a list of synonyms and antonyms
    syn_ant_list = content.split(";")
    synonyms_str = syn_ant_list[0].split(":")[1].strip()
    antonyms_str = syn_ant_list[1].split(":")[1].strip()

    # Print the results
    print(f"Synonyms: {synonyms_str}")
    print(f"Antonyms: {antonyms_str}")


    # Iterate over the rows in the sheet corresponding to the comparison word, and count how many of the response words
    # are present in the synonyms and antonyms lists
    for j, row in enumerate(input_ws.iter_rows(min_row=2, values_only=True), start=2):
        if row[1] == comparison:
            responses = row[0].split(" / ")
            syn_count = sum(1 for response in responses if response in 'synonyms')
            ant_count = sum(1 for response in responses if response in 'antonyms')
            input_ws.cell(row=j, column=3, value=syn_count)
            input_ws.cell(row=j, column=4, value=ant_count)

# Save the updated input file
input_wb.save('input2.xlsx')
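Separately from the error, the counting lines in the script above test response in 'synonyms', which checks whether each response is a substring of the literal text 'synonyms' rather than a member of the scraped list. A minimal sketch of the intended count follows, assuming the synonyms and antonyms are already available as Python lists; count_matches is a hypothetical helper, and the case-insensitive comparison is my assumption.

```python
def count_matches(cell_value, synonyms, antonyms):
    """Count how many ' / '-separated responses are synonyms or antonyms."""
    responses = [r.strip().lower() for r in cell_value.split(" / ")]
    # Sets give fast membership tests; lowercasing is an assumption.
    synonym_set = {s.lower() for s in synonyms}
    antonym_set = {a.lower() for a in antonyms}
    syn_count = sum(1 for r in responses if r in synonym_set)
    ant_count = sum(1 for r in responses if r in antonym_set)
    return syn_count, ant_count


print(count_matches("sort / unkind / table", ["type", "sort"], ["unkind"]))
```

Using a different loop variable name (r here) also avoids shadowing the response object returned by requests.get.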

I think this (below) is the most relevant part, but I'm not sure whether the problem comes from somewhere else.


    # Get the HTML content of the Merriam-Webster thesaurus page for the comparison word
    response = requests.get(f'https://www.merriam-webster.com/thesaurus/{comparison}')
    soup = BeautifulSoup(response.content, 'html.parser')

    # Extract the synonyms and antonyms for the comparison word
    # Find the meta tag containing synonyms and antonyms and extract its content
    meta_tag = soup.find("meta", attrs={"name": "description"})
    content = meta_tag["content"]

This is the error I get:

Traceback (most recent call last):
  File "/Users/zephaniahsainta/Desktop/pythoning/ver3prints.py", line 25, in <module>
    content = meta_tag["content"]
TypeError: 'NoneType' object is not subscriptable
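This TypeError is raised when meta_tag itself is None, which is what soup.find returns when no matching tag exists in the parsed page. A tag that was found but lacked a content attribute would raise KeyError instead. A minimal sketch of a guard follows; get_description is a hypothetical helper name, and the two inline HTML strings are made-up stand-ins for real responses.

```python
from bs4 import BeautifulSoup


def get_description(html):
    """Return the description meta tag's content, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    meta_tag = soup.find("meta", attrs={"name": "description"})
    if meta_tag is None:
        # find() found no <meta name="description"> in this document
        return None
    # .get() returns None instead of raising KeyError if the attribute is missing
    return meta_tag.get("content")


# Made-up sample documents (assumptions, not real Merriam-Webster responses)
with_tag = (
    '<html><head><meta name="description" '
    'content="Synonyms for KIND: type, sort; Antonyms of KIND: unkind">'
    "</head></html>"
)
without_tag = "<html><head><title>Not found</title></head></html>"

print(get_description(with_tag))
print(get_description(without_tag))
```

In the loop, a None result could be used to print the offending word and continue to the next one, which would show which request returned a page without the tag.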

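I should note that it does print the synonyms and antonyms correctly, but the error means the code never finishes running and never records the counts I'm looking for.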

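This is what it prints for the {comparison} word 'kind':
Synonyms: type, sort, genre, variety, stripe, breed, nature, like
Antonyms: thoughtless, unthinking, unkind, inconsiderate, heedless, uncaring, inattentive, inhospitable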
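Because the output for 'kind' is printed before the error appears, one possibility is that a later row is the one that fails. This is only a guess, since the spreadsheet is not shown. For example, openpyxl returns None for empty cells when values_only=True, and a None or misspelled entry would request a page that may lack the description tag. A minimal sketch of cleaning the word list before looping follows; clean_comparisons is a hypothetical helper name.

```python
def clean_comparisons(values):
    """Drop empty cells and repeated words, preserving first-seen order."""
    seen = set()
    cleaned = []
    for value in values:
        if value is None:
            continue  # empty spreadsheet cell
        word = str(value).strip()
        if word and word not in seen:
            seen.add(word)
            cleaned.append(word)
    return cleaned


# Made-up column values standing in for the spreadsheet's second column
print(clean_comparisons(["kind", None, " kind ", "", "brave"]))
```

Removing repeats also means each word is requested once, rather than once per row that mentions it.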

Any help is greatly appreciated!!!

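I tried modifying the code a little, but my other versions didn't print any synonyms or antonyms at all, so I'm guessing they are even worse off. I'm new to Python, so I'm not sure how to solve this on my own.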

python html beautifulsoup html-parsing meta