Python: How to scrape the contents of list tags inside a head tag using specific text

Question

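I am using requests and BeautifulSoup.

I want to scrape the "Objectives" section, but I get the following error:

AttributeError: 'NoneType' object has no attribute 'next_sibling'

I would also like to create a CSV sheet for each lesson.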

import requests
from bs4 import BeautifulSoup

url = "https://studio.code.org/s/web-development-2023/lessons/1"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
element = soup.find(text="Students will be able to:")
text = element.next_sibling.get_text()

print(text)

Answer 1

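The text you see on the web page is encoded as JSON in the data-lesson attribute of a <script> element, so BeautifulSoup does not see it as text in the document. That is why soup.find(text=...) returns None and the call to .next_sibling fails. To get the objectives, you can decode that attribute instead: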

import json

import requests
from bs4 import BeautifulSoup

url = "https://studio.code.org/s/web-development-2023/lessons/1"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

data = json.loads(soup.select_one("[data-lesson]")["data-lesson"])

for o in data["objectives"]:
    print(o["description"])

This prints:

Create a prototype of a web design to meet the needs of a user using the problem-solving process
Identify features of a web design that match the needs of users
Understand the steps of the problem-solving process
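
The question also asks for a CSV sheet per lesson, which the code above does not cover. A minimal sketch of one way to do it, assuming `data` is the dict decoded from the data-lesson attribute as shown above. The helper name `write_objectives_csv`, the `objective` column header and the `lesson_1.csv` file name are illustrative choices, not part of the site or of any library. The sample dict stands in for the real page so the example runs offline.

```python
import csv
import json


def write_objectives_csv(data, csv_path):
    """Write one row per objective description to csv_path; return the row count."""
    # .get() guards against lesson JSON that has no "objectives" key.
    rows = [[o["description"]] for o in data.get("objectives", [])]
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["objective"])  # header row (hypothetical column name)
        writer.writerows(rows)
    return len(rows)


# Stand-in for the decoded data-lesson JSON (assumed shape, taken from the answer).
sample = json.loads(
    '{"objectives": [{"description": "Understand the steps of the problem-solving process"}]}'
)
count = write_objectives_csv(sample, "lesson_1.csv")
print(count)
```

To produce one file per lesson, the same helper could be called in a loop over lesson pages, building each `data` dict with the requests and BeautifulSoup code from the answer. That the other lessons follow the same URL pattern and also carry a data-lesson attribute is an assumption that should be checked against the site.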
