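I want to scrape the "Objectives" section, but I get the following error:
AttributeError: 'NoneType' object has no attribute 'next_sibling'
I would also like to create a CSV file for each lesson.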
import requests
from bs4 import BeautifulSoup
url = "https://studio.code.org/s/web-development-2023/lessons/1"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
element = soup.find(text="Students will be able to:")
text = element.next_sibling.get_text()
print(text)
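The text you see on the page is encoded as JSON inside an attribute of a <script> element, so BeautifulSoup does not see it as regular page text. That is why soup.find(...) returns None. To get the objectives you can do: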
import json
import requests
from bs4 import BeautifulSoup
url = "https://studio.code.org/s/web-development-2023/lessons/1"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
data = json.loads(soup.select_one("[data-lesson]")["data-lesson"])
for o in data["objectives"]:
    print(o["description"])
Prints:
Create a prototype of a web design to meet the needs of a user using the problem-solving process
Identify features of a web design that match the needs of users
Understand the steps of the problem-solving process
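
For the CSV part of the question, here is a minimal sketch using only the standard csv module. The helper name write_objectives_csv, the single "objective" column header, and the output file name are assumptions made for illustration; the function expects a list shaped like data["objectives"] above, where each item has a "description" key.

```python
import csv


def write_objectives_csv(objectives, path):
    """Write one row per objective to a CSV file at `path`.

    `objectives` is assumed to be a list of dicts with a "description" key,
    like data["objectives"] in the snippet above (hypothetical helper).
    """
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["objective"])  # header row (assumed column name)
        for o in objectives:
            writer.writerow([o["description"]])
    return path
```

With the data loaded as above, write_objectives_csv(data["objectives"], "lesson_1.csv") would produce the file for lesson 1. To get one file per lesson you could repeat the request with a different lesson number at the end of the URL, assuming the other lesson pages carry the same data-lesson attribute, which I have not checked.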