I have the following algorithm:
from bs4 import BeautifulSoup
import requests
data = [{"operationName": "SearchQuery", "variables": {"query": "Python", "after": None, "first": 2},
"query": "query SearchQuery($query: String!, $first: Int!, $after: ID) {\n questionSearch(query: $query, first: $first, after: $after) {\n count\n edges {\n node {\n id\n databaseId\n author {\n id\n databaseId\n isDeleted\n nick\n avatar {\n thumbnailUrl\n __typename\n }\n rank {\n name\n __typename\n }\n __typename\n }\n content\n answers {\n nodes {\n thanksCount\n ratesCount\n rating\n __typename\n }\n hasVerified\n __typename\n }\n __typename\n }\n highlight {\n contentFragments\n __typename\n }\n __typename\n }\n __typename\n }\n}\n"}]
r = requests.post("https://brainly.com.br/graphql/pt", json=data).json()
p = []
for item in r[0]['data']['questionSearch']['edges']:
    rst = f"https://brainly.com.br/tarefa/{item['node']['databaseId']}"
    p.append(rst)
for ele in p:
    r = requests.get(ele).text
soup = BeautifulSoup(r, 'html.parser')
for n in soup.find_all('div', attrs={'class': 'brn-content-image'}):
    print(n.find('h1').text)
I need to filter these two HTML snippets:
<div class="brn-content-image">
<h1 class="sg-text sg-text--large sg-text--regular">
O que é for em python?
</h1>
and:
<div class="brn-content-image">
<h1 class="sg-text sg-text--large sg-text--regular">
Linguagem ( Python )<br /><br />a) Quem foi(ram) o(s) criador(es) do python? <br /><br />b) Cite como se declara uma variáveis:<br /><br />c) O que é uma variável?<br /><br />d) O que é uma função?<br /><br />e) para que serve às { } no python?
</h1>
</div>
Expected output:
1st h1 - Linguagem ( Python )
a) Quem foi(ram) o(s) criador(es) do python?
b) Cite como se declara uma variáveis:
c) O que é uma variável?
d) O que é uma função?
e) para que serve às { } no python?
2nd h1 - O que é for em python?
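I have the two HTML pages in the same variable, but I can only extract the question from the 2nd h1, that is: >> O que é for em python?
I need to print both of them. What am I doing wrong?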
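You are creating the soup variable outside the loop, so by the time it is built r holds only the last page fetched. That is why you only get the second HTML value. The parsing should happen inside the loop, once per page. Try it now.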
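A minimal sketch of that fix, assuming the page structure shown in the question. The helper names extract_questions and print_questions are hypothetical and only illustrate the idea; the key point is that BeautifulSoup is called once per fetched page, inside the loop. Using get_text with a newline separator is an extra assumption here, so that each <br /> in the h1 starts a new line as in the expected output.

```python
from bs4 import BeautifulSoup
import requests


def extract_questions(html):
    """Return the text of each <h1> found inside a div.brn-content-image."""
    soup = BeautifulSoup(html, "html.parser")
    questions = []
    for div in soup.find_all("div", attrs={"class": "brn-content-image"}):
        h1 = div.find("h1")
        if h1 is not None:
            # separator="\n" puts each text fragment (split by <br />) on its own line
            questions.append(h1.get_text(separator="\n", strip=True))
    return questions


def print_questions(urls):
    """Fetch every URL and print its questions (hypothetical helper)."""
    for url in urls:
        html = requests.get(url).text
        # Parsing happens inside the loop, so every page is processed,
        # not just the last one fetched.
        for question in extract_questions(html):
            print(question)
```

With the list p built in the question, calling print_questions(p) should print the h1 text of both pages instead of only the last one.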