I wrote the following code:
from bs4 import BeautifulSoup
import requests
import time
data = [{"operationName": "SearchQuery", "variables": {"query": "Pedro Alvares", "after": None, "first": 10},
"query": "query SearchQuery($query: String!, $first: Int!, $after: ID) {\n questionSearch(query: $query, first: $first, after: $after) {\n count\n edges {\n node {\n id\n databaseId\n author {\n id\n databaseId\n isDeleted\n nick\n avatar {\n thumbnailUrl\n __typename\n }\n rank {\n name\n __typename\n }\n __typename\n }\n content\n answers {\n nodes {\n thanksCount\n ratesCount\n rating\n __typename\n }\n hasVerified\n __typename\n }\n __typename\n }\n highlight {\n contentFragments\n __typename\n }\n __typename\n }\n __typename\n }\n}\n"}]
r = requests.post("https://brainly.com.br/graphql/pt", json=data).json()
# Build the page URL of every question returned by the search
p = []
for item in r[0]['data']['questionSearch']['edges']:
    rst = f"https://brainly.com.br/tarefa/{item['node']['databaseId']}"
    p.append(rst)
I also want to print the HTML of each link, and I thought of doing it like this:
for ele in p:
    r = requests.get(p).text
    time.sleep(5)
    print(r)
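My question is whether this is correct and whether there is a way to improve this loop. Afterwards, I will filter the HTML of these pages.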
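First, on the second line of the loop, replace p with ele: p is the whole list of links, while requests.get expects a single URL. Second, you don't need time.sleep(5) to wait for the response: requests.get blocks until the request has completed, so the next iteration only starts once the HTML has arrived. A pause is only worth keeping if you deliberately want to space out your requests to the server.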
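
Putting both points together, the loop could look roughly like the sketch below. The function name fetch_pages and its session, timeout and delay parameters are my own choices for illustration, not something from your code, and the optional delay is only there in case you do want to throttle requests.

```python
import time


def fetch_pages(urls, session=None, timeout=10, delay=0.0):
    """Return a dict mapping each URL to the HTML text of its page.

    fetch_pages, session, timeout and delay are illustrative names,
    not part of the original code.
    """
    if session is None:
        # Imported here so the function can also be tried with a stand-in session.
        import requests
        session = requests.Session()  # reuses one connection across requests

    pages = {}
    for url in urls:  # one URL per iteration, not the whole list
        response = session.get(url, timeout=timeout)  # blocks until the response arrives
        response.raise_for_status()  # raise on 4xx/5xx instead of storing an error page
        pages[url] = response.text
        if delay > 0:
            time.sleep(delay)  # optional pause, only to be polite to the server
    return pages
```

With the list p from the question, pages = fetch_pages(p) gives you every page's HTML keyed by its link, and you can then print or filter pages[link] for each one.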