Sorry, I'm fairly new to this. I want to extract some JSON data from the page below, specifically "getMe":"IneedThisData".
from bs4 import BeautifulSoup
import json
html_doc = """
<!DOCTYPE html>
<html>
<head>
<title>Sample</title>
</head>
<body>
<script type="text/javascript">utag_cfg_ovrd = window.utag_cfg_ovrd || {};utag_cfg_ovrd.noview = true;
</script>
<script async="" src="/assets/AppMeasurement.js">
</script>
<script>
window.REDUX_STATE = {"appConfig":
{"dataLab":"energy","minimum":"maximum":"getMe":"IneedThisData"}}
</script>
</body>
</html>
"""
soup = BeautifulSoup(html_doc, 'html.parser')
data = json.loads(soup.find('script', 'window.REDUX_STATE').text)
I get the error AttributeError: 'NoneType' object has no attribute 'text', and I'm still stuck at loading the data into a variable.
Assuming that "minimum":"maximum":"getMe" is a typo and the data is actually "minimum":"maximum","getMe", then once the typo is corrected you can use the following code:
soup = BeautifulSoup(html_doc, 'html.parser')
tags = soup.find_all("script")
data = None
for t in tags:
    # t.string is None for a script tag that has no inline text
    text = t.string or ""
    if "window.REDUX_STATE" in text:
        # split on the first "=" only, so an "=" inside the JSON stays intact
        payload = text.split("=", 1)[1].strip()
        print(payload)
        data = json.loads(payload)
print(data)
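Splitting on "=" stops working if the assignment ends with a semicolon, because json.loads rejects trailing characters. As a rough sketch of a more tolerant approach, json.JSONDecoder.raw_decode can parse a single JSON value starting at a given index and ignore whatever follows. The helper name extract_state and the sample script_text (including its trailing ";") are made up for illustration and are not part of the original page.

```python
import json

# Hypothetical script text; the trailing ";" is an assumption for illustration
script_text = 'window.REDUX_STATE = {"appConfig": {"dataLab": "energy", "getMe": "IneedThisData"}};'


def extract_state(text, marker="window.REDUX_STATE"):
    """Return the JSON object assigned to `marker`, or None if it is absent."""
    pos = text.find(marker)
    if pos == -1:
        return None
    # raw_decode does not skip leading whitespace, so start exactly at the "{"
    start = text.find("{", pos)
    if start == -1:
        return None
    # Parses one JSON value beginning at `start`; text after it is ignored
    obj, _end = json.JSONDecoder().raw_decode(text, start)
    return obj


data = extract_state(script_text)
print(data["appConfig"]["getMe"])
```

The same helper could be called with the text of each script tag in the loop above.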
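In your code, soup.find('script', 'window.REDUX_STATE') does not match any tag. A string passed as the second argument is treated as a CSS class name, so the call looks for <script class="window.REDUX_STATE"> rather than a script whose text contains that string. find() therefore returns None, and accessing .text on None raises the AttributeError.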
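If you would rather keep a single find() call, one option is to filter on the tag's text with the string= argument (spelled text= in Beautiful Soup releases before 4.4.0). The following is only a sketch: the HTML is a shortened version of the page with the typo already corrected, and the variable name by_class is made up for illustration.

```python
import json
import re

from bs4 import BeautifulSoup

# Shortened sample page, with the "minimum":"maximum" typo already corrected
html_doc = """
<html><body>
<script src="/assets/AppMeasurement.js"></script>
<script>
window.REDUX_STATE = {"appConfig":
{"dataLab":"energy","minimum":"maximum","getMe":"IneedThisData"}}
</script>
</body></html>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# A bare string as the second argument filters by CSS class, so this finds nothing
by_class = soup.find("script", "window.REDUX_STATE")
print(by_class)

# string= matches against the tag's text instead of its attributes
tag = soup.find("script", string=re.compile(r"window\.REDUX_STATE"))

# Everything after the first "=" is the JSON payload; json.loads ignores
# the surrounding whitespace and newlines
data = json.loads(tag.string.split("=", 1)[1])
print(data["appConfig"]["getMe"])
```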