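I would like to know how to scrape the following website: http://chonos.ifop.cl/flow/
There is a map on the right-hand side of the page. When you click one of the points on it, a time series is shown in a Highcharts chart on the left. I want to extract these series iteratively, but I have not managed to yet. This is my code so far: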
import urllib.request
from bs4 import BeautifulSoup

site_url = 'http://chonos.ifop.cl/flow/'
r = urllib.request.urlopen(site_url)
site_content = r.read()
s = BeautifulSoup(site_content, 'html.parser')
print(s.prettify()[:100])
# Look for table cells and tables in the static HTML
s.find_all('td')
s.find_all('table')
s.find_all('table', attrs={'class': 'uk-table uk-table-small uk-table-striped'})
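When I use DevTools (tab: Network) in Firefox/Chrome to see all the requests the browser sends to the server when I click on the map, I see a URL like the one used in the code below. It returns JSON data that contains a key named series.
You can open this link directly in the browser to view the JSON data.
I can also use this link in code: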
import requests
url = 'http://chonos.ifop.cl/flow/mapclick'
params = {
'REQUEST': 'GetFeatureInfo',
'SERVICE': 'WMS',
'SRS': 'EPSG:4326',
'STYLES': '',
'TRANSPARENT': 'true',
'VERSION': '1.1.1',
'FORMAT': 'image/png',
'BBOX': '-84.48486328125,-50.16282433381728,-59.54589843750001,-45.75219336063107',
'HEIGHT': '300',
'WIDTH': '1135',
'LAYERS': 'aguadulce:outlet_points',
'QUERY_LAYERS': 'aguadulce:outlet_points',
'INFO_FORMAT': 'text/html',
'LAT': '-46.528634695271684',
'LON': '-71.41113281250001',
'X': '595',
'Y': '51',
}
response = requests.get(url, params=params)
data = response.json()
# Each item is a [timestamp, value] pair
for item in data['series']['sim']:
    print(item)
Result:
[283996800000, 985.352]
[284083200000, 1115.734]
[284169600000, 1099.139]
[284256000000, 1146.895]
[284342400000, 1127.501]
[284428800000, 1146.251]
[284515200000, 1048.681]
[284601600000, 939.899]
[284688000000, 941.33]
[284774400000, 905.143]
...
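The first number in each pair looks like a Unix timestamp in milliseconds, which is the convention Highcharts uses for time axes. A minimal sketch of converting it to a readable date, assuming the timestamps are in UTC (I have not confirmed this with the site); the helper name `to_utc` is my own:

```python
from datetime import datetime, timezone

def to_utc(ms):
    """Convert a millisecond Unix timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# First two rows copied from the output above
sample = [[283996800000, 985.352], [284083200000, 1115.734]]

# Assumption: timestamps are milliseconds since 1970-01-01 in UTC
readable = [(to_utc(ms).date().isoformat(), value) for ms, value in sample]
print(readable)
# [('1979-01-01', 985.352), ('1979-01-02', 1115.734)]
```

Consecutive timestamps differ by 86400000 ms, so the series appears to be daily.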
In the link I see LAT= and LON=, so if you change the latitude and longitude you should get data for a different location.
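A minimal sketch of preparing several queries with different coordinates. It assumes the server selects the point from LAT and LON alone; I have not verified this, and X, Y and BBOX may also need to match the new position. The helper `with_location` and the second pair of coordinates are made up for illustration:

```python
def with_location(base_params, lat, lon):
    """Return a copy of base_params with LAT and LON replaced (hypothetical helper)."""
    updated = dict(base_params)  # copy, so the original dict is left untouched
    updated['LAT'] = str(lat)
    updated['LON'] = str(lon)
    return updated

# Shortened stand-in for the full params dict used with /flow/mapclick
base = {
    'REQUEST': 'GetFeatureInfo',
    'LAT': '-46.528634695271684',
    'LON': '-71.41113281250001',
}

# Example coordinates; the second pair is invented, not taken from the site
candidates = [(-46.528634695271684, -71.41113281250001), (-45.0, -72.0)]
queries = [with_location(base, lat, lon) for lat, lon in candidates]

for query in queries:
    print(query['LAT'], query['LON'])
```

Each dict in `queries` can then be passed as `requests.get(url, params=query)` in place of the fixed `params`.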
EDIT:
As @Modammed said, when you click one of the special locations, the page loads the data from a similar link:
https://chonos.ifop.cl/flow/stnclick?index=50
You can use this link the same way as the previous one.
If you change index, you get a different location.
import requests
url = 'http://chonos.ifop.cl/flow/stnclick'
params = {
'index': 0
}
for number in range(10):
    params['index'] = number
    response = requests.get(url, params=params)
    data = response.json()
    print('---', data['name'], '---')
    #for item in data['series']['sim'][:5]:  # show first 5 values
    for item in data['series']['sim']:  # show all values
        print(item)
Result (first 5 values for each location):
--- Rio Caleta En Tierra Del Fuego ---
[283996800000, 4.41]
[284083200000, 4.27]
[284169600000, 4.13]
[284256000000, 4.0]
[284342400000, 3.95]
--- Rio La Plata Antes Junta Rio Hueyusca ---
[283996800000, 4.43]
[284083200000, 4.15]
[284169600000, 3.88]
[284256000000, 3.63]
[284342400000, 3.39]
--- Rio Hueyusca En Camarones ---
[283996800000, 12.46]
[284083200000, 11.71]
[284169600000, 11.0]
[284256000000, 10.33]
[284342400000, 9.7]
--- Rio Negro En Las Lomas ---
[283996800000, 9.97]
[284083200000, 8.98]
[284169600000, 8.08]
[284256000000, 7.3]
[284342400000, 6.61]
--- Rio Maullin En Las Quemas ---
[283996800000, 35.37]
[284083200000, 33.34]
[284169600000, 31.53]
[284256000000, 29.8]
[284342400000, 28.47]
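If you want to keep the series instead of printing them, here is a minimal sketch that flattens one location into CSV rows. The function names and the column names are my own choices, and the timestamps are again assumed to be milliseconds in UTC:

```python
import csv
import io
from datetime import datetime, timezone

def series_to_rows(name, points):
    """Flatten [timestamp_ms, value] pairs into (name, ISO date, value) rows."""
    rows = []
    for ms, value in points:
        # Assumption: timestamps are milliseconds since the Unix epoch, in UTC
        day = datetime.fromtimestamp(ms / 1000, tz=timezone.utc).date().isoformat()
        rows.append((name, day, value))
    return rows

def write_csv(rows, handle):
    """Write a header line followed by the rows to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(['station', 'date', 'value'])
    writer.writerows(rows)

# Sample copied from the output above
sample_name = 'Rio Negro En Las Lomas'
sample_points = [[283996800000, 9.97], [284083200000, 8.98]]

buffer = io.StringIO()  # stands in for a real file in this sketch
write_csv(series_to_rows(sample_name, sample_points), buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Inside the loop you would call `series_to_rows(data['name'], data['series']['sim'])` for each index and write to a file opened with `open('series.csv', 'w', newline='')` instead of the `StringIO` buffer.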