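Some elements on the page cannot be retrieved. How do I crawl them?
Items crawled (2): address, earth
Items not crawled (1): point
The last part, "points = soup.select('.addr_point')", cannot be crawled, and I don't know why (it is the area in the red dotted box).
Please advise.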
import urllib.request
from bs4 import BeautifulSoup
import re
url = 'http://www.dooinauction.com/auction/ca_list.php'
req = urllib.request.Request(url)
html = urllib.request.urlopen(req).read()
soup = BeautifulSoup(html, 'html.parser')
tots = soup.select('div.title_left font') #total
tot = int(re.findall(r'\d+', tots[0].text)[0])
print(f'total : {tot}건')
url = f'http://www.dooinauction.com/auction/ca_list.php?total_record={tot}&search_fm_off=1&search_fm_off=1&start=0'
html = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html, 'html.parser')
addrs = soup.select('.addr')  # crawling OK
a_earths = soup.select('.list_class.bold')  # crawling OK
points = soup.select('.addr_point')  # crawling fails
print()
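I looked at your site and could not find the addr_point part anywhere on the page. I think that may be the reason the selector matches nothing. (screenshot)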
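One way to test that guess is to count which class names actually occur in the HTML the script receives. Below is a minimal sketch using only the standard library. The `ClassCounter` and `count_classes` names are made up for this example, and `sample_html` is a hypothetical stand-in, not the real markup of the site.

```python
from collections import Counter
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts every class name that appears on any start tag."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                self.counts.update(value.split())


def count_classes(html_text):
    """Return a Counter mapping class name -> number of tags carrying it."""
    parser = ClassCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


# Hypothetical sample standing in for the downloaded page (assumption).
sample_html = """
<table>
  <tr>
    <td class="addr">address text</td>
    <td class="list_class bold">123</td>
  </tr>
</table>
"""

counts = count_classes(sample_html)
for name in ("addr", "list_class", "addr_point"):
    print(name, counts[name])
```

With the real page, pass the decoded response body instead of `sample_html` (the correct encoding for this site is not stated here, so check the response headers). If `addr_point` comes back as 0 while it is visible in the browser's inspector, the element is probably absent from the served HTML, for example because it is added later by a script or the class name differs, and no selector on that HTML will find it.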