Scraping a table returns only the "table" element, not the contents of the table. Here is my code:
from urllib.request import urlopen
from bs4 import BeautifulSoup
url = "http://data.eastmoney.com/gdhs/detail/600798.html"
html = urlopen(url)
soup = BeautifulSoup(html, 'lxml')
table = soup.find_all('table')
print(table)
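Your code finds the table just fine. Because the table is made up of multiple elements (tr / td), you have to loop through those elements to get the inner text of the table cells.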
# Take the first table on the page. For the second table, use soup.find_all('table')[1], and so on.
table = soup.find_all('table')[0]
# Slice off the first row to skip the table headers. To include the header row, iterate over table('tr') without the slice.
for row in table('tr')[1:]:
    cells = row('td')
    if cells:  # skip rows that contain no <td> cells
        # Print the text of the first cell in each row.
        print(cells[0].getText().strip())
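
If you want every cell rather than just the first column, the same loop can collect each row into a list. This is only a minimal sketch: `SAMPLE_HTML` is made-up placeholder markup standing in for the downloaded page, `table_rows` is a hypothetical helper name, and it uses the built-in "html.parser" backend instead of lxml so it runs without extra parser dependencies.

```python
from bs4 import BeautifulSoup

# Made-up placeholder markup, NOT the real page content.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td> a </td><td> 1 </td></tr>
  <tr><td> b </td><td> 2 </td></tr>
</table>
"""

def table_rows(html, index=0):
    """Return the cell text of table number `index` as a list of rows."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find_all("table")[index]
    rows = []
    for tr in table.find_all("tr"):
        # Collect header (th) and data (td) cells alike, with whitespace stripped.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:  # skip rows that contain no cells
            rows.append(cells)
    return rows

rows = table_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

With the real page you would pass the downloaded HTML in place of `SAMPLE_HTML`, and use `rows[1:]` if you do not want the header row.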