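I'm scraping a date from a web page, converting it to the day of the year (01/02/2019 = 2), and then subtracting it from another number I've already calculated.
I tried to grab the date with XPath but couldn't get it to work.
Thanks for your help.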
You need to install the requests and BeautifulSoup4 libraries to fetch and parse your HTML page:
pip install requests
pip install BeautifulSoup4
And the answer to your question:
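import requests
import datetime
from bs4 import BeautifulSoup

def get_html(url):
    # requests has no urlopen(); fetch the page with requests.get() instead
    response = requests.get(url)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.text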
# Enter your URL here and the parsing will work with your site:
# html = get_html("https://your/url.html")
# For this example, a small test HTML page is built instead:
html = ['<html><head><title>Page title</title></head>',
        '<body><p id="date">01/02/2019</p></body>',
        '</html>']
# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(''.join(html), 'html.parser')
# Parse the page with bs4 and extract the date string
date = soup.find("p", id="date").string
# Convert it to the day of the year (strftime('%d') would only give the day of the month)
day = datetime.datetime.strptime(date, '%m/%d/%Y').timetuple().tm_yday
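To finish the calculation from the question, here is a minimal sketch of the conversion and the subtraction using only the standard library. The helper name `day_of_year` and the value of `other_number` are hypothetical placeholders, and the format `%m/%d/%Y` is an assumption based on the example 01/02/2019 = 2. If your site writes dates day-first, pass `%d/%m/%Y` instead.

```python
import datetime

def day_of_year(date_text, fmt="%m/%d/%Y"):
    """Return the 1-based day of the year for a date string (hypothetical helper)."""
    return datetime.datetime.strptime(date_text, fmt).timetuple().tm_yday

# Placeholder for the number you have already calculated (assumed value).
other_number = 100

# Subtract the day of the year from that number.
difference = other_number - day_of_year("01/02/2019")
print(difference)
```

With the month-first format, 01/02/2019 is January 2, so `day_of_year` returns 2. Read day-first, the same string would be February 1, which is day 32, so it is worth checking which format the page uses.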
You should check out the BeautifulSoup4 documentation for more on parsing your page: https://www.crummy.com/software/BeautifulSoup/bs4/doc/