Unable to write dates to match the source content

Question  Votes: 1  Answers: 1

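I've written a script in Python, combined with Selenium, to parse some dynamic content from a web page and write it to a CSV file. The script below does this without errors, except for one thing: the date.

If you look at the site's content, you'll see that the table data does not mention the year.

However, when I click any cell under the Date column header in the output file, Excel treats it as the current year by default, whereas the date should be in 2004. How can I set the year to 2004, in line with what the second image below shows?

The script I'm trying: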

import csv
import datetime
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "http://info.nowgoal.com/en/League/2004-2005/36.html"

def get_information(driver,link):
    driver.get(link)
    for items in wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR,'table#Table3 tr')))[2:]:
        try:
            date = items.find_elements_by_css_selector("td")[1].text.split("\n")[0]
            date = datetime.datetime.strptime(date, '%m-%d').strftime('%d-%B')
        except Exception: date = ""
        try:
            match_name = items.find_elements_by_css_selector("td")[2].find_element_by_tag_name("a").text
        except Exception: match_name = ""
        writer.writerow([date,match_name])
        print(date,match_name)

if __name__ == '__main__':
    driver = webdriver.Chrome()
    wait = WebDriverWait(driver,10)
    with open("outputfile.csv","w",newline="") as infile:
        writer = csv.writer(infile)
        writer.writerow(['Date','Match name'])
        try:
            get_information(driver,url)
        finally:  
            driver.quit()

This is how the dates appear in the CSV file: [image]

This is what you can see on the web page:

[image]

python python-3.x selenium selenium-webdriver web-scraping
1 Answer
1
vote

You can add the correct year to the cell as follows:

import datetime

date = "05-15"
date = datetime.datetime.strptime(date, '%m-%d').replace(year=2004).strftime('%d-%B-%Y')

print(date)

This will display:

15-May-2004
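
The URL in the question points at the 2004-2005 season, so a fixed `replace(year=2004)` would presumably mislabel the matches played in 2005. A minimal sketch of a season-aware variant follows. The helper name `add_year` and the `season_start_month` cutoff are illustrative assumptions, not part of the original answer; it assumes every month from July onward belongs to the first calendar year of the season. Building the date directly from the integers also sidesteps `strptime`'s default year of 1900, which rejects `02-29`.

```python
import datetime


def add_year(month_day, season_start_year, season_start_month=7):
    """Return 'DD-Month-YYYY' for an 'MM-DD' string taken from a two-year season.

    Hypothetical helper: the July cutoff is an assumption about when the
    season begins, not something stated on the scraped page.
    """
    month, day = (int(part) for part in month_day.split("-"))
    # Months at or after the cutoff fall in the first calendar year of the
    # season; earlier months fall in the second.
    if month >= season_start_month:
        year = season_start_year
    else:
        year = season_start_year + 1
    return datetime.date(year, month, day).strftime("%d-%B-%Y")


print(add_year("08-14", 2004))  # 14-August-2004
print(add_year("05-15", 2004))  # 15-May-2005
```

In the question's script this could stand in for the `strptime(...).strftime(...)` line inside the existing `try` block, so malformed cells would still fall back to an empty string.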