Change weekday names to dates

Question

I am scraping an events website (event names, dates, and times). The output I get in Excel looks like this:

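I want to replace the weekday name with the actual date, e.g. Friday should become 12.03., and so on. The event dates and times are stored in this variable: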

dates_and_times = []

for container in containers:
    event_names.append(container.find('h2').getText())
    date_time = container.find('p', class_='Typography_root__487rx #585163 Typography_body-md__487rx event-card__clamp-line--one Typography_align-match-parent__487rx').getText()
    if date_time.endswith("PM") or date_time.endswith("AM"):
        dates_and_times.append(date_time)

I know I need to replace it somehow using the date and time, but I don't know how. Can someone help me? Thank you.

Here is the full code:

from bs4 import BeautifulSoup
import requests
from datetime import datetime, timedelta
import pandas as pd

# Get the current date
current_date = datetime.now().strftime("%Y-%m-%d")

# Calculate the end date (current date + 7 days)
end_date = (datetime.now() + timedelta(days=7)).strftime("%Y-%m-%d")

# Construct the URL with dynamic start and end dates
URL = f"https://www.eventbrite.com/d/canada--toronto/all-events/?page=1&start_date={current_date}&end_date={end_date}"

header = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36",
    "Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8"
}

response = requests.get(URL, headers=header)
response.raise_for_status()

web_page = response.text
soup = BeautifulSoup(web_page, 'html.parser')

containers = soup.find_all('section', class_='event-card-details')

event_names = []
dates_and_times = []

for container in containers:
    event_names.append(container.find('h2').getText())
    date_time = container.find('p', class_='Typography_root__487rx #585163 Typography_body-md__487rx event-card__clamp-line--one Typography_align-match-parent__487rx').getText()
    if date_time.endswith("PM") or date_time.endswith("AM"):
        dates_and_times.append(date_time)

event_names_clean = list(set(event_names))

data = {"Event Name": event_names_clean, "Event Date and Time": dates_and_times}
df = pd.DataFrame(data)

# Save the DataFrame to an Excel file
df.to_excel("events_data.xlsx", index=False)

print("Data saved to 'events_data.xlsx'")
python web-scraping beautifulsoup
1 Answer

You can parse the event date and time, extract the weekday name, and then replace it with the corresponding date.

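WEEKDAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

for container in containers:
    date_time = container.find('p', class_='Typography_root__487rx #585163 Typography_body-md__487rx event-card__clamp-line--one Typography_align-match-parent__487rx').getText()
    if date_time.endswith("PM") or date_time.endswith("AM"):
        # Assumes the text starts with a full weekday name and a comma, e.g. "Friday, 7:00 PM"
        day_name = date_time.split(',')[0].strip()
        if day_name in WEEKDAYS:
            # Look the weekday up in a list: datetime.strptime(day_name, '%A').weekday()
            # always returns 0, because the parsed date defaults to 1900-01-01 (a Monday).
            # The modulo keeps the offset in 0..6, so the date is never in the past.
            offset = (WEEKDAYS.index(day_name) - datetime.now().weekday()) % 7
            event_date = (datetime.now() + timedelta(days=offset)).strftime("%d.%m.")
            date_time = date_time.replace(day_name, event_date, 1)
        dates_and_times.append(date_time)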
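
The weekday-to-date step can also be pulled out into a small helper, which makes it possible to check the logic without scraping anything. This is only a sketch under a few assumptions that are not in the original code: the function name `weekday_to_date` and its `today` parameter are made up for illustration, and the scraped text is assumed to start with a full English weekday name followed by a comma (for example "Friday, 7:00 PM"). The sketch looks the weekday up in a list rather than parsing it with `strptime`, and returns any text that does not match the assumed format unchanged.

```python
from datetime import date, timedelta

# Monday is 0, matching date.weekday()
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_to_date(text, today=None):
    """Replace a leading weekday name with the next matching date (dd.mm.)."""
    today = today or date.today()
    day_name, sep, rest = text.partition(",")
    day_name = day_name.strip()
    if not sep or day_name not in WEEKDAYS:
        return text  # unexpected format: leave the text untouched
    # Days until the next occurrence of that weekday; 0 means today
    offset = (WEEKDAYS.index(day_name) - today.weekday()) % 7
    event_date = today + timedelta(days=offset)
    return event_date.strftime("%d.%m.") + sep + rest


# Hypothetical reference date: Monday, 11 March 2024
reference = date(2024, 3, 11)
print(weekday_to_date("Friday, 7:00 PM", reference))  # 15.03., 7:00 PM
print(weekday_to_date("Monday, 9:00 AM", reference))  # 11.03., 9:00 AM
```

Inside the scraping loop, a call such as `dates_and_times.append(weekday_to_date(date_time))` could then take the place of the inline calculation. Passing `today` explicitly is only there to make the result reproducible when testing.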