Python: Breaking down sessions with a start datetime and end datetime to the hourly level


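For the past year I have been tracking my gaming sessions, both to get data I care about and to learn Python. Now I would like to know (and plot, but that is not important yet) which hour of the day (0 to 23) I play the most, across the whole period and all games, averaged over every day since tracking began.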

Sample:

session_id  game_id  start_datetime       end_datetime
001         74       2023-02-22 13:15:00  2023-02-22 15:30:00
002         127      2023-02-23 13:30:00  2023-02-23 13:45:00
003         74       2023-02-24 14:40:00  2023-02-24 15:00:00

In the end I would like to see this information (the calculation column is not needed):

hour_of_day  sum_hours_played  avg_hours_played_per_day  calculation
13           1.00              0.33                      (0.75 + 0.25) / 3 days
14           1.33              0.44                      (1.00 + 0.33) / 3 days
15           0.50              0.17                      (0.50) / 3 days

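In short, I don't just want to see in which hours I played (played: 1, not played: 0); I also want to know what proportion of each specific hour I played.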
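To make "proportion of an hour" concrete, here is a minimal sketch using only the standard library. The function name `fraction_of_hour_played` is made up for illustration; it computes the overlap between a session and a single clock hour, using session 001 from the sample.

```python
from datetime import datetime, timedelta


def fraction_of_hour_played(start, end, hour_start):
    """Return the share (0.0 to 1.0) of the clock hour beginning at
    hour_start that is covered by the session [start, end)."""
    hour_end = hour_start + timedelta(hours=1)
    # Overlap of two intervals: earliest end minus latest start
    overlap = min(end, hour_end) - max(start, hour_start)
    # A negative overlap means the session does not touch this hour
    return max(overlap, timedelta(0)).total_seconds() / 3600


# Session 001 from the sample: 13:15 to 15:30 on 2023-02-22
session_start = datetime(2023, 2, 22, 13, 15)
session_end = datetime(2023, 2, 22, 15, 30)

for hour in (13, 14, 15, 16):
    hour_start = datetime(2023, 2, 22, hour)
    print(hour, fraction_of_hour_played(session_start, session_end, hour_start))
```

For this session the hours 13, 14, 15 and 16 come out as 0.75, 1.0, 0.5 and 0.0, which matches the first terms of the calculation column above.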

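I have seen a few approaches online, but almost all of them only count or sum single yes/no events per month or per day. They do not compute the proportion of a day or an hour.

So I would be glad for any hints.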

python-3.x pandas datetime process duration
1 Answer

Setup:

import pandas as pd

# Load your data into a DataFrame
data = {
    'session_id': [1, 2, 3],
    'game_id': [74, 127, 74],
    'start_datetime': ['2023-02-22 13:15:00', '2023-02-23 13:30:00', '2023-02-24 14:40:00'],
    'end_datetime': ['2023-02-22 15:30:00', '2023-02-23 13:45:00', '2023-02-24 15:00:00']
}

df = pd.DataFrame(data)

# Convert the 'start_datetime' and 'end_datetime' columns to datetime objects
df['start_datetime'] = pd.to_datetime(df['start_datetime'])
df['end_datetime'] = pd.to_datetime(df['end_datetime'])

# Calculate the duration of each gaming session
df['duration'] = df['end_datetime'] - df['start_datetime']

# Initialize an empty dictionary to store the hours played
hours_played = {i: 0 for i in range(24)}

The trick is to split each session into clock hours:

# Break down each session into hours and sum the proportion of hours played
for _, row in df.iterrows():
    start = row['start_datetime']
    end = row['end_datetime']
    duration = row['duration']

    # Loop over the hours involved
    while start < end:

        # Calculate the start and end of the clock hour currently considered
        # (sub-second parts are zeroed too, so the boundary falls exactly on the hour)
        hour_start = start.replace(minute=0, second=0, microsecond=0, nanosecond=0)
        hour_end = hour_start + pd.Timedelta(hours=1)

        played = min(hour_end, end) - start  # Take whichever ends first (the hour or the session) and subtract the start time
        hours_played[start.hour] += played.total_seconds() / 3600  # Add the time played to the current value in the dictionary

        start = hour_end  # For the (possible) next iteration of the while loop, set the start to the end of the hour currently considered

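# Calculate the average hours played per day.
# Normalize both timestamps to midnight first so that calendar days are counted;
# otherwise a span such as 23:00 on day 1 to 01:00 on day 3 would count as 2 days.
total_days = (df['end_datetime'].max().normalize() - df['start_datetime'].min().normalize()).days + 1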
avg_hours_played = {hour: hours / total_days for hour, hours in hours_played.items()}

# Create a DataFrame to display the results
results = pd.DataFrame(list(avg_hours_played.items()), columns=['hour_of_day', 'avg_hours_played_per_day'])
results['sum_hours_played'] = [hours_played[hour] for hour in results['hour_of_day']]
results = results[['hour_of_day', 'sum_hours_played', 'avg_hours_played_per_day']]
print(results)

I hope my comments are understandable.
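As a cross-check, here is a rough alternative sketch that avoids the explicit `while` loop. It is not the method used above: it expands every session into one timestamp per played minute and counts minutes per hour of day. It assumes that sessions start and end on whole minutes (true for the sample data) and that the number of sessions is small enough for this expansion to fit in memory. The names `sessions`, `played_minutes` and `summary` are only illustrative.

```python
import pandas as pd

sessions = pd.DataFrame({
    "start_datetime": pd.to_datetime(
        ["2023-02-22 13:15:00", "2023-02-23 13:30:00", "2023-02-24 14:40:00"]
    ),
    "end_datetime": pd.to_datetime(
        ["2023-02-22 15:30:00", "2023-02-23 13:45:00", "2023-02-24 15:00:00"]
    ),
})

one_minute = pd.Timedelta(minutes=1)

# One timestamp per played minute; the end minute itself is not played,
# so each range stops one minute before the session end.
played_minutes = pd.concat(
    [
        pd.Series(pd.date_range(start, end - one_minute, freq="min"))
        for start, end in zip(sessions["start_datetime"], sessions["end_datetime"])
    ],
    ignore_index=True,
)

# Count minutes per hour of day, fill hours never played with 0,
# and convert minutes to hours.
sum_hours_played = (
    played_minutes.dt.hour.value_counts().reindex(range(24), fill_value=0) / 60
)

# Calendar days covered by the tracking period, first and last day included.
first_day = sessions["start_datetime"].min().normalize()
last_day = sessions["end_datetime"].max().normalize()
total_days = (last_day - first_day).days + 1

summary = pd.DataFrame({
    "sum_hours_played": sum_hours_played,
    "avg_hours_played_per_day": sum_hours_played / total_days,
})
summary.index.name = "hour_of_day"

# Show only the hours that appear in the question's expected table
print(summary.loc[13:15].round(2))
```

On the sample data this gives 1.00, 1.33 and 0.50 hours for hours 13 to 15 and averages of 0.33, 0.44 and 0.17, the same figures as in the table in the question. The loop-based version is the better choice when sessions do not fall on whole minutes.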
