I have a CSV file with wind direction (wd) and wind speed (ws):
datetime wd ws
06.02.2023 00:55 297 3.2
06.02.2023 01:55 296 2.7
06.02.2023 02:55 299 3.0
06.02.2023 03:55 302 3.5
I want to resample the timestamps to the full hour, like this:
datetime wd ws
06.02.2023 01:00 297 3.2
06.02.2023 02:00 296 2.7
06.02.2023 03:00 299 3.0
06.02.2023 04:00 302 3.5
So far I have been trying this script:
import pandas as pd
filename = r"data.csv"
df = pd.read_csv(filename, header=None, sep=";", skiprows=1, usecols=[0, 3, 4], names=["local_time", "wd", "ws"])
# specifying the date format
df['index_time'] = pd.to_datetime(df['local_time'], format='%d.%m.%Y %H:%M')
# change date-time column to index
df.set_index('index_time', inplace=True)
# trying to resample
df_resampled = df.resample(rule='H')
print(df_resampled) outputs:
local_time wd ws
index_time
2023-02-06 00:00:00 NaN NaN NaN
2023-02-06 01:00:00 NaN NaN NaN
2023-02-06 02:00:00 NaN NaN NaN
2023-02-06 03:00:00 NaN NaN NaN
How can I resample only the time while leaving the data unchanged?
Maybe I am misunderstanding, but it looks like you just want to round each time up or down to the nearest hour? If so, the following code should work:
import pandas as pd
from datetime import datetime, timedelta
data = {
'date': ['06.02.2023', '06.02.2023', '06.02.2023', '06.02.2023'],
'time': ['0:55', '1:55', '2:55', '3:55'],
'wd': [297, 296, 299, 302],
'ws': [3.2, 2.7, 3.0, 3.5]
}
df = pd.DataFrame(data)
def round_up_to_nearest_hour(time_str):
    time_obj = datetime.strptime(time_str, '%H:%M')  # Convert the time string to a datetime object
    if time_obj.minute >= 30:  # If the minutes are 30 or more...
        time_obj += timedelta(hours=1)  # ...increase the hour by 1
    time_obj = time_obj.replace(minute=0)  # Reset the minutes to zero
    return time_obj.strftime('%H:%M')  # Return the new time as a string

df['time'] = df['time'].apply(round_up_to_nearest_hour)  # Apply this function to every value in the time column of df
I ran this in Jupyter (see below: I switched one value to under the 30-minute mark), and it looks like it does what you want? Hope this helps!
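The function above rounds only the time string, so a value such as 23:55 would become 00:00 without the date moving to the next day. As a possible alternative, here is a minimal sketch that rounds the full timestamp with pandas' own `Series.dt.round`. It assumes the date and time sit in a single `datetime` column, as in the question; the `rounded` column name and the last row (23:55) are made up for illustration and are not from the original data. Times exactly on :30 may be rounded differently than by the function above.

```python
import pandas as pd

# Sample rows from the question, plus one hypothetical row (23:55)
# added only to illustrate the rollover past midnight.
df = pd.DataFrame({
    "datetime": [
        "06.02.2023 00:55",
        "06.02.2023 01:55",
        "06.02.2023 02:55",
        "06.02.2023 03:55",
        "06.02.2023 23:55",  # hypothetical row, not in the original data
    ],
    "wd": [297, 296, 299, 302, 300],
    "ws": [3.2, 2.7, 3.0, 3.5, 3.1],
})

# Parse the day-first strings into real timestamps.
df["datetime"] = pd.to_datetime(df["datetime"], format="%d.%m.%Y %H:%M")

# Round every timestamp to the nearest full hour; the date is carried along,
# so 23:55 on 06.02 becomes 00:00 on 07.02.
df["rounded"] = df["datetime"].dt.round("60min")

# Use the rounded timestamps as the index; wd and ws are left untouched.
df_rounded = df.set_index("rounded")
print(df_rounded[["wd", "ws"]])
```

Because nothing is aggregated here, two readings that round to the same hour would both be kept as separate rows with a duplicate index value.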