Python: converting an object or string column to a time format - "time data does not match format" error - %H is not present in every row

Question

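I asked a similar question before but did not receive any replies.

I have searched many forums for a solution. The other questions involve a year, but mine does not; it is only H:M:S.

I scraped this data from the web, which returned

Time - 36:42 38:34 1:38:32 1:41:18

I need these values in minutes, like this: 36.70 38.57 98.53 101.30

To do that, I tried this: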

time_mins = []
for i in time_list:
    h, m, s = i.split(':')
    math = (int(h) * 3600 + int(m) * 60 + int(s))/60
    time_mins.append(math)

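But that does not work, because 36:42 is not in H:M:S format (it has no hours part, so the split yields only two values). So I tried converting 36:42 with this: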

df1.loc[1:,6] = df1[6]+ timedelta(hours=0)

and this

df1['minutes'] = pd.to_datetime(df1[6], format='%H:%M:%S')

But no luck.

Can I do this at the extraction stage? I have to do it for more than 500 rows.

row_td = soup.find_all('td')

If not, how can I handle it after converting to a DataFrame?

Thanks in advance.

Answer 1

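If your input (a timedelta represented as a string) contains only hours/minutes/seconds (no days etc.) and always contains at least the seconds, you can use a custom function applied to the column: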

import pandas as pd

df = pd.DataFrame({'Time': ['36:42', '38:34', '1:38:32', '1:41:18']})

def to_minutes(s):
    # split string s on ':', reverse so that seconds come first and map to type int
    # multiply the result with elements from tuple (1/60, 1, 60) to get minutes for each value
    # return the sum of these multiplications
    return sum(a*b for a, b in zip(map(int, s.split(':')[::-1]), (1/60, 1, 60)))

df['Minutes'] = df['Time'].apply(to_minutes)
# df['Minutes']
# 0     36.700000
# 1     38.566667
# 2     98.533333
# 3    101.300000
# Name: Minutes, dtype: float64
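
As an alternative, here is a minimal sketch that stays within pandas: pad the values that lack an hours part so every string is H:M:S, then parse with pd.to_timedelta. It assumes each value is either M:S or H:M:S, as in the sample data; the name padded is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Time': ['36:42', '38:34', '1:38:32', '1:41:18']})

# Assumption: every value is either M:S (one colon) or H:M:S (two colons).
# Keep values that already have two colons; prefix the others with a zero hour.
padded = df['Time'].where(df['Time'].str.count(':') == 2, '0:' + df['Time'])

# Parse the padded strings as timedeltas and convert total seconds to minutes.
df['Minutes'] = pd.to_timedelta(padded).dt.total_seconds() / 60
```

This should give the same values as the custom function above. If you would rather convert at the extraction stage, to_minutes also works on a plain list of strings, e.g. [to_minutes(t) for t in time_list].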
