After converting a time column from object to datetime dtype in a pandas DataFrame, the data changes, and the change looks out of place

import pandas as pd
order_details_id  order_id order_date   order_time  item_id
0                     1         1     1/1/23  11:38:36 AM    109.0
1                     2         2     1/1/23  11:57:40 AM    108.0
2                     3         2     1/1/23  11:57:40 AM    124.0
3                     4         2     1/1/23  11:57:40 AM    117.0
4                     5         2     1/1/23  11:57:40 AM    129.0

df['order_date'] = pd.to_datetime(df['order_date'])
print(df)
order_details_id  order_id  order_date   order_time    item_id
0                     1         1  2023-01-01    11:38:36 AM      109
1                     2         2  2023-01-01    11:57:40 AM      108
2                     3         2  2023-01-01    11:57:40 AM      124
3                     4         2  2023-01-01    11:57:40 AM      117
4                     5         2  2023-01-01    11:57:40 AM      129
   

df['order_time'] = pd.to_datetime(df['order_time'])
print(df)
order_details_id  order_id  order_date          order_time     item_id
0                     1         1  2023-01-01   2023-12-29 11:38:36      109
1                     2         2  2023-01-01   2023-12-29 11:57:40      108
2                     3         2  2023-01-01   2023-12-29 11:57:40      124
3                     4         2  2023-01-01   2023-12-29 11:57:40      117
4                     5         2  2023-01-01   2023-12-29 11:57:40      129

I know that format=%y/%m/%d is mandatory in datetime. The problem is in the order_time column, where you will notice the date changes from 2023-01-01 to 2023-12-29.
python pandas datetime type-conversion format
1 Answer

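You seem to expect the time column to carry date information as well, when in fact both columns are needed to compute the exact date and time. The order_time strings contain no date, so pd.to_datetime has to fill one in, and the 2023-12-29 you see is most likely just the day the code was run.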
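A small sketch of why a date shows up at all: a datetime column cannot hold a time without a date, so some date is always filled in. Assuming the times follow the 12-hour pattern `%I:%M:%S %p` (an assumption based on the sample rows), passing that format explicitly makes the placeholder date 1900-01-01 instead of the current day, and `.dt.time` extracts plain time objects if only the clock time is wanted.

```python
import datetime
import pandas as pd

# Sample values copied from the question's order_time column.
times = pd.Series(["11:38:36 AM", "11:57:40 AM"], name="order_time")

# With an explicit time-only format, the missing date defaults to 1900-01-01.
parsed = pd.to_datetime(times, format="%I:%M:%S %p")
print(parsed)

# If only the clock time matters, keep datetime.time objects (object dtype).
time_only = parsed.dt.time
print(time_only)
```

Either way, the date part of such a column is a placeholder and should not be read as the order date.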

Assemble them before the datetime conversion:

df['order_datetime'] = pd.to_datetime(df['order_date'].str.cat(df['order_time'], sep=' '))

Output:

   order_details_id  order_id order_date   order_time  item_id      order_datetime
0                 1         1     1/1/23  11:38:36 AM    109.0 2023-01-01 11:38:36
1                 2         2     1/1/23  11:57:40 AM    108.0 2023-01-01 11:57:40
2                 3         2     1/1/23  11:57:40 AM    124.0 2023-01-01 11:57:40
3                 4         2     1/1/23  11:57:40 AM    117.0 2023-01-01 11:57:40
4                 5         2     1/1/23  11:57:40 AM    129.0 2023-01-01 11:57:40

Data types:

df.dtypes

order_details_id             int64
order_id                     int64
order_date                  object
order_time                  object
item_id                    float64
order_datetime      datetime64[ns]
dtype: object
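The same concatenation can also be parsed with an explicit format, which avoids any guessing about day/month order. This is only a sketch: the format string `%m/%d/%y %I:%M:%S %p` assumes the dates are month-first with a two-digit year, which the sample value 1/1/23 cannot confirm.

```python
import pandas as pd

# Two rows in the same shape as the question's data.
df = pd.DataFrame({
    "order_date": ["1/1/23", "1/1/23"],
    "order_time": ["11:38:36 AM", "11:57:40 AM"],
})

# Join date and time into one string per row, then parse once.
combined = df["order_date"].str.cat(df["order_time"], sep=" ")

# Assumed layout: month/day/two-digit year, 12-hour clock with AM/PM.
df["order_datetime"] = pd.to_datetime(combined, format="%m/%d/%y %I:%M:%S %p")
print(df)
```

If the real data is day-first, swapping `%m` and `%d` in the format string would be the corresponding change.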

Reference: https://stackoverflow.com/a/19378497/12846804
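In the question, order_date had already been converted to datetime before order_time was touched, in which case string concatenation no longer applies. One possible workaround, sketched here under the assumption that the times match `%I:%M:%S %p`, is to turn each time into an offset from midnight and add it to the date. The names `parsed_time` and `time_offset` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    # order_date is already datetime, as after the question's first conversion.
    "order_date": pd.to_datetime(["1/1/23", "1/1/23"], format="%m/%d/%y"),
    "order_time": ["11:38:36 AM", "11:57:40 AM"],
})

# Parse the time strings; the date part is a placeholder.
parsed_time = pd.to_datetime(df["order_time"], format="%I:%M:%S %p")

# Subtract midnight of the placeholder day to get a timedelta since 00:00.
time_offset = parsed_time - parsed_time.dt.normalize()

# Date at midnight plus the offset gives the full timestamp.
df["order_datetime"] = df["order_date"] + time_offset
print(df)
```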


Alternatively, you may have read the initial dataframe incorrectly. Read order_date and order_time as a single column, like this:

   order_details_id  order_id     order_date_time  item_id
0                 1         1  1/1/23 11:38:36 AM    109.0
1                 2         2  1/1/23 11:57:40 AM    108.0
2                 3         2  1/1/23 11:57:40 AM    124.0
3                 4         2  1/1/23 11:57:40 AM    117.0
4                 5         2  1/1/23 11:57:40 AM    129.0

You can build it with this constructor:

df = pd.DataFrame({'order_details_id': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5},
                   'order_id': {0: 1, 1: 2, 2: 2, 3: 2, 4: 2},
                   'order_date_time': {0: '1/1/23 11:38:36 AM',
                                       1: '1/1/23 11:57:40 AM',
                                       2: '1/1/23 11:57:40 AM',
                                       3: '1/1/23 11:57:40 AM',
                                       4: '1/1/23 11:57:40 AM'},
                   'item_id': {0: 109.0, 1: 108.0, 2: 124.0, 3: 117.0, 4: 129.0}})

Then your line will work:

df['order_dt'] = pd.to_datetime(df['order_date_time'])

   order_details_id  order_id     order_date_time  item_id            order_dt
0                 1         1  1/1/23 11:38:36 AM    109.0 2023-01-01 11:38:36
1                 2         2  1/1/23 11:57:40 AM    108.0 2023-01-01 11:57:40
2                 3         2  1/1/23 11:57:40 AM    124.0 2023-01-01 11:57:40
3                 4         2  1/1/23 11:57:40 AM    117.0 2023-01-01 11:57:40
4                 5         2  1/1/23 11:57:40 AM    129.0 2023-01-01 11:57:40

So, are you sure order_date and order_time are two separate columns?
