How can I stop Pandas from losing one digit of datetime2 precision?


I am using the read_sql_query function to read some SQL Server data into Pandas. The problem is that I lose one digit of precision on DATETIME2 columns.

Code example:

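import pandas as pd

# source_engine is an existing SQLAlchemy engine connected to SQL Server
query = pd.read_sql_query(
'''
    SELECT CAST('2021-05-06 15:44:29.1234567' AS DATETIME2(7)) AS ServiceDate
''', source_engine)

# read_sql_query already returns a DataFrame, so this wrap is a no-op
df = pd.DataFrame(query)

df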

Result:

Out[71]: 
0   2021-05-06 15:44:29.123456

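This causes real problems when comparing data, so I need to make sure the precision is the same on both sides.

How can I prevent this from happening?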

python python-3.x sql-server pandas dataframe
1 Answer

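I just ran into this issue. With both pyodbc and pymssql, the datetime value is truncated to microseconds (the maximum precision of Python's datetime.datetime), even though the DataFrame column is NumPy datetime64[ns].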

import pandas as pd
import sqlalchemy as sa

engine = sa.create_engine("mssql+pyodbc://scott:tiger^5HHH@mssql_199")

with engine.begin() as conn:
    conn.exec_driver_sql("drop table if exists zzz")
    conn.exec_driver_sql("create table zzz (dt2 datetime2)")
    conn.exec_driver_sql(
        "insert into zzz (dt2) values ('2001-01-01 00:00:00.1234567')"
    )

df = pd.read_sql_query("select dt2 from zzz", engine)

print(df.info())
"""
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   dt2     1 non-null      datetime64[ns]
dtypes: datetime64[ns](1)
memory usage: 140.0 bytes
None
"""

print(df)
"""
                         dt2
0 2001-01-01 00:00:00.123456
"""
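
The truncation can be seen without any database involved. The following is only a minimal sketch, under the assumption that the driver hands pandas ordinary datetime.datetime objects; the variable names are illustrative and not part of the code above.

```python
from datetime import datetime

import pandas as pd

text = "2001-01-01 00:00:00.1234567"  # 7 fractional digits, as in DATETIME2(7)

# datetime.datetime stores at most microseconds, so a value returned as a
# datetime object can only carry the first 6 fractional digits.
as_datetime = datetime(2001, 1, 1, 0, 0, 0, 123456)

# pandas.Timestamp has nanosecond resolution and keeps the 7th digit
# when it parses the text directly.
as_timestamp = pd.Timestamp(text)

print(as_datetime.microsecond)  # 123456
print(as_timestamp.nanosecond)  # 700

# The two values differ by exactly the digit that was dropped.
print(as_timestamp - pd.Timestamp(as_datetime))
```

Because the seventh digit is already gone by the time the value reaches pandas, the datetime64[ns] dtype of the column cannot bring it back.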

The workaround is to retrieve the column as varchar and use the dtype= argument to convert it to the desired type:

df = pd.read_sql_query(
    "select cast(dt2 as varchar(30)) as dt2 from zzz",
    engine,
    dtype=dict(dt2="datetime64[ns]"),
)

print(df.info())
"""
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   dt2     1 non-null      datetime64[ns]
dtypes: datetime64[ns](1)
memory usage: 140.0 bytes
None
"""

print(df)
"""
                            dt2
0 2001-01-01 00:00:00.123456700
"""
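
If dtype= is not an option (as far as I know it was added to read_sql_query in pandas 1.3), the same idea can probably be expressed by parsing the varchar column with pd.to_datetime after the read. This is a rough sketch rather than a tested recipe against SQL Server: the DataFrame below is a hand-built stand-in for the query result, and only the column name dt2 is taken from the example above.

```python
import pandas as pd

# Hand-built stand-in for the result of
#   select cast(dt2 as varchar(30)) as dt2 from zzz
# The strings are assumed to look like SQL Server's text form of DATETIME2(7).
df = pd.DataFrame({"dt2": ["2001-01-01 00:00:00.1234567", None]})

# Parse the strings in pandas; the 7th fractional digit is preserved
# and missing values (SQL NULL) become NaT.
df["dt2"] = pd.to_datetime(df["dt2"])

print(df)
print(df["dt2"].iloc[0].nanosecond)  # 700
```

Either way the key point is the same: the value has to travel from the driver to pandas as text, not as a datetime.datetime object.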