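I have a table in a database containing millions of rows of transaction data spanning more than 10 years. Since importing all of them would obviously be wasteful, I'm trying to import a subset limited to a specific range of months. When I use the following code to test the connection and import the first 1000 rows, it works fine, but when I specify a date range in the WHERE clause, it returns an empty DataFrame.
I would greatly appreciate any help in fixing this. Thanks in advance.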
import pyodbc
import pandas as pd
conn = pyodbc.connect('Driver={SQL Server};'\
'Server=NAME;'\
'Database=DBNAME;'\
'Trusted_Connection=yes;')
tquery = """SELECT TOP (1000) * FROM [SALES Transactions_V];"""
df = pd.read_sql_query(tquery, conn)
df.dtypes
Output:
DW_Id int64
Company object
Campaign Initiative object
Closing Entry bool
Department Code object
Description object
Document No object
Document Type int64
Entry No int64
Expense Type object
GL Account No object
Incremental Field datetime64[ns]
Posting Date datetime64[ns]
Strategic Initiative object
Vendor No object
Vendor Name object
Amount float64
GBP Amount float64
Actual per CWT object
DW_Batch int64
DW_SourceCode object
DW_TimeStamp datetime64[ns]
dtype: object
df.head()
DW_Id Company Campaign Initiative Closing Entry Department Code Description Document No Document Type Entry No Expense Type ... Posting Date Strategic Initiative Vendor No Vendor Name Amount GBP Amount Actual per CWT DW_Batch DW_SourceCode DW_TimeStamp
0 1 ABC Co.,LLC None False AGDATA INC. PMJ10000 1 1 None ... 2007-02-27 None None None -125.25 0.0 None 13726 Nav 2020-05-11 08:50:37.437
1 2 ABC Co.,LLC None False AGDATA INC. PMJ10000 1 2 None ... 2007-02-27 None AGD01 AGDATA, INC. 125.25 0.0 None 13726 Nav 2020-05-11 08:50:37.437
2 3 ABC Co.,LLC None False AGDATA INC. PMJ10000 1 3 None ... 2007-02-27 None AGD01 AGDATA, INC. 125.25 0.0 None 13726 Nav 2020-05-11 08:50:37.437
However, when I use the following code to filter for dates between 04-01-2020 and 04-30-2020, it gives me an empty DataFrame:
df1 = pd.read_sql_query('SELECT * FROM [SALES Transactions_V] WHERE [Posting Date] BETWEEN ''2020-04-01'' AND ''2020-04-30'';', conn)
df1.dtypes
DW_Id object
Company object
Campaign Initiative object
Closing Entry object
Department Code object
Description object
Document No object
Document Type object
Entry No object
Expense Type object
GL Account No object
Incremental Field object
Posting Date object
Strategic Initiative object
Vendor No object
Vendor Name object
Amount object
GBP Amount object
Actual per CWT object
DW_Batch object
DW_SourceCode object
DW_TimeStamp object
dtype: object
I believe the problem lies in the date range clause, but I can't find a solution, and I would really appreciate your input. Thanks!
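One likely explanation for the empty result, assuming the query string is exactly as posted: inside a single-quoted Python string, `''` does not produce a quote character. Python reads the line as several adjacent string literals and concatenates them, so the quotes vanish. SQL Server then receives a bare `2020-04-01`, which it would most likely evaluate as integer arithmetic rather than as a date. The snippet below needs no database; the names `posted` and `quoted` are only for illustration.

```python
# The query string exactly as posted: single-quoted, with '' around each date.
posted = 'SELECT * FROM [SALES Transactions_V] WHERE [Posting Date] BETWEEN ''2020-04-01'' AND ''2020-04-30'';'

# Same SQL, but with double quotes delimiting the Python string so the
# single quotes around the dates survive.
quoted = "SELECT * FROM [SALES Transactions_V] WHERE [Posting Date] BETWEEN '2020-04-01' AND '2020-04-30';"

print(posted)  # the dates appear without quotes
print(quoted)  # the dates appear as quoted string literals
```

Printing the SQL text before sending it is a cheap way to catch this kind of quoting problem.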
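When passing values into a SQL query, consider parameterization, an industry best practice that both pyodbc and pandas.read_sql_query support: the SQL text carries ? placeholders and the values are passed separately through the params argument. Doing so avoids escaping quotes, and avoids concatenating or interpolating literal values or variables into the query.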
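A minimal sketch of the parameterized query. So that it runs anywhere, an in-memory SQLite database from the standard library stands in for the SQL Server connection, and the table holds a few made-up rows. Both are assumptions for illustration only. SQLite happens to accept the same `?` placeholders and bracketed identifiers, so the SQL text and `params` list are what would be handed to pandas with the real pyodbc connection.

```python
import sqlite3

# Stand-in for the pyodbc connection (assumption: illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE [SALES Transactions_V] ([Entry No] INTEGER, [Posting Date] TEXT)"
)
# Made-up sample rows, not real data from the question's table.
conn.executemany(
    "INSERT INTO [SALES Transactions_V] VALUES (?, ?)",
    [(1, "2007-02-27"), (2, "2020-04-01"), (3, "2020-04-15"), (4, "2020-05-02")],
)

# The SQL text contains only placeholders; the values travel separately.
sql = """SELECT [Entry No], [Posting Date]
         FROM [SALES Transactions_V]
         WHERE [Posting Date] BETWEEN ? AND ?
         ORDER BY [Entry No]"""
params = ["2020-04-01", "2020-04-30"]

rows = conn.execute(sql, params).fetchall()
print(rows)

# With the question's pyodbc connection, the same pair would go to pandas:
#     df1 = pd.read_sql_query(sql, conn, params=params)
```

Because [Posting Date] is a datetime column in the real table, an upper bound of '2020-04-30' is read as midnight on that day. If the column carries times of day, a half-open range (>= '2020-04-01' and < '2020-05-01') may be the safer filter.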
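Alternatively, filter by date parts: compare YEAR([Posting Date]) and MONTH([Posting Date]) against parameters, again through pandas.read_sql_query.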
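The date-part variant might look like the sketch below. `month_filter` is a hypothetical helper, not part of pandas or pyodbc. It only builds the T-SQL text and the matching parameter list, and the commented-out call assumes the `conn` object from the question.

```python
def month_filter(year, month):
    """Hypothetical helper: build T-SQL text and params for one calendar month."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    # YEAR() and MONTH() are SQL Server functions; each ? receives one value.
    sql = (
        "SELECT * FROM [SALES Transactions_V] "
        "WHERE YEAR([Posting Date]) = ? AND MONTH([Posting Date]) = ?"
    )
    return sql, [year, month]


sql, params = month_filter(2020, 4)  # April 2020
print(sql)
print(params)

# With the pyodbc connection from the question:
#     df1 = pd.read_sql_query(sql, conn, params=params)
```

Wrapping the column in YEAR() and MONTH() is easy to read, but it may prevent SQL Server from using an index on [Posting Date]. On a table with millions of rows, a plain range comparison on the column could be faster.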