I'm trying to pass values from a JSON file through a SQL MERGE statement using the pyodbc Python package. Below is a sample of the JSON file, which contains 2023 USD exchange-rate data for 162 countries/regions:
{"2023-01-01": {"USD": 1, "AED": 3.6725, "AFN": 88.74103815},
"2023-01-02": {"USD": 1, "AED": 3.6725, "AFN": 89.02144276}}
I extract it into the format I have always used for my Python-to-SQL merge jobs (a list of lists) with the following code:
import json
from dateutil import parser

with open('ExchangeRates.json', 'r') as f:
    data = json.load(f)

result = []
for date, conversion_rates in data.items():
    for currency, rate in conversion_rates.items():
        result.append(['USD', currency, rate, parser.parse(date)])
Sample of the result:
[['USD', 'XOF', 593.76331894, '2023-12-30'], ['USD', 'XPF', 108.01769686, '2023-12-30'],
['USD', 'YER', 247.14186482, '2023-12-30'], ['USD', 'ZAR', 18.35621464, '2023-12-30'],
['USD', 'ZMW', 25.69324612, '2023-12-30'], ['USD', 'ZWL', 6047.08996546, '2023-12-30']]
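For reference, running the same flattening over the two-day sample at the top produces six rows (2 dates x 3 currencies). This sketch swaps in datetime.strptime for dateutil's parser.parse so it has no third-party dependency:

```python
import json
from datetime import datetime

sample = '''{"2023-01-01": {"USD": 1, "AED": 3.6725, "AFN": 88.74103815},
             "2023-01-02": {"USD": 1, "AED": 3.6725, "AFN": 89.02144276}}'''

result = []
for date, conversion_rates in json.loads(sample).items():
    for currency, rate in conversion_rates.items():
        # strptime stands in for dateutil's parser.parse on ISO dates
        result.append(['USD', currency, rate, datetime.strptime(date, '%Y-%m-%d')])

print(len(result))   # 6
print(result[0])     # ['USD', 'USD', 1, datetime.datetime(2023, 1, 1, 0, 0)]
```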
The sql and params arguments I pass to pyodbc's cursor.execute():
sql = """
MERGE INTO database.dbo.table_name AS Target
USING (
    VALUES {}
) AS Source (currency_from, currency_to, factor, timestamp)
ON Target.currency_from = Source.currency_from
    AND Target.currency_to = Source.currency_to
    AND CAST(Target.timestamp AS date) = CAST(Source.timestamp AS date)
WHEN NOT MATCHED THEN
    INSERT (currency_from, currency_to, factor, timestamp)
    VALUES (Source.currency_from, Source.currency_to, Source.factor, Source.timestamp);
""".format(','.join(['(?,?,?,?)'] * len(result)))
params = [item for sublist in result for item in sublist]
cnxn = pyodbc.connect(conn_string)
crsr = cnxn.cursor()
try:
    crsr.execute(sql, params)
except Exception as e:
    crsr.rollback()
    print(e)
    print('Transaction rollback')
else:
    cnxn.commit()
crsr.close()
cnxn.close()
This code worked in the past, and everything appears to be correct. I checked each part separately by printing its output: the format/join operation on the SQL string inserts (?,?,?,?) 58,968 times (162 countries * 364 days), and the params variable flattens the column values into one long list of 235,872 elements (58,968 rows * 4 columns). All of that looks right. I compared against my working queries, which pass the same things into cursor.execute(sql, params), just with far less data.
The daily Python job, which connects to a website to fetch that day's USD exchange rates for the other 162 countries, works perfectly. The only thing I can think of is that I am passing too much data to the pyodbc cursor's execute function. Please help me figure out why pyodbc / SQL thinks I supplied "-26272 parameter markers" and how to correct my coding mistake.
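One plausible reading of the strange negative count, assuming the driver reports the number of parameter markers in a signed 16-bit integer: 235,872 markers wrap around to exactly -26272.

```python
total_markers = 58968 * 4            # 235,872 '?' markers sent in one statement
wrapped = total_markers % 65536      # keep only the low 16 bits
if wrapped >= 32768:                 # reinterpret as a signed 16-bit value
    wrapped -= 65536
print(wrapped)   # -26272
```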
As AlwaysLearning pointed out, VALUES only accepts 1,000 rows at a time. That means I got the error because I fed VALUES roughly 59 times the maximum row count.
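A quick sanity check on the scale of the overshoot (the 58,968-row figure is from the question; the 1,000-row cap is the documented limit for a T-SQL VALUES table value constructor):

```python
import math

rows = 162 * 364                 # 58,968 rows in the full-year load
values_row_limit = 1000          # max rows in one VALUES table value constructor
print(rows / values_row_limit)              # 58.968 -> ~59x over the limit
print(math.ceil(rows / values_row_limit))   # 59 -> at least 59 separate statements
```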
The solution I used was to merge in batches of 250 rows (having misread the comment as a 1,000-element limit rather than a 1,000-row limit). This code replaces the old SQL MERGE section:
batch_size = 250
batches = [result[i:i + batch_size] for i in range(0, len(result), batch_size)]
# Prepare SQL template
sql_template = """
MERGE INTO database.dbo.table_name AS Target
USING (
    VALUES {}
) AS Source (currency_from, currency_to, factor, timestamp)
ON Target.currency_from = Source.currency_from
    AND Target.currency_to = Source.currency_to
    AND CAST(Target.timestamp AS date) = CAST(Source.timestamp AS date)
WHEN NOT MATCHED THEN
    INSERT (currency_from, currency_to, factor, timestamp)
    VALUES (Source.currency_from, Source.currency_to, Source.factor, Source.timestamp);
"""
# Execute batches
cnxn = pyodbc.connect(conn_string)
crsr = cnxn.cursor()
try:
    for batch in batches:
        # Create a comma-separated list of placeholders for this batch
        placeholders = ','.join(['(?,?,?,?)'] * len(batch))
        sql = sql_template.format(placeholders)
        params = [item for sublist in batch for item in sublist]
        crsr.execute(sql, params)
    cnxn.commit()
except Exception as e:
    crsr.rollback()
    print(e)
    print('Transaction rollback')
finally:
    crsr.close()
    cnxn.close()
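For anyone tuning the batch size: besides the 1,000-row VALUES cap, SQL Server also limits a single request to about 2,100 parameters, and each row here consumes four markers. A back-of-the-envelope sketch of the largest safe batch under those two assumptions:

```python
VALUES_ROW_LIMIT = 1000   # max rows per VALUES table value constructor
MAX_PARAMS = 2100         # approximate SQL Server per-request parameter limit
COLS = 4                  # (?,?,?,?) markers per row

max_batch = min(VALUES_ROW_LIMIT, MAX_PARAMS // COLS)
print(max_batch)   # 525
```

So the accidental choice of 250-row batches sits comfortably under both limits.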