Bulk insert into a postgres table from Python


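I want to insert/update values from a pandas DataFrame into a postgres table. The postgres table has a unique tuple (a, b). If the tuple already exists, I only want to update the third value c; if it does not exist, I want to create the triple (a, b, c).

What is the most efficient way to do this? I assume some form of bulk insert, but I'm not sure exactly how to go about it.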

python postgresql bulkinsert
1 Answer

You can convert the DataFrame into a CTE (https://www.postgresql.org/docs/current/queries-with.html) and then insert the data from the CTE into the table. Something like this:

import pandas as pd


def convert_df_to_cte(df):
    # Render each value as a dollar-quoted text literal; NaN/None become NULL.
    # Assumes no value contains the tag "$str$" itself.
    def literal(e):
        return 'NULL' if pd.isna(e) else f'$str${e}$str$'

    vals = ', \n'.join(
        '(' + ', '.join(literal(e) for e in row) + ')' for row in df.values
    )

    columns = ', \n'.join(df.columns)

    sql = f"""
    WITH vals AS (
        SELECT 
            {columns}
        FROM 
            (VALUES {vals}) AS t ({columns})
    )
    """
    return sql


df = pd.DataFrame([[1, 2, 3]], columns=['col_1', 'col_2', 'col_3'])

cte_sql = convert_df_to_cte(df)
sql_to_insert = f"""
{cte_sql}

INSERT INTO schema.table (col_1, col_2, col_3)
SELECT 
    col_1::integer, -- don't forget to cast to right type to avoid errors
    col_2::integer, -- don't forget to cast to right type to avoid errors
    col_3::character varying
FROM 
    vals
ON CONFLICT (col_1, col_2) DO UPDATE SET
    col_3 = excluded.col_3;
"""

# run_sql is your own helper that executes a statement on an open connection
run_sql(sql_to_insert)
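
If you would rather not build the VALUES list as text, a parameterized variant may be worth considering. The following is only a sketch, assuming the psycopg2 driver and its `execute_values` helper; `build_upsert_sql` and `upsert_rows` are hypothetical names introduced here, and the table and column names are assumed to come from trusted code because they are interpolated directly into the statement.

```python
from typing import Sequence


def build_upsert_sql(table: str, columns: Sequence[str], key_columns: Sequence[str]) -> str:
    """Build an INSERT ... ON CONFLICT statement with a single %s placeholder
    for the VALUES list, which is the form execute_values expects."""
    # Every non-key column is overwritten with the incoming (EXCLUDED) value.
    update_columns = [c for c in columns if c not in key_columns]
    if not update_columns:
        raise ValueError("need at least one non-key column to update")
    assignments = ", ".join(f"{c} = EXCLUDED.{c}" for c in update_columns)
    return (
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES %s "
        f"ON CONFLICT ({', '.join(key_columns)}) DO UPDATE SET {assignments}"
    )


def upsert_rows(conn, table, columns, key_columns, rows, page_size=1000):
    """Send rows in batches; values travel as query parameters, not as SQL text."""
    # Imported lazily so build_upsert_sql can be used without the driver.
    from psycopg2.extras import execute_values  # assumes psycopg2 is installed

    sql = build_upsert_sql(table, columns, key_columns)
    with conn.cursor() as cur:
        execute_values(cur, sql, list(rows), page_size=page_size)
    conn.commit()
```

With an open psycopg2 connection `conn`, a call could look like `upsert_rows(conn, "schema.table", list(df.columns), ["col_1", "col_2"], df.itertuples(index=False, name=None))`. Because the driver passes typed parameters, the `::integer` style casts should not be needed here; NaN values would have to be converted to None first if they are meant to be stored as NULL.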