Writing CSV files to a SQL database

Question · Votes: 1 · Answers: 1

I have a number of CSV files that I have to upload to a remote database, and I have been using pyodbc together with Python's csv library. I don't know why, but it is very slow (roughly 30 seconds per 100 rows), and some of the CSV files I have to upload are over 30,000 rows. I also tried pandas, but the speed did not change. This is more or less my code; the irrelevant parts have been omitted.

import csv
import sys
import time

import pyodbc

if len(sys.argv) == 1:
    print("This program needs an input state")
    sys.exit()

state_code = str(sys.argv[1])

f = open(state_code + ".csv", "r")
reader = csv.reader(f, delimiter=',')

cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER=' + server
                      + ';DATABASE=' + database + ';UID=' + username
                      + ';PWD=' + password)
cursor = cnxn.cursor()

insert_query = '''INSERT INTO table (Zipcode, Pers_Property_Coverage, Deductible,
                  Liability, Average_Rate, Highest_Rate, Lowest_Rate,
                  CREATE_DATE, Active_Flag)
                  VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)'''

ctr = 0
start_time = time.time()

for row in reader:
    # Restore the leading zero that Excel/CSV tends to strip from zipcodes
    zipcode = row[0]
    if len(zipcode) == 4:
        zipcode = "0" + zipcode

    # Drop the leading "$" and the thousands separators from the money columns
    ppc = row[1][1:].replace(',', '')
    deductible = row[2][1:].replace(',', '')
    liability = row[3][1:].replace(',', '')
    average_rate = row[4][1:].replace(',', '')
    highest_rate = row[5][1:].replace(',', '')
    lowest_rate = row[6][1:].replace(',', '')

    ctr = ctr + 1
    if ctr % 100 == 0:
        print("Time Elapsed = ", round(time.time() - start_time), " seconds")

    values = (zipcode, ppc, deductible, liability, average_rate,
              highest_rate, lowest_rate, date, "Y")

    print("Inserting", zipcode, ppc, deductible, liability, average_rate,
          highest_rate, lowest_rate, date, "Y")

    cursor.execute(insert_query, values)

cnxn.commit()
Tags: database · performance · csv · pyodbc
1 Answer

0 votes

Updating your code to use pyodbc's executemany with the fast_executemany=True option is the easy way to save time:
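A minimal sketch of that approach, assuming the same table and column layout as the code in the question; the clean_row helper and load_csv function names are my own, and the connection/table details are placeholders. Instead of one round trip per row, all rows are cleaned first and then sent in a single executemany call with fast_executemany enabled on the cursor:

```python
import csv

def clean_row(row, create_date):
    """Pad 4-digit zipcodes and strip the '$' prefix and ',' separators."""
    zipcode = row[0]
    if len(zipcode) == 4:
        zipcode = "0" + zipcode
    money = [col[1:].replace(',', '') for col in row[1:7]]
    return (zipcode, *money, create_date, "Y")

def load_csv(cnxn, path, create_date):
    """Read the whole CSV, then insert every row in one batched call."""
    cursor = cnxn.cursor()
    cursor.fast_executemany = True   # ship parameters to the server in bulk
    with open(path, newline='') as f:
        rows = [clean_row(r, create_date) for r in csv.reader(f)]
    cursor.executemany(
        '''INSERT INTO table (Zipcode, Pers_Property_Coverage, Deductible,
           Liability, Average_Rate, Highest_Rate, Lowest_Rate,
           CREATE_DATE, Active_Flag)
           VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)''',
        rows)
    cnxn.commit()
```

With fast_executemany the parameters are bound once and transmitted in large batches, so the per-row network round trip that dominates your 30 s / 100 rows disappears.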

Exploring a bulk insert from the file may be another option, although it most likely will not use pyodbc or Python:
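As a hypothetical sketch of that route: SQL Server's BULK INSERT statement loads a delimited file directly, provided the file path is visible to the SQL Server machine itself (not just the client). The helper below only builds the T-SQL text; the table name and path are placeholders, not values from the question:

```python
def bulk_insert_sql(table, server_side_path):
    """Build a T-SQL BULK INSERT statement for a comma-delimited file."""
    return (
        "BULK INSERT {} FROM '{}' "
        "WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n', FIRSTROW = 1)"
    ).format(table, server_side_path)

# The statement would then be run once through any SQL client, e.g.:
#     cursor.execute(bulk_insert_sql("table", r"C:\staging\MA.csv"))
#     cnxn.commit()
```

Note that the money columns would need the "$" and "," stripped before staging the file, since BULK INSERT performs no per-column transformation.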

© www.soinside.com 2019 - 2024. All rights reserved.