Python CSV: appending data from S3, duplicate entries

Question  Votes: 0  Answers: 2

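FYI, I'm a complete Python novice. I have a for loop that pulls some object information from an S3 bucket and writes it to a CSV file. For every object whose details are retrieved, I need to add that data to the CSV. My problem is that I'm getting duplicate entries in the CSV. What I expect in the CSV is: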

account_id;arn
key1;body1
key2;body2
key3;body3
... (and so on, until the loop has gone through all objects in that folder).

But what I actually get is this:

account_id;arn
key1;body1
account_id;arn
key1;body1
account_id;arn
key2;body2
account_id;arn
key1;body1
account_id;arn
key2;body2
account_id;arn
key3;body3

And every time I run the script, the old data stays in the file and new rows are added after it, which makes the problem worse.

My current code is:

for objects in my_bucket.objects.filter(Prefix="folderpath"):
    key = objects.key
    body = objects.get()['Body'].read()
    field = ["account_id","arn"]
    data = [
        [key, body]
    ]
    with open("my_file.csv", "a") as f:
        writer = csv.writer(f, delimiter=";", lineterminator="\n")
        writer.writerow(field)
        writer.writerows(data)
python amazon-web-services csv boto3
2 answers
0
votes
import csv

# Assumes my_bucket (a boto3 Bucket resource) is defined earlier.

# Open the CSV file once, in write mode, so each run starts from an empty file
with open("my_file.csv", "w", newline="") as f:
    writer = csv.writer(f, delimiter=";", lineterminator="\n")

    # Write the header row once, at the beginning of the file
    writer.writerow(["account_id", "arn"])

    # Collect the content of all rows
    data = []

    # Iterate over the objects in the S3 bucket
    for objects in my_bucket.objects.filter(Prefix="folderpath"):
        key = objects.key
        # read() returns bytes; decode so the CSV holds text (assumes UTF-8 content)
        body = objects.get()["Body"].read().decode("utf-8")

        # Append the row
        data.append([key, body])

    # Write all the rows at the end with a single call
    writer.writerows(data)
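
To see why the loop in the question produces duplicates, here is a small sketch that needs no S3 access. The `ROWS` list is a hypothetical stand-in for the key/body pairs, and the function names are made up for illustration. It contrasts opening the file in append mode inside the loop (header written on every iteration, old content kept between runs) with opening it once in write mode.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for the (key, body) pairs read from S3.
ROWS = [("key1", "body1"), ("key2", "body2"), ("key3", "body3")]


def write_like_question(path, rows):
    """Pattern from the question: append mode, header written inside the loop."""
    for key, body in rows:
        with open(path, "a", newline="") as f:
            writer = csv.writer(f, delimiter=";", lineterminator="\n")
            writer.writerow(["account_id", "arn"])  # repeated on every iteration
            writer.writerow([key, body])


def write_once(path, rows):
    """Open the file once in write mode, so each run starts from an empty file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter=";", lineterminator="\n")
        writer.writerow(["account_id", "arn"])  # header written a single time
        writer.writerows(rows)


def count_lines(path):
    """Return the number of lines currently in the file."""
    with open(path, newline="") as f:
        return len(f.read().splitlines())


with tempfile.TemporaryDirectory() as tmp:
    bad = os.path.join(tmp, "bad.csv")
    good = os.path.join(tmp, "good.csv")
    for _ in range(2):  # simulate running the script twice
        write_like_question(bad, ROWS)
        write_once(good, ROWS)
    bad_lines = count_lines(bad)
    good_lines = count_lines(good)

print(bad_lines, good_lines)
```

After two simulated runs, the append-mode file holds 12 lines (a header plus a row for each object, twice over), while the write-mode file holds only the header and three rows.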

0
votes

This is much easier if you use the csv module in Python.

First define the header and prepare the CSV file, then write one row per object, like this:

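import csv

# Assumes my_bucket (a boto3 Bucket resource) is defined earlier.
with open('names.csv', 'w', newline='') as csvfile:
    fieldnames = ['account_id', 'arn']
    # delimiter=';' matches the separator used in the question
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames, delimiter=';')

    # Write the header once, before the loop
    writer.writeheader()

    for objects in my_bucket.objects.filter(Prefix="folderpath"):
        key = objects.key
        # read() returns bytes; decode to text (assumes UTF-8 content)
        body = objects.get()['Body'].read().decode('utf-8')

        writer.writerow({'account_id': key, 'arn': body})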
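
One detail worth checking: `read()` on the S3 body returns bytes, and the csv module writes non-string values using `str()`, so an undecoded body ends up in the file as `b'...'`. Below is a minimal sketch of decoding before writing, assuming the objects contain UTF-8 text. The `objects` list is a hypothetical stand-in for the S3 results, and `io.StringIO` is used only so the example runs without a bucket or a file on disk.

```python
import csv
import io

# Hypothetical stand-in for (objects.key, objects.get()["Body"].read()) pairs;
# the body is bytes, as returned by read().
objects = [("key1", b"body1\n"), ("key2", b"body2\n")]

buffer = io.StringIO()  # in-memory file, only so the sketch runs anywhere
writer = csv.DictWriter(
    buffer,
    fieldnames=["account_id", "arn"],
    delimiter=";",        # match the separator from the question
    lineterminator="\n",
)
writer.writeheader()
for key, body in objects:
    # Assumption: the object content is UTF-8 text.
    # strip() drops a trailing newline that would otherwise force quoting.
    writer.writerow({"account_id": key, "arn": body.decode("utf-8").strip()})

result = buffer.getvalue()
print(result)
```

If the objects are not UTF-8 text, the `decode` call would need a different encoding or a different way of representing the content.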