Just for reference, I'm a complete Python novice. I have a for loop that pulls some object information from an S3 bucket and writes it into a CSV file. For each object whose details are retrieved, I need to write that data into the CSV. My problem is that I'm getting duplicate entries in the CSV. What I expect in the CSV is:
account_id;arn
key1;body1
key2;body2
key3;body3
... (and so on until the loop has gone through every object in that folder).
But what I actually get is this:
account_id;arn
key1;body1
account_id;arn
key1;body1
account_id;arn
key2;body2
account_id;arn
key1;body1
account_id;arn
key2;body2
account_id;arn
key3;body3
On top of that, every time I rerun the script it keeps appending the old data again, which compounds the problem.
My current code is:
import csv

for objects in my_bucket.objects.filter(Prefix="folderpath"):
    key = objects.key
    body = objects.get()['Body'].read()
    field = ["account_id", "arn"]
    data = [
        [key, body]
    ]
    with open("my_file.csv", "a") as f:
        writer = csv.writer(f, delimiter=";", lineterminator="\n")
        writer.writerow(field)
        writer.writerows(data)
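To illustrate, here is a minimal reproduction of the repeating-header part without S3 (the keys and bodies are made up). Because the header row is written on every pass through the loop, it ends up interleaved with the data:

```python
import csv
import io

# Simulate the buggy pattern: a header row written on every iteration.
buf = io.StringIO()
for key, body in [("key1", "body1"), ("key2", "body2")]:
    writer = csv.writer(buf, delimiter=";", lineterminator="\n")
    writer.writerow(["account_id", "arn"])  # header repeated each pass
    writer.writerow([key, body])

lines = buf.getvalue().splitlines()
# The header appears once per object instead of once per file.
print(lines)
```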
import csv

# Assuming my_bucket and folderpath are defined earlier
# Open the CSV file in write mode, so reruns start from an empty file
with open("my_file.csv", "w") as f:
    writer = csv.writer(f, delimiter=";", lineterminator="\n")
    # Write the header row once, at the beginning of the file
    writer.writerow(["account_id", "arn"])
    # Collect the content for all rows
    data = []
    # Iterate over the objects in the S3 bucket
    for objects in my_bucket.objects.filter(Prefix="folderpath"):
        key = objects.key
        body = objects.get()["Body"].read()
        # Append the row
        data.append([key, body])
    # Write all the data at the end in a single I/O operation
    writer.writerows(data)
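One more thing to watch: objects.get()["Body"].read() returns bytes, so each row would contain a b'...' literal unless you decode it first. A minimal sketch of the fix (the ARN value here is made up, and UTF-8 encoding is assumed):

```python
import csv
import io

# A made-up example of what .read() might return from S3 (bytes, not str).
raw = b"arn:aws:iam::123456789012:role/example"

buf = io.StringIO()
writer = csv.writer(buf, delimiter=";", lineterminator="\n")
# Decode the bytes to str before writing, so the CSV holds plain text.
writer.writerow(["some/key", raw.decode("utf-8")])
print(buf.getvalue())
```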
It's much easier if you use the csv module's DictWriter in Python.
First define the header and prepare the CSV file, like this (note you need delimiter=';' to get semicolon-separated output):

import csv

with open('names.csv', 'w', newline='') as csvfile:
    fieldnames = ['account_id', 'arn']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames, delimiter=';')
    writer.writeheader()
    for objects in my_bucket.objects.filter(Prefix="folderpath"):
        key = objects.key
        body = objects.get()['Body'].read()
        writer.writerow({'account_id': key, 'arn': body})
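If you want to try the DictWriter pattern without S3 access, here is a self-contained sketch with made-up rows (passing delimiter=";" explicitly, since DictWriter defaults to commas):

```python
import csv
import io

fieldnames = ["account_id", "arn"]
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fieldnames, delimiter=";")
writer.writeheader()  # header is written exactly once
for key, body in [("key1", "body1"), ("key2", "body2")]:
    writer.writerow({"account_id": key, "arn": body})

rows = buf.getvalue().splitlines()
print(rows)
```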