I have a CSV file that looks like this:
I have the following code, which reads the CSV file so that I can print or access the information in it.
import csv

class CsvReader:
    with open("Items.csv") as fp:
        reader = csv.reader(fp, delimiter=",", quotechar='"')
        next(reader, None)  # skip the headers
        data_read = [row for row in reader]

    print(data_read[0])
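This is the printed output I get:
['1', '5.99$', '1', 'Blueberry Muffin']
How can I format this as a dictionary, with the headers as keys and the information as values?
For example, the code would output: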
{Item #: 1, Price: 5.99, Quantity: 1, Name: Blueberry Muffin}
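I referred to this post and saw a lot of similarities: How do I read and write CSV files with Python?
But I could not find more detail on how to format the output in this particular way without using something like pandas, which I don't plan to use.
If you want the dictionary keys to be the fields (i.e. the columns), then why are you skipping them? Here is a simple solution.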
import csv

class CsvReader:
    with open("Items.csv") as fp:
        reader = csv.reader(fp, delimiter=",", quotechar='"')
        fields = next(reader)  # keep the header row instead of skipping it
        data_read = []
        for row in reader:
            data_read.append(dict(zip(fields, row)))

    print(data_read[0])
Store the column names first, then pair them with the elements of each row.
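One property of the zip approach is worth knowing: zip stops at the shorter of its inputs, so a row with fewer fields than the header quietly produces a dictionary with missing keys. The following is a minimal sketch of that behavior. It uses made-up in-memory rows instead of Items.csv, so the sample data here is an assumption for illustration only.

```python
import csv
import io

# Hypothetical sample data; the second row is missing its Name field.
sample = io.StringIO(
    "Item #,Price,Quantity,Name\n"
    "1,5.99,1,Blueberry Muffin\n"
    "2,1.25,6\n"
)

reader = csv.reader(sample)
fields = next(reader)  # header row becomes the dictionary keys
rows = [dict(zip(fields, row)) for row in reader]

print(rows[0])  # all four keys are present
print(rows[1])  # no 'Name' key, because zip stopped at the shorter row
```

If short rows should be treated as errors, comparing len(row) with len(fields) before building the dictionary is one simple way to catch them.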
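There are several ways to do this... I agree that using pandas to read a simple file is probably overkill. You could argue that even using csv.reader is overkill. :)
Anyway, here are 3 variations. All you need to do is capture the labels and use them as the keys in a dictionary. Note that the approaches below give you a "list of dictionaries" (or a "records"-style format, in pandas terms). Another option would be a "dictionary of dictionaries" with the item number as the first key, but that is essentially the same as the list index... so it is roughly equivalent. You could also skip capturing the item number, since it just mirrors the position in the resulting list, but that is a nuance.
You might also be interested in capturing the rows in a named tuple, as shown in the last variation. Very easy to work with...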
# Grocery Reader

import csv
from collections import namedtuple

with open("data.csv") as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    labels = next(reader, None)  # capture the headers
    result = []
    for row in reader:  # iterate the remaining rows
        pairs = zip(labels, row)
        result.append(dict(pairs))

print(result)

# the above isn't real satisfying as the numeric objects are captured as strings.
# so...

with open("data.csv") as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    labels = next(reader, None)  # capture the headers
    result = []
    for row in reader:  # iterate the remaining rows
        row[0] = int(row[0])
        row[1] = float(row[1])
        row[2] = int(row[2])
        pairs = zip(labels, row)
        result.append(dict(pairs))

print(result)

with open("data.csv") as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    labels = next(reader, None)  # capture the headers
    # make lowercase...just for standardization
    labels = [t.lower() for t in labels]
    Grocery = namedtuple('Grocery', labels)
    result = []
    for row in reader:  # iterate the remaining rows
        row[0] = int(row[0])
        row[1] = float(row[1])
        row[2] = int(row[2])
        grocery = Grocery(*row)
        result.append(grocery)

for grocery in result:
    # the below presumes you know the names inside the named tuple...
    print(f'a {grocery.name} costs {grocery.price}')
[{'Item': '1', 'Price': '4.99', 'Qty': '2', 'Name': 'Muffin'}, {'Item': '2', 'Price': '1.25', 'Qty': '6', 'Name': 'Gum'}, {'Item': '3', 'Price': '2.50', 'Qty': '8', 'Name': 'Cookie'}]
[{'Item': 1, 'Price': 4.99, 'Qty': 2, 'Name': 'Muffin'}, {'Item': 2, 'Price': 1.25, 'Qty': 6, 'Name': 'Gum'}, {'Item': 3, 'Price': 2.5, 'Qty': 8, 'Name': 'Cookie'}]
a Muffin costs 4.99
a Gum costs 1.25
a Cookie costs 2.5
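The "dictionary of dictionaries" alternative mentioned in this answer could look roughly like the sketch below. It is only an illustration: the in-memory sample stands in for data.csv and is reconstructed from the output shown above, and the name by_item is made up here.

```python
import csv
import io

# Hypothetical in-memory stand-in for data.csv, matching the output shown above.
sample = io.StringIO(
    "Item,Price,Qty,Name\n"
    "1,4.99,2,Muffin\n"
    "2,1.25,6,Gum\n"
    "3,2.50,8,Cookie\n"
)

reader = csv.reader(sample)
labels = next(reader)  # capture the headers

by_item = {}
for row in reader:
    item_number = int(row[0])  # the outer key
    # The remaining columns become the inner dictionary.
    by_item[item_number] = {
        labels[1]: float(row[1]),
        labels[2]: int(row[2]),
        labels[3]: row[3],
    }

print(by_item[2])
```

Lookup is then by item number (by_item[2]) rather than by list position, which matters mainly if item numbers are not contiguous or do not start at zero.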
Use DictReader from the csv module here.
cat food.csv
Item #,Price,Quantity,Name
1, 5.99$,1,Blueberry Muffin
import csv

with open('food.csv') as csv_file:
    reader = csv.DictReader(csv_file, delimiter=",", quotechar='"')
    for row in reader:
        print(dict(row))
{'Item #': '1', 'Price': ' 5.99$', 'Quantity': '1', 'Name': 'Blueberry Muffin'}
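The DictReader output above still contains strings, including the stray space and the trailing "$" in the price. To get closer to the dictionary the question asks for, one option is to clean and convert each field by hand. This is a sketch under assumptions: it reads an in-memory copy of food.csv, and it assumes every price looks like "5.99$".

```python
import csv
import io

# In-memory copy of food.csv from above, so the sketch runs without a file.
sample = io.StringIO(
    "Item #,Price,Quantity,Name\n"
    "1, 5.99$,1,Blueberry Muffin\n"
)

# skipinitialspace=True drops the blank that directly follows a delimiter.
reader = csv.DictReader(sample, skipinitialspace=True)

items = []
for row in reader:
    items.append({
        "Item #": int(row["Item #"]),
        # Assumption: prices always carry a trailing "$" as in the sample.
        "Price": float(row["Price"].rstrip("$")),
        "Quantity": int(row["Quantity"]),
        "Name": row["Name"],
    })

print(items[0])
```

Naming the columns explicitly keeps the conversion rules visible, at the cost of having to update the code when the file layout changes.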
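Using csv.reader, with next to read the first line as the header row, gives a result list in which every row is a tidy key/value dictionary keyed by the corresponding header name. This is what I use: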
import csv

# args.filename is assumed to hold the CSV path (for example, parsed with argparse)
with open(args.filename) as csvfile:
    headers = []
    result = []
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    # Read headers from first row
    headers = next(reader)
    for row in reader:  # iterate the remaining rows
        parsed_row = {}
        for header, value in zip(headers, row):
            if value.isdigit():  # All digits: a non-negative integer
                parsed_row[header] = int(value)
            elif value.replace('.', '', 1).isdigit():  # Digits with one '.': a float
                parsed_row[header] = float(value)
            else:
                parsed_row[header] = value  # Otherwise, keep it as a string
        result.append(parsed_row)
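One limitation of the isdigit checks above is that signed values such as "-3" or "-1.5" do not match either branch and stay strings. A possible alternative, offered as a sketch and not as a drop-in replacement, is to let int and float attempt the parsing themselves. The helper name convert is made up for this example. Be aware that float also accepts strings such as "1e3", "nan" and "inf", which may or may not be desirable.

```python
def convert(value):
    """Return value as an int or float when it parses as one, else unchanged."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue  # try the next type, or fall through to the raw string
    return value


# Hypothetical header and row, just to show the helper in use.
headers = ["Item #", "Price", "Quantity", "Name"]
row = ["1", "5.99", "-2", "Blueberry Muffin"]
parsed_row = {header: convert(value) for header, value in zip(headers, row)}
print(parsed_row)
```

Trying int before float keeps whole numbers as integers; reversing the order would turn "1" into 1.0.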