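I can't figure out where to dive in on this personal project, and I'm hoping this community can help me create a Python script to process this data.
I have a CSV file containing a list of meals fed to dogs at an animal rescue, each associated with a kennel number:
Source CSV - mealandtreats.csv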
blank_column,Kennel_Number,Species,Food,Meal_ID
,1,Dog,Meal,11.2
,5,Dog,Meal,45.2
,3,Dog,Meal,21.4
,4,Dog,Meal,17
,2,Dog,Meal,11.2
,4,Dog,Meal,21.4
,6,Dog,Meal,17
,2,Dog,Meal,45.2
I have a second CSV file that provides a key mapping each meal to the treat served with that meal:
Meals to Treats Key - mealsToTreatsKey.csv
Meals_fed,Treats_fed
10.1,2.4
11.2,2.4
13.5,3
15.6,3.2
17,3.2
20.1,5.1
21.4,5.2
35.7,7.7
45.2,7.9
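I need to take each meal type served in sheet 1 (i.e., with duplicate entries removed), find the associated treat type, and then create a separate entry for each time that treat was served to a specific kennel. The end result should look like this:
Result CSV - mealandtreats.csv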
blank_column,Kennel_Number,Species,Food,Meal_ID
,1,Dog,Meal,11.2
,5,Dog,Meal,45.2
,3,Dog,Meal,21.4
,4,Dog,Meal,17
,2,Dog,Meal,11.2
,4,Dog,Meal,21.4
,6,Dog,Meal,17
,2,Dog,Meal,45.2
,1,Dog,Treat,2.4
,5,Dog,Treat,7.9
,3,Dog,Treat,5.2
,4,Dog,Treat,3.2
,1,Dog,Treat,2.4
,4,Dog,Treat,5.2
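I would prefer to use the csv module rather than Pandas, but I'm open to using Pandas if necessary.
So far I have some code that just opens the CSVs, but I'm really stuck on the next step: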
import csv

# Read both files into lists of raw lines.
with open('./meals/results/foodToTreats.csv', 'r') as t1, \
        open('./results/food.csv', 'r') as t2:
    key = t1.readlines()
    map = t2.readlines()

# Write back every meal line that is not also a line of the key file.
with open('./results/food.csv', 'w') as outFileF:
    for line in map:
        if line not in key:
            outFileF.write(line)

with open('./results/foodandtreats.csv', 'w') as outFileFT:
    for line in map:
        if line not in key:
            outFileFT.write(line)
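So basically I just need to take each treat entry on the second sheet, search the first sheet for the matching meal entry, look up the kennel number associated with that entry, and then write it to the first sheet.
Giving it my best shot in pseudocode, something like: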
for x in column 0, y:
    y, 1 = z
    food = x
    treat = y
    kennel_number = z
    when x, z:
        writerows('', '{kennel_number}', 'species', '{food/treat}', '{meal_id}')
Update: here is the exact code I'm using, thanks to @wwii. I'm seeing one small bug:
import csv
import collections

treats = {}
with open('mealsToTreatsKey.csv') as f2:
    for line in f2:
        meal,treat = line.strip().split(',')
        treats[meal] = treat

new_items = set()
Treat = collections.namedtuple('Treat', ['blank_column','Kennel_Number','Species','Food','Meal_ID'])
with open('foodandtreats.csv') as f1:
    reader = csv.DictReader(f1)
    for row in reader:
        row['Food'] = 'Treat'
        row['Meal_ID'] = treats[row['Meal_ID']]
        new_items.add(Treat(**row))
    fieldnames = reader.fieldnames

with open('foodandtreats.csv', 'a') as f1:
    writer = csv.DictWriter(f1, fieldnames)
    for row in new_items:
        writer.writerow(row._asdict())
It works great apart from one small bug: the first new row written doesn't start on its own line.
Make a dictionary mapping meals to treats
treats = {}
with open(treatfile) as f2:
    for line in f2:
        meal,treat = line.strip().split(',')
        treats[meal] = treat
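The manual split above also stores the header row as an entry ('Meals_fed' mapped to 'Treats_fed'), which is harmless here but easy to overlook. As a possible variation, and only a sketch: assuming the key file keeps the `Meals_fed` and `Treats_fed` header names shown in the question, `csv.DictReader` can consume the header for you. The helper name `load_treat_key` and the in-memory sample are made up for illustration.

```python
import csv
import io


def load_treat_key(lines):
    """Return a dict mapping each Meals_fed value to its Treats_fed value.

    `lines` is any iterable of CSV lines, e.g. an open file object.
    (Hypothetical helper, not part of the csv module.)
    """
    reader = csv.DictReader(lines)  # the first line is consumed as the header
    return {row["Meals_fed"]: row["Treats_fed"] for row in reader}


# In-memory stand-in for the key file; a real script would pass
# open(treatfile, newline="") instead.
sample_key = io.StringIO("Meals_fed,Treats_fed\n11.2,2.4\n17,3.2\n45.2,7.9\n")
treats = load_treat_key(sample_key)
print(treats)
```

The values stay as strings, which matches how `csv.DictReader` will hand back `Meal_ID` from the meal file, so the lookup keys line up without any float conversion.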
Iterate over the meal file and create a set of new entries. Use namedtuples for the new items.
import collections
import csv

new_items = set()
Treat = collections.namedtuple('Treat', ['blank_column','Kennel_Number','Species','Food','Meal_ID'])
with open(mealfile) as f1:
    reader = csv.DictReader(f1)
    for row in reader:
        row['Food'] = 'Treat'
        row['Meal_ID'] = treats[row['Meal_ID']]
        new_items.add(Treat(**row))
    fieldnames = reader.fieldnames
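The reason for the namedtuple is that the rows from `csv.DictReader` are dicts, which are not hashable and so cannot go into a set, while a namedtuple can. A small illustration of the de-duplication, using made-up in-memory data rather than the real files (the sample rows and the one-entry `treats` dict are assumptions for the demo):

```python
import collections
import csv
import io

Treat = collections.namedtuple(
    "Treat", ["blank_column", "Kennel_Number", "Species", "Food", "Meal_ID"]
)

# Stand-in for a meal file in which kennel 4 got the same meal twice.
sample_meals = io.StringIO(
    "blank_column,Kennel_Number,Species,Food,Meal_ID\n"
    ",4,Dog,Meal,17\n"
    ",4,Dog,Meal,17\n"
    ",6,Dog,Meal,17\n"
)
treats = {"17": "3.2"}  # meal -> treat, as built in the previous step

new_items = set()
for row in csv.DictReader(sample_meals):
    row["Food"] = "Treat"
    row["Meal_ID"] = treats[row["Meal_ID"]]
    new_items.add(Treat(**row))  # identical tuples collapse into one entry

print(len(new_items))  # the repeated kennel 4 row is stored only once
```

Note that a set does not remember insertion order, so the treat rows may be appended in a different order from the meal rows they came from. If the order matters, sorting `new_items` before writing is one option.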
Open the meal file (again) for appending and write the new entries
with open(mealfile, 'a') as f1:
    writer = csv.DictWriter(f1, fieldnames)
    for row in new_items:
        writer.writerow(row._asdict())
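If the meal file does not end with a newline, you will need to add one before writing the new treat rows. Since you control the file, you should make sure it always ends with a newline.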
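If you would rather have the script guard against this itself, one way is to check the last byte of the file before appending. This is only a sketch: `ensure_trailing_newline` is a hypothetical helper name, and the demo writes to a temporary file instead of the real meal file.

```python
import os
import tempfile


def ensure_trailing_newline(path):
    """Append a newline to the file at `path` if its last byte is not one."""
    with open(path, "rb+") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:
            return  # empty file, nothing to fix
        f.seek(-1, os.SEEK_END)
        if f.read(1) != b"\n":
            f.seek(0, os.SEEK_END)
            f.write(b"\n")


# Demo on a throwaway file whose last row lacks a newline.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "meals.csv")
    with open(path, "w", newline="") as f:
        f.write(",2,Dog,Meal,45.2")
    ensure_trailing_newline(path)
    ensure_trailing_newline(path)  # calling it again changes nothing
    with open(path, newline="") as f:
        content = f.read()

print(repr(content))
```

Calling the helper on the meal file just before the `open(mealfile, 'a')` step should make the first appended treat row start on its own line.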