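I'm trying to accomplish a seemingly simple task in Python. This is my third SO post in as many days, and I'm embarrassed.
I want to open a CSV file, loop through each row, and for every column in each row, if the value is not 0 or 1 (that is, it is a "?"), overwrite that value with the value from the last column. The very last column will always be 0 or 1; it will never be a "?". I don't care so much about the print statements as I do about replacing the "?" (or non-0/1) values with the value in the last column.
I've uploaded a sample of the CSV file I'm working with here: http://www.sharecsv.com/s/7bef636c33054cae624928297146bae1/house.csv
If you can't view the link above, I've pasted a sample of the dataset below: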
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,?,1.00,1.00,1.00,0.00,1.00,1.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,?,1.00
?,1.00,1.00,?,1.00,1.00,0.00,0.00,0.00,0.00,1.00,0.00,1.00,1.00,0.00,0.00,0.00
0.00,1.00,1.00,0.00,?,1.00,0.00,0.00,0.00,0.00,1.00,0.00,1.00,0.00,0.00,1.00,0.00
1.00,1.00,1.00,0.00,1.00,1.00,0.00,0.00,0.00,0.00,1.00,?,1.00,1.00,1.00,1.00,0.00
0.00,1.00,1.00,0.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,1.00,1.00,0.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,?,1.00,1.00,1.00,0.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,?,1.00,1.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,1.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,?,?,0.00
0.00,1.00,0.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,?,?,1.00,1.00,0.00,0.00,1.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,1.00,?,1.00,1.00,?,?,1.00
0.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,0.00,?,?,0.00
1.00,1.00,1.00,0.00,0.00,1.00,1.00,1.00,?,1.00,1.00,?,0.00,0.00,1.00,?,0.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,?,?,0.00,?,1.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,0.00,1.00,1.00,?,0.00,?,1.00
1.00,0.00,1.00,0.00,0.00,1.00,0.00,1.00,?,1.00,1.00,1.00,?,0.00,0.00,1.00,0.00
1.00,?,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,0.00,1.00,1.00,0.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,?,1.00,1.00,0.00,0.00,1.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,0.00
1.00,1.00,1.00,0.00,0.00,?,1.00,1.00,0.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,?,?,1.00,1.00,0.00
1.00,?,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,?,0.00,0.00,1.00,1.00,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,0.00
1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,?,0.00
1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,1.00,1.00,0.00
1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,0.00
1.00,0.00,0.00,1.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,1.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,0.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,1.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,0.00,0.00,0.00,1.00,?,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,1.00,0.00,0.00,1.00,0.00,1.00,1.00,0.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,1.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,0.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,1.00
1.00,?,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,0.00,1.00,?,1.00,0.00,1.00,1.00
1.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,0.00,1.00,1.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,1.00
1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,1.00,1.00,0.00,1.00,0.00,1.00,1.00,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,?,0.00,0.00,0.00,0.00,?,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,?,0.00
1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,1.00,0.00
1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,0.00,0.00,0.00,0.00,?,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,?,0.00,1.00,0.00,0.00,0.00,1.00,?,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,1.00,0.00
1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,?,0.00,0.00,0.00,0.00,0.00,0.00,?,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,0.00,1.00,0.00
0.00,?,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,1.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,0.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,?,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,1.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,?,?,0.00
1.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,1.00
1.00,1.00,1.00,0.00,0.00,1.00,?,1.00,0.00,0.00,1.00,1.00,0.00,1.00,0.00,?,0.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,1.00,1.00,0.00,0.00,1.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,1.00,1.00,0.00,1.00,1.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,0.00,1.00,1.00,1.00,0.00,1.00,1.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,0.00,1.00,1.00,1.00,0.00,1.00,1.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,0.00,1.00,1.00,1.00,0.00,?,1.00
1.00,1.00,1.00,0.00,0.00,?,1.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,1.00,?,0.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,1.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,0.00,?,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,0.00,0.00,0.00,0.00,1.00,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,?,0.00,0.00,0.00,1.00,0.00
1.00,1.00,0.00,1.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,1.00,1.00
0.00,1.00,0.00,1.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,1.00,0.00,1.00,1.00
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,1.00
1.00,?,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,0.00,1.00,1.00,0.00
1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,0.00,0.00,1.00,1.00,0.00
1.00,0.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,0.00,1.00,?,0.00
1.00,1.00,1.00,1.00,0.00,0.00,1.00,1.00,1.00,1.00,1.00,0.00,0.00,1.00,0.00,1.00,1.00
Here is my current code:
import csv
reader = csv.reader(open('house1.csv'), delimiter = ',')
counter = 0
for row in reader:
    # print("Opened Reader")
    currVal = row[:-1]
    counter = counter + 1
    # print("set values")
    for column in row:
        questioncount = 0
        # print("Looping columns")
        if (column != 0 or column != 1):
            questioncount = questioncount + 1
            # This is where I should overwrite the value
        print("Row " + str(counter) + " has " + str(questioncount) + " question marks ")
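I don't understand why I'm having so much difficulty. Currently, the output in PyCharm says that every column of every row has a question mark, which is not correct.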
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 1 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 2 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 3 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 4 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 5 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 6 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 7 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 8 has 1 question marks
Row 9 has 1 question marks
Row 9 has 1 question marks
Row 9 has 1 question marks
Row 9 has 1 question marks
Row 9 has 1 question marks
Row 9 has 1 question marks
Row 9 has 1 question marks
Row 9 has 1 question marks
Row 9 has 1 question marks
...
Row 435 has 1 question marks
Given this row:
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,?,1.00,1.00,1.00,0.00,1.00,1.00
I'd like the Python script to turn it into:
0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,1.00,1.00,1.00,1.00,0.00,1.00,1.00
Any advice would be greatly appreciated.
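Your code has two errors. The first is that you check whether column is not equal to the integer 0 or 1, but column contains a string value. The second is that the initialization of the questioncount variable and the call to the print function should happen outside the inner for loop. Here is the working code: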
import csv

rows = []
# newline='' is what the csv module recommends for files in Python 3
with open('house.csv', 'r', newline='') as f:
    reader = csv.reader(f, delimiter=',')
    counter = 0
    for row in reader:
        counter = counter + 1
        questioncount = 0
        for i, column in enumerate(row):
            if column == '?':
                questioncount = questioncount + 1
                # Overwrite the '?' with the value from the last column
                row[i] = row[-1]
        rows.append(row)
        print('Row {i} has {q} question marks'.format(i=counter, q=questioncount))

with open('house1.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter=',')
    for row in rows:
        writer.writerow(row)
P.S.: I have updated the code, so it now saves another file with the question marks replaced.
The quickest fix for your problem:
if not (column == '1.00' or column == '0.00'):
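You were checking whether the string value (from the CSV file) "1.00" or "0.00" is equal to the integer value (in your if statement) 1 or 0. A string never compares equal to an integer, so that test can never match.
Also, you should apply the "not" to both checks at once, otherwise your logic falls apart: "column != 0 or column != 1" is true for every possible value, because no value can equal both 0 and 1.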
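To see why the original condition matches every cell, here is a small sketch. The three sample values are taken from the data in the question; the helper name is_missing is made up for illustration.

```python
# csv.reader always yields strings, so these are what the loop actually sees.
cells = ['0.00', '1.00', '?']

# Original test: a string never equals an int, and even with ints
# "x != 0 or x != 1" holds for every x, so this is True for all cells.
original = [(c != 0 or c != 1) for c in cells]

def is_missing(cell):
    """Hypothetical helper: True when the cell is neither '1.00' nor '0.00'."""
    return not (cell == '1.00' or cell == '0.00')

fixed = [is_missing(c) for c in cells]

print(original)  # [True, True, True]
print(fixed)     # [False, False, True]
```

Only the "?" cell is flagged once the comparison is done against strings and the "not" covers both checks.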
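Your code has a few problems:
if (column != 0) will always be true, because column is a string.
The print statement is in the column loop, not the row loop.
You set currVal but never use it.
Since your code is looking for question marks, perhaps you should check whether the column is "?" instead of whether it is not 1 or 0?
Your code with these fixes: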
import csv
reader = csv.reader(open('house1.csv'), delimiter=',')
counter = 0
for row in reader:
    counter = counter + 1
    questioncount = 0
    for column in row:
        if column == '?':
            questioncount += 1
    print("Row " + str(counter) + " has " + str(questioncount) + " question marks ")
To write the output you are looking for:
import csv
reader = csv.reader(open('simple.csv'), delimiter=',')
writer = csv.writer(open('output.csv', 'w', newline=''), delimiter=',')
for row in reader:
    # row[-1] is the last value itself; row[-1:] would be a one-element list
    writer.writerow([column if column != '?' else row[-1] for column in row])
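One way to check the replacement without touching any files is to run it on the example row from the question. This is only a sketch: fill_row is a made-up helper name, and io.StringIO stands in for the real input and output files.

```python
import csv
import io

def fill_row(row):
    """Hypothetical helper: replace every '?' in row with the row's last value."""
    last = row[-1]
    return [last if cell == '?' else cell for cell in row]

# The example row from the question, with a '?' in the 11th column.
sample = "0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,?,1.00,1.00,1.00,0.00,1.00,1.00\n"

out = io.StringIO()
writer = csv.writer(out, lineterminator='\n')
for row in csv.reader(io.StringIO(sample)):
    writer.writerow(fill_row(row))

result = out.getvalue()
print(result, end='')
```

The printed line should match the expected row given in the question, with the "?" replaced by 1.00.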
Loading the file with numpy:
import numpy as np

# genfromtxt parses '?' as nan, so those cells fail both comparisons below
my_data = np.genfromtxt('house.csv', delimiter=',')
# print(my_data)
for i in range(len(my_data)):
    row = my_data[i]
    # print(row)
    temp = row[-1]
    # print(temp)
    for j in range(len(row)):
        column = row[j]
        if not (column == 1 or column == 0):
            my_data[i, j] = temp
# print(my_data)
# fmt keeps the two-decimal layout of the input instead of scientific notation
np.savetxt("house.csv", my_data, delimiter=",", fmt="%.2f")
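The same idea can probably be written without the explicit loops. The sketch below is an assumption about how one might vectorize it, not part of the answer above: it builds a boolean mask of the cells that are neither 0 nor 1 and fills them from the last column with np.where. Two rows from the question's sample are read from an io.StringIO so that no file is needed.

```python
import io
import numpy as np

# Two rows copied from the question's sample; StringIO stands in for house.csv.
text = io.StringIO(
    "0.00,1.00,0.00,1.00,1.00,1.00,0.00,0.00,0.00,1.00,?,1.00,1.00,1.00,0.00,1.00,1.00\n"
    "?,1.00,1.00,?,1.00,1.00,0.00,0.00,0.00,0.00,1.00,0.00,1.00,1.00,0.00,0.00,0.00\n"
)

data = np.genfromtxt(text, delimiter=',')   # '?' is read as nan
last = data[:, -1:]                          # last column, kept 2-D so it broadcasts per row
mask = (data != 0) & (data != 1)             # True where a cell is neither 0 nor 1 (nan included)
filled = np.where(mask, last, data)          # take the row's last value where mask is True

print(filled)
```

Because nan compares unequal to everything, the mask picks up exactly the cells that were "?" in the file.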