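I am new to machine learning and Python programming, and I want to implement an algorithm from scratch using only NumPy: the perceptron learning rule on the Linnerud dataset.
If the perceptron does not converge, it should stop after 1000 iterations. I also need to test the algorithm on the Linnerud dataset using all 3 attributes and only the chin-ups result.
The target vector needs to be defined as a binary class based on the chin-ups result, as follows: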
if(chinups>median(chinups)) then chinups=0 else chinups=1
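I need to train the perceptron on these classes (0/1) and build a probability table. In the end, the perceptron should output 20 predicted values, each of which is a weighted sum (the dot product of the perceptron weights and the attribute values).
What I am getting seems incorrect. I would appreciate it if someone could help me. Here is my code: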
import numpy as np
import matplotlib.pyplot as plot
import os
from numpy import arange
from sklearn.datasets import load_linnerud

def perceptron(linnerud):
    pathName = os.path.dirname(os.path.abspath(__file__))
    myfile = open(pathName+'\perceptron_results.txt', 'w')
    data = linnerud['data']
    target = linnerud['target']
    chinup = []
    for i in target:
        chinup.append(i[0])
    median = np.sum(chinup) / 20
    #print(median)
    binary = []
    for j in chinup:
        #print(j)
        if j > median:
            binary.append(0)
        else:
            binary.append(1)
    weights = np.zeros([3,1])
    iteration = 1000
    for i in arange(0,iteration):
        counter = 0
        converged = True
        for rowVal in data:
            predVal = np.dot(rowVal,weights)
            if predVal < 0:
                predicted = 0
            else:
                predicted = 1
            if predicted != binary[counter]:
                converged = False
                if binary[counter] == 0 :
                    weights = weights - np.expand_dims(rowVal,1)
                else:
                    weights = weights + np.expand_dims(rowVal,1)
            counter = counter + 1
        if converged == True:
            print("Error occurred")
            break
    finalPred = np.dot(data,weights)
    myfile.write(str(finalPred)+'\n')
    #print(finalPred)
    print("\nProbability values appended in gnb_result,txt file")
    plot.plot(finalPred,'bo')
    plot.plot([0,20],[0,0])
    plot.show()
My Output is this:
[[ 2845.]
[ -2316.]
[ -1906.]
[ 8874.]
[ 8926.]
[ 1693.]
[ 5421.]
[ 4877.]
[ 15905.]
[-14406.]
[ 13369.]
[ 2546.]
[ 5238.]
[ -4733.]
[ 3337.]
[ 954.]
[ 2243.]
[ 7887.]
[ 11835.]
[ 489.]]
It worked for me after I modified a few parts of the code:
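import numpy as np
import matplotlib.pyplot as plot
from numpy import arange
from sklearn.datasets import load_linnerud

# Load the dataset (this line was missing, so `dataset` was undefined)
dataset = load_linnerud()
dataN = dataset.get('data')      # exercise columns: Chins, Situps, Jumps
target = dataset.get('target')   # physiological columns: Weight, Waist, Pulse

# Chin-ups are the first column of 'data', not of 'target'
chinup = []
for i in dataN:
    chinup.append(i[0])
median = np.median(chinup)

# Binary class: 0 if chin-ups are above the median, otherwise 1
binary = []
for j in chinup:
    #print(j)
    if j > median:
        binary.append(0)
    else:
        binary.append(1)

weights = np.zeros([3,1])
iteration = 1000
for i in arange(0,iteration):
    counter = 0
    converged = True
    # The three physiological attributes are the perceptron inputs
    for row_val in target:
        pred_val = np.dot(row_val,weights)
        if pred_val < 0:
            predicted = 0
        else:
            predicted = 1
        if predicted != binary[counter]:
            converged = False
            # Move the weights toward (class 1) or away from (class 0) the row
            if binary[counter] == 0 :
                weights = weights - np.expand_dims(row_val,1)
            else:
                weights = weights + np.expand_dims(row_val,1)
        counter = counter + 1
    # A full pass with no misclassification means the perceptron converged
    if converged == True:
        print("Loop broken")
        break

# 20 weighted sums, one per row
final_pred = np.dot(target,weights)
plot.plot(final_pred,'bo')
plot.plot([0,20],[0,0])
plot.show()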
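The training loop above can also be packaged as a small function, which makes the update rule easier to check in isolation. This is only a sketch: the name train_perceptron and the toy data are made up for illustration and are not part of the Linnerud dataset. Like the code above, it uses no bias term and treats a weighted sum of 0 or more as class 1.

```python
import numpy as np

def train_perceptron(X, y, max_iter=1000):
    """Perceptron learning rule for labels in {0, 1}, without a bias term.

    Hypothetical helper for illustration; returns (weights, converged).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    w = np.zeros(X.shape[1])
    for _ in range(max_iter):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            pred = 1 if np.dot(x_i, w) >= 0 else 0
            if pred != y_i:
                # (y_i - pred) is +1 for a missed class 1 and -1 for a missed class 0,
                # so this adds or subtracts the row exactly as the loop above does.
                w += (y_i - pred) * x_i
                mistakes += 1
        if mistakes == 0:
            # A full pass without errors: the perceptron has converged.
            return w, True
    return w, False

# Toy, linearly separable data (invented for this sketch).
X_toy = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y_toy = np.array([1, 1, 0, 0])

w, converged = train_perceptron(X_toy, y_toy)
scores = X_toy @ w  # one weighted sum per row
print(w, converged)
print(scores)
```

On the Linnerud data you would pass the 20x3 attribute matrix as X and the binary list as y. Whether it converges within 1000 iterations depends on the data, so check the returned flag rather than assuming it.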
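In your code, "for i in target: chinup.append(i[0])" followed by "median = np.median(chinup)" gives a median of 176, which is wrong (that is the median of Weight, the first target column).
It should be "for i in data: chinup.append(i[0])" followed by "median = np.median(chinup)", which gives a median of 11.5.
In the load_linnerud dataset, 'target_names' is ['Weight', 'Waist', 'Pulse'] and 'feature_names' is ['Chins', 'Situps', 'Jumps']. If you want "if(chinups>median(chinups)) then chinups=0 else chinups=1", you should use "for i in data:".
median = np.sum(chinup) / 20 --> 9.45 (this is the mean)
median = np.median(chinup) --> 11.5 (this is the median)
The median is not the mean; they are different.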
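To see how much the choice matters, here is a small sketch that thresholds the chin-up column both ways. The 20 values are typed in by hand to keep the snippet self-contained (an assumption on my part; with scikit-learn installed, load_linnerud()['data'][:, 0] should give the same column).

```python
import numpy as np

# Chin-up counts for the 20 subjects, entered by hand (assumed to match
# the first column of load_linnerud()['data']).
chins = np.array([5, 2, 12, 12, 13, 4, 8, 6, 15, 17,
                  17, 13, 14, 1, 6, 12, 4, 11, 15, 2])

mean_chins = np.sum(chins) / 20      # what the question's code computes
median_chins = np.median(chins)      # what the task asks for

# Rule from the question: 0 if chin-ups are above the threshold, otherwise 1.
labels_median = np.where(chins > median_chins, 0, 1)
labels_mean = np.where(chins > mean_chins, 0, 1)

print(mean_chins, median_chins)
print(labels_median.sum(), labels_mean.sum())   # number of class-1 rows
print(np.flatnonzero(labels_median != labels_mean))  # rows labelled differently
```

With these values, the subject with 11 chin-ups falls between the two thresholds, so that row gets a different class depending on which one you use.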
As for the plot.plot([0,20],[0,0]) line: why put a linear regression line at this position in this case? Is the slope 0? Or did you just draw it based on your own idea?